GPU Instance Types Guide
Choosing the right GPU instance is one of the most impactful decisions in AI infrastructure. This lesson provides a comprehensive comparison of GPU instances across AWS, GCP, and Azure, with guidance on matching instance types to specific workloads.
AWS GPU Instances
| Family | GPU | Count | GPU Memory | Use Case |
|---|---|---|---|---|
| p5 | H100 | 8 | 640GB | Large-scale training |
| p4d | A100 | 8 | 320GB | Training, large inference |
| g5 | A10G | 1-8 | 24-192GB | Inference, fine-tuning |
| g6 | L4 | 1-8 | 24-192GB | Inference, video AI |
| inf2 | Inferentia2 | 1-12 | 32-384GB | Cost-efficient inference |
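The "GPU Memory" column is the main constraint when matching a model to an instance. A common rule of thumb (an approximation, not an exact formula) is that serving a model needs roughly parameter-count × bytes-per-parameter (2 for FP16), plus ~20% headroom for KV cache and activations. A minimal sketch, with the overhead factor as an assumption:

```python
def inference_memory_gb(params_b: float, bytes_per_param: int = 2,
                        overhead: float = 1.2) -> float:
    """Rough GPU memory (GB) to serve a model: weights in FP16 by default,
    plus ~20% headroom for KV cache and activations (assumed factor)."""
    return params_b * bytes_per_param * overhead

def fits(instance_gpu_mem_gb: float, params_b: float) -> bool:
    """Does the model fit in a given instance's total GPU memory?"""
    return inference_memory_gb(params_b) <= instance_gpu_mem_gb

# A 13B model in FP16 needs roughly 13 * 2 * 1.2 = 31.2 GB:
# too large for a single 24 GB A10G (g5.xlarge), fine on a 40 GB A100.
```

By this estimate, quantizing to INT8 (`bytes_per_param=1`) roughly halves the requirement, which is often what makes a 13B model fit on a single 24 GB GPU.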
GCP GPU Instances
| Machine Type | GPU | Count | GPU Memory | Use Case |
|---|---|---|---|---|
| a3-highgpu | H100 | 8 | 640GB | Large-scale training |
| a2-highgpu | A100 | 1-16 | 40-640GB | Training, inference |
| g2-standard | L4 | 1-8 | 24-192GB | Inference, video AI |
| TPU v5p | TPU | pods | variable | JAX/TF training |
Instance Selection Decision Tree
Q: What is the workload?

- Training an LLM (>13B params) → p5 (AWS) / a3 (GCP) / ND H100 v5 (Azure)
- Fine-tuning (7B-13B params) → g5.2xlarge (AWS) / a2-highgpu-1g (GCP)
- Real-time LLM inference → g5 (AWS) / g2 (GCP), sized by model memory requirements
- Batch inference (high throughput) → inf2 (AWS) / g2 (GCP), optimized for cost per token
- Computer vision training → g5 or p4d (AWS) / a2 (GCP), depending on model size
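The decision tree above can be sketched as a simple lookup, useful as a starting point for infrastructure automation. The workload category names are illustrative choices, not a standard taxonomy:

```python
def pick_instance(workload: str, cloud: str = "aws") -> str:
    """Map a workload category to a recommended instance family,
    mirroring the decision tree above. Category keys are hypothetical."""
    recommendations = {
        "train_llm_large":    {"aws": "p5", "gcp": "a3-highgpu", "azure": "ND H100 v5"},
        "fine_tune":          {"aws": "g5.2xlarge", "gcp": "a2-highgpu-1g"},
        "realtime_inference": {"aws": "g5", "gcp": "g2"},
        "batch_inference":    {"aws": "inf2", "gcp": "g2"},
        "cv_training":        {"aws": "g5 or p4d", "gcp": "a2"},
    }
    try:
        return recommendations[workload][cloud]
    except KeyError:
        raise ValueError(f"no recommendation for {workload!r} on {cloud!r}")
```

A table like this is easy to extend with new clouds or workload types; real deployments would also factor in region availability and quota.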
Cost Tip: Always check spot pricing across multiple instance types. A g5.2xlarge on spot may be cheaper than a g6.xlarge on-demand while delivering comparable inference performance.
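One way to make that comparison concrete is to normalize price by throughput, e.g. dollars per million generated tokens. A minimal sketch; the prices and throughput figures below are hypothetical placeholders, not current quotes:

```python
def cost_per_million_tokens(hourly_price: float, tokens_per_second: float) -> float:
    """Effective $ per 1M generated tokens at a given sustained throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return hourly_price / tokens_per_hour * 1_000_000

# Hypothetical numbers: a spot g5.2xlarge at $0.45/hr serving 400 tok/s
# versus an on-demand g6.xlarge at $0.80/hr serving 350 tok/s.
spot = cost_per_million_tokens(0.45, 400)       # ≈ $0.31 per 1M tokens
on_demand = cost_per_million_tokens(0.80, 350)  # ≈ $0.63 per 1M tokens
```

Always plug in live spot quotes for your region before deciding; spot prices fluctuate, and the cheaper instance can flip from week to week.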
Ready for Multi-GPU Training?
The next lesson covers distributed training strategies across multiple GPUs and nodes.
Next: Multi-GPU →
Lilly Tech Systems