GPU Instance Types Guide

Choosing the right GPU instance is one of the most impactful decisions in AI infrastructure. This lesson provides a comprehensive comparison of GPU instances across AWS, GCP, and Azure, with guidance on matching instance types to specific workloads.

AWS GPU Instances

| Family | GPU | Count | GPU Memory | Use Case |
|--------|-----|-------|------------|----------|
| p5 | H100 | 8 | 640GB | Large-scale training |
| p4d | A100 | 8 | 320GB | Training, large inference |
| g5 | A10G | 1-8 | 24-192GB | Inference, fine-tuning |
| g6 | L4 | 1-8 | 24-192GB | Inference, video AI |
| inf2 | Inferentia2 | 1-12 | 32-384GB | Cost-efficient inference |
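A quick way to map a model onto these memory figures is a back-of-the-envelope estimate: parameter count times bytes per parameter, plus headroom for the KV cache and runtime buffers. The sketch below assumes fp16 weights and a 1.2x overhead multiplier; both are assumptions to adjust for your serving stack, not fixed rules.

```python
def estimate_inference_memory_gb(params_billions: float,
                                 bytes_per_param: int = 2,
                                 overhead: float = 1.2) -> float:
    """Rough GPU memory estimate for serving a model.

    bytes_per_param: 2 for fp16/bf16, 1 for int8, 4 for fp32.
    overhead: assumed multiplier for KV cache, activations,
    and runtime buffers (1.2 is a conservative starting point).
    """
    return params_billions * bytes_per_param * overhead


# A 7B model in fp16 needs roughly 7 * 2 * 1.2 = 16.8 GB,
# so a single-GPU g5 (24 GB A10G) is a plausible fit.
print(round(estimate_inference_memory_gb(7), 1))
```

By this estimate a 13B fp16 model (~31 GB) already needs a multi-GPU g5/g6 size or a single larger GPU.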

GCP GPU Instances

| Machine Type | GPU | Count | GPU Memory | Use Case |
|--------------|-----|-------|------------|----------|
| a3-highgpu | H100 | 8 | 640GB | Large-scale training |
| a2-highgpu | A100 | 1-16 | 40-640GB | Training, inference |
| g2-standard | L4 | 1-8 | 24-192GB | Inference, video AI |
| TPU v5p | TPU | pods | variable | JAX/TF training |
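Because GCP machine shapes come in fixed GPU counts, sizing reduces to picking the smallest shape whose aggregate memory fits the model. A minimal sketch, assuming the a2-highgpu counts and 40 GB-per-A100 figures from the table above:

```python
import math

# Assumed a2-highgpu shapes (GPU counts) and per-GPU memory,
# taken from the table above.
A2_GPU_COUNTS = [1, 2, 4, 8, 16]
A100_MEMORY_GB = 40

def pick_a2_shape(required_memory_gb: float) -> int:
    """Smallest a2-highgpu GPU count whose aggregate memory fits."""
    needed = math.ceil(required_memory_gb / A100_MEMORY_GB)
    for count in A2_GPU_COUNTS:
        if count >= needed:
            return count
    raise ValueError("Model does not fit on a single a2-highgpu node")

print(pick_a2_shape(70))  # 70 GB needs 2 GPUs (80 GB aggregate)
```

The same pattern applies to g5/g6 on AWS; only the count list and per-GPU memory change.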

Instance Selection Decision Tree

Q: What is the workload?

Training LLM (>13B params):
  → p5 (AWS) / a3 (GCP) / ND H100 v5 (Azure)

Fine-tuning (7B-13B params):
  → g5.2xlarge (AWS) / a2-highgpu-1g (GCP)

Real-time inference (LLM):
  → g5 (AWS) / g2 (GCP) - size by model memory need

Batch inference (high throughput):
  → inf2 (AWS) / g2 (GCP) - optimize for cost/token

Computer vision training:
  → g5 or p4d (AWS) / a2 (GCP) - based on model size
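The decision tree above can be encoded as a simple lookup, which is handy for provisioning scripts. The workload keys below are hypothetical names chosen for illustration; the instance names come straight from the tree and should be treated as starting points, not exhaustive recommendations.

```python
def recommend_instance(workload: str, cloud: str = "aws") -> str:
    """Map a workload category to a starting-point instance family.

    Encodes the decision tree above; workload keys are illustrative.
    """
    table = {
        ("train_llm_large", "aws"): "p5",
        ("train_llm_large", "gcp"): "a3-highgpu",
        ("fine_tune", "aws"): "g5.2xlarge",
        ("fine_tune", "gcp"): "a2-highgpu-1g",
        ("realtime_inference", "aws"): "g5",
        ("realtime_inference", "gcp"): "g2-standard",
        ("batch_inference", "aws"): "inf2",
        ("batch_inference", "gcp"): "g2-standard",
        ("cv_training", "aws"): "g5",
        ("cv_training", "gcp"): "a2-highgpu",
    }
    try:
        return table[(workload, cloud)]
    except KeyError:
        raise ValueError(f"No recommendation for {workload!r} on {cloud!r}")

print(recommend_instance("fine_tune"))  # g5.2xlarge
```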

Cost Tip: Always check spot pricing across multiple instance types. A g5.2xlarge on spot may be cheaper than a g6.xlarge on-demand while delivering comparable inference performance.
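When comparing spot and on-demand options, normalize to cost per unit of work rather than raw hourly price. A minimal sketch for inference, using hypothetical prices and throughput figures (real numbers vary by region and model):

```python
def cost_per_million_tokens(hourly_price_usd: float,
                            tokens_per_second: float) -> float:
    """Effective inference cost given an instance price and throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return hourly_price_usd / tokens_per_hour * 1_000_000

# Hypothetical numbers: g5.2xlarge spot at $0.45/hr vs g6.xlarge
# on-demand at $0.80/hr, assuming similar throughput on both.
spot = cost_per_million_tokens(0.45, tokens_per_second=1500)
on_demand = cost_per_million_tokens(0.80, tokens_per_second=1500)
print(spot < on_demand)  # True
```

At equal throughput the cheaper hourly rate always wins, but the metric matters once throughput differs between GPU generations.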

Ready for Multi-GPU Training?

The next lesson covers distributed training strategies across multiple GPUs and nodes.

Next: Multi-GPU →