GPU Instance Types Guide

Choosing the right GPU instance is one of the most impactful decisions in AI infrastructure. This lesson provides a comprehensive comparison of GPU instances across AWS, GCP, and Azure, with guidance on matching instance types to specific workloads.

AWS GPU Instances

| Family | GPU | Count | GPU Memory | Use Case |
|--------|-----|-------|------------|----------|
| p5 | H100 | 8 | 640GB | Large-scale training |
| p4d | A100 | 8 | 320GB | Training, large inference |
| g5 | A10G | 1-8 | 24-192GB | Inference, fine-tuning |
| g6 | L4 | 1-8 | 24-192GB | Inference, video AI |
| inf2 | Inferentia2 | 1-12 | 32-384GB | Cost-efficient inference |
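A quick way to map a model onto these memory figures is a back-of-the-envelope estimate: parameter count times bytes per parameter, plus headroom for the KV cache and runtime buffers. The sketch below assumes fp16 weights and a 1.2x overhead multiplier; both are assumptions to adjust for your serving stack, not fixed rules.

```python
def estimate_inference_memory_gb(params_billions: float,
                                 bytes_per_param: int = 2,
                                 overhead: float = 1.2) -> float:
    """Rough GPU memory estimate for serving a model.

    bytes_per_param: 2 for fp16/bf16, 1 for int8, 4 for fp32.
    overhead: assumed multiplier for KV cache, activations,
    and runtime buffers (1.2 is a conservative starting point).
    """
    return params_billions * bytes_per_param * overhead


# A 7B model in fp16 needs roughly 7 * 2 * 1.2 = 16.8 GB,
# so a single-GPU g5 (24 GB A10G) is a plausible fit.
print(round(estimate_inference_memory_gb(7), 1))
```

By this estimate a 13B fp16 model (~31 GB) already needs a multi-GPU g5/g6 size or a single larger GPU.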

GCP GPU Instances

| Machine Type | GPU | Count | GPU Memory | Use Case |
|--------------|-----|-------|------------|----------|
| a3-highgpu | H100 | 8 | 640GB | Large-scale training |
| a2-highgpu | A100 | 1-16 | 40-640GB | Training, inference |
| g2-standard | L4 | 1-8 | 24-192GB | Inference, video AI |
| TPU v5p | TPU | pods | variable | JAX/TF training |
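Because GCP machine shapes come in fixed GPU counts, sizing reduces to picking the smallest shape whose aggregate memory fits the model. A minimal sketch, assuming the a2-highgpu counts and 40 GB-per-A100 figures from the table above:

```python
import math

# Assumed a2-highgpu shapes (GPU counts) and per-GPU memory,
# taken from the table above.
A2_GPU_COUNTS = [1, 2, 4, 8, 16]
A100_MEMORY_GB = 40

def pick_a2_shape(required_memory_gb: float) -> int:
    """Smallest a2-highgpu GPU count whose aggregate memory fits."""
    needed = math.ceil(required_memory_gb / A100_MEMORY_GB)
    for count in A2_GPU_COUNTS:
        if count >= needed:
            return count
    raise ValueError("Model does not fit on a single a2-highgpu node")

print(pick_a2_shape(70))  # 70 GB needs 2 GPUs (80 GB aggregate)
```

The same pattern applies to g5/g6 on AWS; only the count list and per-GPU memory change.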

Instance Selection Decision Tree

Q: What is the workload?

Training LLM (>13B params):
  → p5 (AWS) / a3 (GCP) / ND H100 v5 (Azure)

Fine-tuning (7B-13B params):
  → g5.2xlarge (AWS) / a2-highgpu-1g (GCP)

Real-time inference (LLM):
  → g5 (AWS) / g2 (GCP) - size by model memory need

Batch inference (high throughput):
  → inf2 (AWS) / g2 (GCP) - optimize for cost/token

Computer vision training:
  → g5 or p4d (AWS) / a2 (GCP) - based on model size
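The decision tree above can be encoded as a simple lookup, which is handy for provisioning scripts. The workload keys below are hypothetical names chosen for illustration; the instance names come straight from the tree and should be treated as starting points, not exhaustive recommendations.

```python
def recommend_instance(workload: str, cloud: str = "aws") -> str:
    """Map a workload category to a starting-point instance family.

    Encodes the decision tree above; workload keys are illustrative.
    """
    table = {
        ("train_llm_large", "aws"): "p5",
        ("train_llm_large", "gcp"): "a3-highgpu",
        ("fine_tune", "aws"): "g5.2xlarge",
        ("fine_tune", "gcp"): "a2-highgpu-1g",
        ("realtime_inference", "aws"): "g5",
        ("realtime_inference", "gcp"): "g2-standard",
        ("batch_inference", "aws"): "inf2",
        ("batch_inference", "gcp"): "g2-standard",
        ("cv_training", "aws"): "g5",
        ("cv_training", "gcp"): "a2-highgpu",
    }
    try:
        return table[(workload, cloud)]
    except KeyError:
        raise ValueError(f"No recommendation for {workload!r} on {cloud!r}")

print(recommend_instance("fine_tune"))  # g5.2xlarge
```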

Cost Tip: Always check spot pricing across multiple instance types. A g5.2xlarge on spot may be cheaper than a g6.xlarge on-demand while delivering comparable inference performance.
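When comparing spot and on-demand options, normalize to cost per unit of work rather than raw hourly price. A minimal sketch for inference, using hypothetical prices and throughput figures (real numbers vary by region and model):

```python
def cost_per_million_tokens(hourly_price_usd: float,
                            tokens_per_second: float) -> float:
    """Effective inference cost given an instance price and throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return hourly_price_usd / tokens_per_hour * 1_000_000

# Hypothetical numbers: g5.2xlarge spot at $0.45/hr vs g6.xlarge
# on-demand at $0.80/hr, assuming similar throughput on both.
spot = cost_per_million_tokens(0.45, tokens_per_second=1500)
on_demand = cost_per_million_tokens(0.80, tokens_per_second=1500)
print(spot < on_demand)  # True
```

At equal throughput the cheaper hourly rate always wins, but the metric matters once throughput differs between GPU generations.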

Ready for Multi-GPU Training?

The next lesson covers distributed training strategies across multiple GPUs and nodes.

Next: Multi-GPU →