Introduction to Pretrained Models

Pretrained models are AI models that have already been trained on large datasets and can be reused for new tasks. Instead of training a model from scratch (which requires massive data and compute), you can leverage existing models and adapt them to your specific needs.

What Are Pretrained Models?

A pretrained model is a neural network that has been trained on a large dataset to learn general features and patterns. These learned representations can then be applied to new, related tasks — a technique called transfer learning.

Analogy: Think of a pretrained model like a person who has a university education. They have broad knowledge that can be applied to many specific jobs. You don't need to re-teach them everything from scratch — just the specifics of the new role.

Transfer Learning

Transfer learning is the practice of taking a model trained on one task and reusing it (with or without modification) for a different task:

  1. Feature Extraction

    Use the pretrained model as a fixed feature extractor. Feed your data through the model and use its outputs as inputs to a simpler classifier. No retraining of the pretrained model is needed.

  2. Fine-tuning

    Start with the pretrained weights and continue training on your specific dataset. The model adapts its learned features to your task while retaining general knowledge.

  3. Zero-shot / Few-shot

    Modern large models can perform tasks they were never explicitly trained on, using natural language instructions or just a few examples.
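The first two strategies can be sketched in PyTorch. The two-layer `backbone` below is a stand-in for a real pretrained network (in practice you would load something like a torchvision ResNet with downloaded weights); the layer sizes, the 5-class head, and the learning rates are illustrative assumptions, not prescriptions.

```python
import torch
import torch.nn as nn

# Stand-in "pretrained" backbone (in practice: a real pretrained model
# with downloaded weights -- the sizes here are purely illustrative).
backbone = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))

# --- Strategy 1: feature extraction ---------------------------------
for p in backbone.parameters():
    p.requires_grad = False          # freeze all pretrained weights

head = nn.Linear(16, 5)              # small trainable classifier (5 classes)
features = backbone(torch.randn(8, 32))  # frozen forward pass over a batch
logits = head(features)                  # only `head` receives gradients

# --- Strategy 2: fine-tuning ----------------------------------------
# Unfreeze and train end-to-end, typically with a much smaller learning
# rate for the backbone so its pretrained features adapt without being
# overwritten.
for p in backbone.parameters():
    p.requires_grad = True
optimizer = torch.optim.Adam(
    [{"params": backbone.parameters(), "lr": 1e-5},
     {"params": head.parameters(), "lr": 1e-3}]
)
```

The key difference is what gets updated: in feature extraction only the new head trains, while fine-tuning updates every weight, usually with the two learning rates shown above.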

Why Use Pretrained Models?

| Benefit | Details |
| --- | --- |
| Save time | Skip weeks or months of training. Use a model that is ready in minutes. |
| Less data needed | Fine-tune with hundreds of samples instead of millions. |
| Better performance | Pretrained models learn rich features from massive datasets that small custom models cannot match. |
| Lower cost | No need for expensive GPU clusters to train from scratch. |
| State-of-the-art | Access the same models used by top research labs and companies. |

Types of Pretrained Models

| Type | Tasks | Example Models |
| --- | --- | --- |
| Vision | Image classification, object detection, segmentation, generation | ResNet, YOLO, SAM, Stable Diffusion |
| Language | Text generation, classification, translation, summarization, embeddings | GPT-2, Llama, BERT, T5, sentence-transformers |
| Audio | Speech-to-text, text-to-speech, music generation, audio classification | Whisper, Bark, MusicGen, wav2vec |
| Multi-Modal | Image+text understanding, text-to-image, video, document AI | CLIP, LLaVA, DALL-E, LayoutLM |

Where to Find Pretrained Models

| Platform | Models Available | URL |
| --- | --- | --- |
| Hugging Face Hub | 900,000+ | huggingface.co/models |
| TensorFlow Hub | 1,000+ | tfhub.dev |
| PyTorch Hub | 100+ | pytorch.org/hub |
| Kaggle Models | 3,000+ | kaggle.com/models |
| ONNX Model Zoo | 100+ | github.com/onnx/models |
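As one example of pulling a model artifact from a hub programmatically, the `huggingface_hub` client library can download individual files from any repository on the Hugging Face Hub by ID. The repository and filename below are just examples; running this requires network access and `pip install huggingface_hub`.

```python
from huggingface_hub import hf_hub_download

# Download a single file from a model repository into the local cache.
# Repo ID and filename are examples -- any model page on
# huggingface.co/models works the same way.
path = hf_hub_download(
    repo_id="bert-base-uncased",
    filename="config.json",
)
print(path)  # local cache path of the downloaded file
```

Downloads are cached, so repeated calls for the same file return the local copy instead of fetching it again.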

Model Formats

| Format | Framework | Use Case |
| --- | --- | --- |
| PyTorch (.pt, .pth) | PyTorch | Most common in research |
| SafeTensors (.safetensors) | Framework-agnostic | Safe, fast loading (Hugging Face standard) |
| ONNX (.onnx) | Cross-framework | Optimized inference, deployment |
| GGUF (.gguf) | llama.cpp | Quantized LLMs for CPU inference |
| TensorFlow (.pb, SavedModel) | TensorFlow | TF Serving, TFLite deployment |
| Core ML (.mlmodel) | Apple | iOS/macOS deployment |

Ready to Explore?

Let's start by exploring the largest model hub in the world — Hugging Face.

Next: Hugging Face Hub →