Intermediate

CNNs & Computer Vision

Build convolutional neural networks for image classification, leverage transfer learning with pre-trained models, and explore TensorFlow Hub for ready-to-use vision models.

What Are CNNs?

Convolutional Neural Networks (CNNs) are specialized neural networks designed for processing grid-like data such as images. Instead of connecting every neuron to every input (like Dense layers), CNNs use small learnable filters that slide across the image to detect patterns like edges, textures, and shapes.
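To make the sliding-filter idea concrete, here is a minimal NumPy sketch (not Keras code) of a single 3x3 filter sliding over a 6x6 image with no padding and stride 1. The hand-written vertical-edge filter responds strongly exactly where the image changes from dark to bright:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide a kernel over an image (no padding, stride 1),
    as a Conv2D layer does for each of its learned filters."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge filter responds where intensity changes left-to-right
image = np.zeros((6, 6))
image[:, 3:] = 1.0                      # dark left half, bright right half
edge_filter = np.array([[-1, 0, 1],
                        [-1, 0, 1],
                        [-1, 0, 1]])
response = conv2d_valid(image, edge_filter)
print(response.shape)  # (4, 4): a 3x3 filter shrinks each spatial dim by 2
```

The output is nonzero only in the columns where the filter window straddles the edge; a real CNN learns filters like this from data instead of hand-coding them.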

Building a CNN with Keras

Python
from tensorflow import keras

model = keras.Sequential([
    # First convolutional block
    keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    keras.layers.MaxPooling2D((2, 2)),

    # Second convolutional block
    keras.layers.Conv2D(64, (3, 3), activation='relu'),
    keras.layers.MaxPooling2D((2, 2)),

    # Third convolutional block
    keras.layers.Conv2D(64, (3, 3), activation='relu'),

    # Classification head
    keras.layers.Flatten(),
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dense(10, activation='softmax')
])

model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)
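A quick note on the loss: sparse_categorical_crossentropy expects integer labels (e.g. 7), not one-hot vectors, and for a single example it is simply the negative log of the probability the model assigned to the true class. A small NumPy illustration (the probability values are made up):

```python
import numpy as np

# One softmax output row over 10 classes, with the true class at index 7.
# The loss is -log(probability at the true class).
probs = np.array([0.05, 0.05, 0.05, 0.05, 0.05,
                  0.05, 0.05, 0.60, 0.03, 0.02])
true_label = 7
loss = -np.log(probs[true_label])
print(round(loss, 4))  # 0.5108
```

If your labels were one-hot encoded instead, you would use categorical_crossentropy; the two losses compute the same quantity from differently formatted labels.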

Key CNN Layers

Layer                   Purpose                                               Example
Conv2D                  Apply learnable filters to detect spatial features    Conv2D(32, (3,3), activation='relu')
MaxPooling2D            Downsample feature maps by taking the maximum value   MaxPooling2D((2,2))
AveragePooling2D        Downsample by averaging values in each window         AveragePooling2D((2,2))
GlobalAveragePooling2D  Reduce each feature map to a single number            GlobalAveragePooling2D()
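Conv and pooling layers shrink the spatial dimensions in a predictable way. The standard formula is floor((size + 2*padding - kernel) / stride) + 1; the small helper below (plain Python, just arithmetic) traces the 28x28 MNIST-style model from earlier through each layer:

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv/pool window:
    floor((size + 2*padding - kernel) / stride) + 1."""
    return (size + 2 * padding - kernel) // stride + 1

# Trace the 28x28 model above, layer by layer
s = 28
s = conv_out(s, 3)        # Conv2D(32, (3,3))   -> 26
s = conv_out(s, 2, 2)     # MaxPooling2D((2,2)) -> 13
s = conv_out(s, 3)        # Conv2D(64, (3,3))   -> 11
s = conv_out(s, 2, 2)     # MaxPooling2D((2,2)) -> 5
s = conv_out(s, 3)        # Conv2D(64, (3,3))   -> 3
print(s)  # 3: Flatten then sees 3 * 3 * 64 = 576 features
```

Running the same arithmetic yourself is a good sanity check before calling model.summary().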

Image Classification with CIFAR-10

Python
from tensorflow import keras

# Load CIFAR-10 (60,000 32x32 color images in 10 classes)
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Data augmentation for better generalization
data_augmentation = keras.Sequential([
    keras.layers.RandomFlip('horizontal'),
    keras.layers.RandomRotation(0.1),
    keras.layers.RandomZoom(0.1),
])

# Build model with augmentation
model = keras.Sequential([
    keras.Input(shape=(32, 32, 3)),  # declare the input shape first; the augmentation layers don't define one
    data_augmentation,
    keras.layers.Conv2D(32, (3,3), activation='relu'),
    keras.layers.MaxPooling2D((2,2)),
    keras.layers.Conv2D(64, (3,3), activation='relu'),
    keras.layers.MaxPooling2D((2,2)),
    keras.layers.Conv2D(128, (3,3), activation='relu'),
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=20, validation_data=(x_test, y_test))
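After training, model.predict returns one softmax row per image, and argmax over the class axis recovers the predicted label. The snippet below sketches that step with two made-up probability rows standing in for real model output (the class names follow the standard CIFAR-10 ordering):

```python
import numpy as np

cifar10_classes = ['airplane', 'automobile', 'bird', 'cat', 'deer',
                   'dog', 'frog', 'horse', 'ship', 'truck']

# Stand-in for model.predict(x_test[:2]): two fake softmax rows
probs = np.array([
    [0.01, 0.02, 0.05, 0.70, 0.05, 0.05, 0.04, 0.03, 0.03, 0.02],
    [0.80, 0.02, 0.02, 0.02, 0.02, 0.02, 0.02, 0.02, 0.04, 0.02],
])
preds = probs.argmax(axis=1)
print([cifar10_classes[i] for i in preds])  # ['cat', 'airplane']
```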

Transfer Learning

Transfer learning lets you leverage models pre-trained on millions of images (like ImageNet) and fine-tune them for your specific task. This is far more effective than training from scratch, especially when you have limited data:

Python
from tensorflow import keras

# Load a pre-trained model without the classification head
base_model = keras.applications.MobileNetV2(
    weights='imagenet',
    include_top=False,
    input_shape=(224, 224, 3)
)

# Freeze the base model weights
base_model.trainable = False

# Add a custom classification head
model = keras.Sequential([
    base_model,
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Dense(128, activation='relu'),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(5, activation='softmax')  # 5 classes
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train only the new head (train_dataset / val_dataset are your own tf.data pipelines)
model.fit(train_dataset, epochs=10, validation_data=val_dataset)

# Fine-tune: unfreeze some layers and train with a low learning rate
base_model.trainable = True
for layer in base_model.layers[:-20]:
    layer.trainable = False

model.compile(optimizer=keras.optimizers.Adam(1e-5), loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(train_dataset, epochs=5, validation_data=val_dataset)
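The slice in the loop above is what controls how much of the base gets fine-tuned: everything except the last 20 layers stays frozen. A stand-in sketch of that logic (a mock Layer class, not Keras; the layer count is illustrative, though the Keras MobileNetV2 exposes roughly 150 layers):

```python
# Mock of the freezing pattern used above: unfreeze the whole base,
# then re-freeze all but the last 20 layers.
class Layer:
    def __init__(self, name):
        self.name = name
        self.trainable = True

layers = [Layer(f'block_{i}') for i in range(154)]  # illustrative layer count
for layer in layers[:-20]:
    layer.trainable = False

print(sum(l.trainable for l in layers))  # 20 layers remain trainable
```

Note that in real Keras code, changing trainable flags only takes effect after you call model.compile() again, which is why the example above recompiles before the second fit.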

TensorFlow Hub

TensorFlow Hub provides hundreds of pre-trained models you can use directly or fine-tune:

Python
import tensorflow_hub as hub
from tensorflow import keras

# Use a pre-trained feature extractor from TF Hub
feature_extractor = hub.KerasLayer(
    "https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/feature_vector/5",
    trainable=False
)

model = keras.Sequential([
    feature_extractor,
    keras.layers.Dense(5, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

When to Use Transfer Learning: Almost always! Unless you have millions of labeled images for your specific task, transfer learning will give you better results faster. Start with a frozen pre-trained base, train the head, then optionally fine-tune.

Next Up: NLP with TensorFlow

Learn how to process text data, build embeddings, and create sequence models for natural language tasks.

Next: NLP with TensorFlow →