NLP with TensorFlow
Learn to process text data, build word embeddings, implement RNNs and LSTMs, and leverage Transformer architectures for natural language understanding.
Text Preprocessing
Before feeding text to a neural network, you need to convert it into numerical form. TensorFlow provides the TextVectorization layer for this:
import tensorflow as tf
from tensorflow import keras

# TextVectorization layer handles tokenization and encoding
vectorizer = keras.layers.TextVectorization(
    max_tokens=10000,            # Vocabulary size
    output_mode='int',           # Output integer token IDs
    output_sequence_length=200   # Pad/truncate to 200 tokens
)

# Adapt the vocabulary from your training data
vectorizer.adapt(train_texts)

# Now use it as the first layer in your model
text_input = keras.Input(shape=(1,), dtype=tf.string)
x = vectorizer(text_input)
# ... rest of the model
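Once adapted, the layer maps raw strings straight to padded integer sequences. Here is a minimal standalone sketch of what that looks like; the two toy sentences stand in for the train_texts corpus above, and the exact IDs depend on the adapted vocabulary:

# Sketch: inspect the vectorizer's output on toy data
train_texts = ["the movie was great", "a dull, forgettable plot"]
vectorizer.adapt(train_texts)

print(vectorizer(["the movie was great"]))
# -> int tensor of shape (1, 200), zero-padded on the right;
#    the specific IDs depend on the corpus
print(vectorizer.get_vocabulary()[:4])
# -> ['', '[UNK]', ...] — index 0 is padding, index 1 is out-of-vocabulary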
Word Embeddings
Embeddings map discrete tokens to dense vectors in a continuous space where semantically similar words are close together:
# Learnable embedding layer
model = keras.Sequential([
    keras.layers.Embedding(
        input_dim=10000,   # Vocabulary size
        output_dim=128,    # Embedding dimension
        input_length=200   # Sequence length (optional; removed in Keras 3)
    ),
    keras.layers.GlobalAveragePooling1D(),
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dense(1, activation='sigmoid')  # Binary classification
])

# Or use pre-trained embeddings (GloVe, Word2Vec)
import numpy as np

embedding_matrix = np.zeros((10000, 100))
# ... load pre-trained vectors into embedding_matrix

pretrained_embedding = keras.layers.Embedding(
    input_dim=10000,
    output_dim=100,
    weights=[embedding_matrix],
    trainable=False  # Freeze pre-trained weights
)
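The elided loading step depends on the embedding file format. Here is one way to fill embedding_matrix from GloVe's plain-text format, sketched under two assumptions: a local glove.6B.100d.txt file (the path is illustrative) and the adapted vectorizer from earlier supplying the word-to-index mapping:

import numpy as np

# Sketch: populate embedding_matrix from a GloVe file (path is an assumption)
vocab = vectorizer.get_vocabulary()                 # index -> word
word_to_index = {w: i for i, w in enumerate(vocab)}

embedding_matrix = np.zeros((10000, 100))
with open('glove.6B.100d.txt', encoding='utf-8') as f:
    for line in f:
        word, *coefs = line.split()
        i = word_to_index.get(word)
        if i is not None and i < 10000:
            embedding_matrix[i] = np.asarray(coefs, dtype='float32')
# Rows for words missing from GloVe stay all-zero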
Recurrent Neural Networks (RNNs)
RNNs process sequences step by step, maintaining a hidden state that carries information forward from previous time steps. LSTMs and GRUs are gated variants that mitigate the vanishing-gradient problem of plain RNNs and capture longer-range dependencies:
# LSTM-based sentiment analysis
model = keras.Sequential([
    keras.layers.Embedding(10000, 128),
    keras.layers.Bidirectional(keras.layers.LSTM(64, return_sequences=True)),
    keras.layers.Bidirectional(keras.layers.LSTM(32)),
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# GRU variant (faster, often comparable performance)
gru_model = keras.Sequential([
    keras.layers.Embedding(10000, 128),
    keras.layers.GRU(64, return_sequences=True),
    keras.layers.GRU(32),
    keras.layers.Dense(1, activation='sigmoid')
])
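One caveat with padded inputs: by default the LSTM processes padding (token ID 0) like any other timestep. A common remedy, sketched here as an optional refinement rather than a required step, is to enable masking on the embedding layer so downstream recurrent layers skip padded positions:

# Sketch: mask_zero=True makes downstream layers ignore padded (ID 0) timesteps
masked_model = keras.Sequential([
    keras.layers.Embedding(10000, 128, mask_zero=True),
    keras.layers.Bidirectional(keras.layers.LSTM(64)),
    keras.layers.Dense(1, activation='sigmoid')
])
masked_model.compile(optimizer='adam',
                     loss='binary_crossentropy',
                     metrics=['accuracy'])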
Transformers with TensorFlow
Transformers have largely replaced RNNs for NLP tasks. You can build Transformer blocks in Keras or use pre-trained models via TensorFlow Hub and Hugging Face:
import tensorflow as tf
from tensorflow import keras

# Custom Transformer block
class TransformerBlock(keras.layers.Layer):
    def __init__(self, embed_dim, num_heads, ff_dim, rate=0.1):
        super().__init__()
        self.att = keras.layers.MultiHeadAttention(
            num_heads=num_heads, key_dim=embed_dim
        )
        self.ffn = keras.Sequential([
            keras.layers.Dense(ff_dim, activation='relu'),
            keras.layers.Dense(embed_dim)
        ])
        self.norm1 = keras.layers.LayerNormalization()
        self.norm2 = keras.layers.LayerNormalization()
        self.dropout1 = keras.layers.Dropout(rate)
        self.dropout2 = keras.layers.Dropout(rate)

    def call(self, inputs, training=None):
        # Self-attention: queries, keys, and values all come from the same input
        attn_output = self.att(inputs, inputs)
        attn_output = self.dropout1(attn_output, training=training)
        out1 = self.norm1(inputs + attn_output)
        ffn_output = self.ffn(out1)
        ffn_output = self.dropout2(ffn_output, training=training)
        return self.norm2(out1 + ffn_output)

# Use pre-trained BERT via Hugging Face
from transformers import TFAutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
bert_model = TFAutoModel.from_pretrained('bert-base-uncased')
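Note that the attention block above is order-agnostic on its own; a full model would pair it with token-plus-position embeddings. On the Hugging Face side, here is one way (not the only one) to turn the loaded BERT model into sentence features; the sample sentence and the single Dense head are illustrative assumptions, and in practice the head would be trained end to end:

# Sketch: pooled sentence features from BERT's [CLS] token
batch = tokenizer(
    ["A gripping, well-acted film."],   # sample input (assumption)
    padding=True, truncation=True, max_length=128,
    return_tensors='tf'
)
outputs = bert_model(**batch)
cls_vectors = outputs.last_hidden_state[:, 0, :]   # (batch_size, 768) for bert-base

# Illustrative, untrained classification head
probs = keras.layers.Dense(1, activation='sigmoid')(cls_vectors)
print(probs.shape)  # (1, 1)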
Complete Text Classification Example
import tensorflow as tf
from tensorflow import keras

# Load IMDB movie review dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=10000)

# Pad sequences to uniform length
# (keras.utils.pad_sequences replaces the old keras.preprocessing.sequence path)
x_train = keras.utils.pad_sequences(x_train, maxlen=200)
x_test = keras.utils.pad_sequences(x_test, maxlen=200)

# Build model
model = keras.Sequential([
    keras.layers.Embedding(10000, 128),
    keras.layers.Bidirectional(keras.layers.LSTM(64)),
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train,
          epochs=5,
          batch_size=64,
          validation_split=0.2)

model.evaluate(x_test, y_test)
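The trained model only understands the integer encoding that load_data produced. A minimal sketch of scoring a raw review, assuming load_data's default offsets (0 = padding, 1 = start, 2 = out-of-vocabulary, real words shifted by 3); the helper name and sample sentence are illustrative:

# Sketch: encode a raw review the same way load_data encodes the dataset
word_index = keras.datasets.imdb.get_word_index()

def encode_review(text, maxlen=200, num_words=10000):
    ids = [1]  # 1 marks the start of a sequence in the default encoding
    for word in text.lower().split():
        idx = word_index.get(word, -3) + 3              # load_data shifts IDs by 3
        ids.append(idx if 3 <= idx < num_words else 2)  # 2 = out-of-vocabulary
    return keras.utils.pad_sequences([ids], maxlen=maxlen)

print(model.predict(encode_review("a gripping and well acted film")))
# -> probability of positive sentiment; the exact value depends on training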
Next Up: Best Practices
Learn performance optimization, debugging techniques, and deployment strategies for TensorFlow models.