NLP with TensorFlow
Learn to process text data, build word embeddings, implement RNNs and LSTMs, and leverage Transformer architectures for natural language understanding.
Text Preprocessing
Before feeding text to a neural network, you need to convert it into numerical form. TensorFlow provides the TextVectorization layer for this:
import tensorflow as tf
from tensorflow import keras

# TextVectorization layer handles tokenization and encoding
vectorizer = keras.layers.TextVectorization(
    max_tokens=10000,            # Vocabulary size
    output_mode='int',           # Output integer token IDs
    output_sequence_length=200   # Pad/truncate to 200 tokens
)

# Adapt the vocabulary from your training data
vectorizer.adapt(train_texts)

# Now use it as the first layer in your model
text_input = keras.Input(shape=(1,), dtype=tf.string)
x = vectorizer(text_input)
# ... rest of the model
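Once adapted, the layer maps raw strings straight to padded integer sequences. Here is a minimal standalone sketch of what that looks like; the two toy sentences stand in for the train_texts corpus above, and the exact IDs depend on the adapted vocabulary:

# Sketch: inspect the vectorizer's output on toy data
train_texts = ["the movie was great", "a dull, forgettable plot"]
vectorizer.adapt(train_texts)

print(vectorizer(["the movie was great"]))
# -> int tensor of shape (1, 200), zero-padded on the right;
#    the specific IDs depend on the corpus
print(vectorizer.get_vocabulary()[:4])
# -> ['', '[UNK]', ...] — index 0 is padding, index 1 is out-of-vocabulary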
Word Embeddings
Embeddings map discrete tokens to dense vectors in a continuous space where semantically similar words are close together:
# Learnable embedding layer
model = keras.Sequential([
    keras.layers.Embedding(
        input_dim=10000,   # Vocabulary size
        output_dim=128,    # Embedding dimension
        input_length=200   # Sequence length (optional; removed in Keras 3)
    ),
    keras.layers.GlobalAveragePooling1D(),
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dense(1, activation='sigmoid')  # Binary classification
])

# Or use pre-trained embeddings (GloVe, Word2Vec)
import numpy as np

embedding_matrix = np.zeros((10000, 100))
# ... load pre-trained vectors into embedding_matrix

pretrained_embedding = keras.layers.Embedding(
    input_dim=10000,
    output_dim=100,
    weights=[embedding_matrix],
    trainable=False  # Freeze pre-trained weights
)
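The elided loading step depends on the embedding file format. Here is one way to fill embedding_matrix from GloVe's plain-text format, sketched under two assumptions: a local glove.6B.100d.txt file (the path is illustrative) and the adapted vectorizer from earlier supplying the word-to-index mapping:

import numpy as np

# Sketch: populate embedding_matrix from a GloVe file (path is an assumption)
vocab = vectorizer.get_vocabulary()                 # index -> word
word_to_index = {w: i for i, w in enumerate(vocab)}

embedding_matrix = np.zeros((10000, 100))
with open('glove.6B.100d.txt', encoding='utf-8') as f:
    for line in f:
        word, *coefs = line.split()
        i = word_to_index.get(word)
        if i is not None and i < 10000:
            embedding_matrix[i] = np.asarray(coefs, dtype='float32')
# Rows for words missing from GloVe stay all-zero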
Recurrent Neural Networks (RNNs)
RNNs process sequences step by step, maintaining a hidden state that carries information forward from previous time steps. LSTMs and GRUs are gated variants that mitigate the vanishing-gradient problem of plain RNNs and capture longer-range dependencies:
# LSTM-based sentiment analysis
model = keras.Sequential([
    keras.layers.Embedding(10000, 128),
    keras.layers.Bidirectional(keras.layers.LSTM(64, return_sequences=True)),
    keras.layers.Bidirectional(keras.layers.LSTM(32)),
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

# GRU variant (faster, often comparable performance)
gru_model = keras.Sequential([
    keras.layers.Embedding(10000, 128),
    keras.layers.GRU(64, return_sequences=True),
    keras.layers.GRU(32),
    keras.layers.Dense(1, activation='sigmoid')
])
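One caveat with padded inputs: by default the LSTM processes padding (token ID 0) like any other timestep. A common remedy, sketched here as an optional refinement rather than a required step, is to enable masking on the embedding layer so downstream recurrent layers skip padded positions:

# Sketch: mask_zero=True makes downstream layers ignore padded (ID 0) timesteps
masked_model = keras.Sequential([
    keras.layers.Embedding(10000, 128, mask_zero=True),
    keras.layers.Bidirectional(keras.layers.LSTM(64)),
    keras.layers.Dense(1, activation='sigmoid')
])
masked_model.compile(optimizer='adam',
                     loss='binary_crossentropy',
                     metrics=['accuracy'])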
Transformers with TensorFlow
Transformers have largely replaced RNNs for NLP tasks. You can build Transformer blocks in Keras or use pre-trained models via TensorFlow Hub and Hugging Face:
import tensorflow as tf
from tensorflow import keras

# Custom Transformer block
class TransformerBlock(keras.layers.Layer):
    def __init__(self, embed_dim, num_heads, ff_dim, rate=0.1):
        super().__init__()
        self.att = keras.layers.MultiHeadAttention(
            num_heads=num_heads, key_dim=embed_dim
        )
        self.ffn = keras.Sequential([
            keras.layers.Dense(ff_dim, activation='relu'),
            keras.layers.Dense(embed_dim)
        ])
        self.norm1 = keras.layers.LayerNormalization()
        self.norm2 = keras.layers.LayerNormalization()
        self.dropout1 = keras.layers.Dropout(rate)
        self.dropout2 = keras.layers.Dropout(rate)

    def call(self, inputs, training=None):
        # Self-attention: queries, keys, and values all come from the same input
        attn_output = self.att(inputs, inputs)
        attn_output = self.dropout1(attn_output, training=training)
        out1 = self.norm1(inputs + attn_output)
        ffn_output = self.ffn(out1)
        ffn_output = self.dropout2(ffn_output, training=training)
        return self.norm2(out1 + ffn_output)

# Use pre-trained BERT via Hugging Face
from transformers import TFAutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
bert_model = TFAutoModel.from_pretrained('bert-base-uncased')
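Note that the attention block above is order-agnostic on its own; a full model would pair it with token-plus-position embeddings. On the Hugging Face side, here is one way (not the only one) to turn the loaded BERT model into sentence features; the sample sentence and the single Dense head are illustrative assumptions, and in practice the head would be trained end to end:

# Sketch: pooled sentence features from BERT's [CLS] token
batch = tokenizer(
    ["A gripping, well-acted film."],   # sample input (assumption)
    padding=True, truncation=True, max_length=128,
    return_tensors='tf'
)
outputs = bert_model(**batch)
cls_vectors = outputs.last_hidden_state[:, 0, :]   # (batch_size, 768) for bert-base

# Illustrative, untrained classification head
probs = keras.layers.Dense(1, activation='sigmoid')(cls_vectors)
print(probs.shape)  # (1, 1)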
Complete Text Classification Example
import tensorflow as tf
from tensorflow import keras

# Load IMDB movie review dataset
(x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=10000)

# Pad sequences to uniform length
# (keras.utils.pad_sequences replaces the old keras.preprocessing.sequence path)
x_train = keras.utils.pad_sequences(x_train, maxlen=200)
x_test = keras.utils.pad_sequences(x_test, maxlen=200)

# Build model
model = keras.Sequential([
    keras.layers.Embedding(10000, 128),
    keras.layers.Bidirectional(keras.layers.LSTM(64)),
    keras.layers.Dense(64, activation='relu'),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train,
          epochs=5,
          batch_size=64,
          validation_split=0.2)

model.evaluate(x_test, y_test)
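The trained model only understands the integer encoding that load_data produced. A minimal sketch of scoring a raw review, assuming load_data's default offsets (0 = padding, 1 = start, 2 = out-of-vocabulary, real words shifted by 3); the helper name and sample sentence are illustrative:

# Sketch: encode a raw review the same way load_data encodes the dataset
word_index = keras.datasets.imdb.get_word_index()

def encode_review(text, maxlen=200, num_words=10000):
    ids = [1]  # 1 marks the start of a sequence in the default encoding
    for word in text.lower().split():
        idx = word_index.get(word, -3) + 3              # load_data shifts IDs by 3
        ids.append(idx if 3 <= idx < num_words else 2)  # 2 = out-of-vocabulary
    return keras.utils.pad_sequences([ids], maxlen=maxlen)

print(model.predict(encode_review("a gripping and well acted film")))
# -> probability of positive sentiment; the exact value depends on training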
Next Up: Best Practices
Learn performance optimization, debugging techniques, and deployment strategies for TensorFlow models.