Category 4: Time Series
The time series category tests your ability to forecast future values from sequential data. You must create windowed datasets from raw time series, build RNN/LSTM models, and achieve target MAE thresholds. This is often considered the hardest exam category.
What the Exam Tests
You receive a time series (e.g., temperature, stock prices, sunspot activity) and must build a model that predicts future values. The key challenge is creating windowed training data from a single sequence and choosing the right model architecture.
Memorize the tf.data.Dataset.window() pattern: you will use it in every time series task. Getting it wrong means your model trains on garbage data.
Creating Windowed Datasets
The core technique for time series: split a sequence into overlapping windows where each window is a training example. The last value in each window is the label (what we predict).
import tensorflow as tf
import numpy as np
# ---- The windowed dataset function (MEMORIZE THIS) ----
def windowed_dataset(series, window_size, batch_size, shuffle_buffer):
    """
    Convert a time series into a windowed tf.data.Dataset.

    Args:
        series: numpy array of time series values
        window_size: number of time steps to use as input
        batch_size: batch size for training
        shuffle_buffer: buffer size for shuffling

    Returns:
        tf.data.Dataset of (window, label) pairs
    """
    dataset = tf.data.Dataset.from_tensor_slices(series)
    dataset = dataset.window(window_size + 1, shift=1, drop_remainder=True)
    dataset = dataset.flat_map(lambda w: w.batch(window_size + 1))
    dataset = dataset.map(lambda w: (w[:-1], w[-1]))  # (input, label)
    dataset = dataset.shuffle(shuffle_buffer)
    dataset = dataset.batch(batch_size).prefetch(1)
    return dataset
# ---- Example usage ----
# Generate synthetic time series
time = np.arange(0, 1000)
series = 10 + np.sin(time * 0.1) * 10 + np.random.randn(1000) * 2
series = series.astype(np.float32)
# Split into train/validation
SPLIT_TIME = 800
train_series = series[:SPLIT_TIME]
val_series = series[SPLIT_TIME:]
# Create windowed datasets
WINDOW_SIZE = 20
BATCH_SIZE = 32
SHUFFLE_BUFFER = 1000
train_dataset = windowed_dataset(
    train_series, WINDOW_SIZE, BATCH_SIZE, SHUFFLE_BUFFER
)
# Inspect one batch
for x, y in train_dataset.take(1):
    print(f"Input shape: {x.shape}")   # (32, 20)
    print(f"Label shape: {y.shape}")   # (32,)
    print(f"Input[0]: {x[0].numpy()[:5]}...")
    print(f"Label[0]: {y[0].numpy()}")
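As a sanity check on what windowed_dataset produces, the same (window, label) pairs can be built in plain NumPy. A minimal sketch (assumes NumPy >= 1.20 for sliding_window_view):

```python
import numpy as np

series = np.arange(10, dtype=np.float32)
window_size = 3

# Every overlapping run of window_size + 1 values, shift 1 (like dataset.window)
windows = np.lib.stride_tricks.sliding_window_view(series, window_size + 1)
X, y = windows[:, :-1], windows[:, -1]  # split inputs and labels, like the map() step

print(X.shape)     # (7, 3): 10 - (3 + 1) + 1 = 7 windows
print(X[0], y[0])  # window [0. 1. 2.] is labeled with the next value, 3.0
```

If your tf.data pipeline yields different pairs than this on the same series, the windowing step is wrong.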
Practice Model 1: Dense Network for Time Series
Start with a simple dense network. This works surprisingly well for many exam tasks and trains much faster than LSTM.
import tensorflow as tf
import numpy as np
# ---- Generate synthetic time series with trend + seasonality ----
def generate_series(time, trend_slope=0.05, seasonality_period=365,
                    seasonality_amplitude=40, noise_level=5):
    trend = trend_slope * time
    seasonal = seasonality_amplitude * np.sin(2 * np.pi * time / seasonality_period)
    noise = noise_level * np.random.randn(len(time))
    return (trend + seasonal + noise).astype(np.float32)
time = np.arange(0, 1500)
series = generate_series(time)
SPLIT_TIME = 1200
train_series = series[:SPLIT_TIME]
val_series = series[SPLIT_TIME:]
WINDOW_SIZE = 30
BATCH_SIZE = 32
# Reuse windowed_dataset function from above
train_dataset = windowed_dataset(train_series, WINDOW_SIZE, BATCH_SIZE, 1000)
# ---- Simple Dense model ----
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=[WINDOW_SIZE]),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(1)
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss='mse',
    metrics=['mae']
)
history = model.fit(
    train_dataset,
    epochs=50,
    callbacks=[
        tf.keras.callbacks.EarlyStopping(monitor='loss', patience=5)
    ]
)
# ---- Forecast validation period ----
def forecast_series(model, series, window_size, split_time):
    forecast = []
    for t in range(split_time, len(series)):
        window = series[t - window_size:t][np.newaxis]
        pred = model.predict(window, verbose=0)[0, 0]
        forecast.append(pred)
    return np.array(forecast)
forecast = forecast_series(model, series, WINDOW_SIZE, SPLIT_TIME)
mae = np.mean(np.abs(forecast - val_series[:len(forecast)]))
print(f"Validation MAE: {mae:.4f}")
model.save('timeseries_dense.h5')
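The per-step model.predict loop in forecast_series is correct but slow, because it makes one predict call per timestep. A common speed-up is to build every validation window up front and predict them in a single batch. A sketch of the windowing half, with a stand-in predictor (window mean) in place of the trained Keras model:

```python
import numpy as np

def batch_forecast(predict_fn, series, window_size, split_time):
    # All overlapping windows of length window_size, one starting at each index
    windows = np.lib.stride_tricks.sliding_window_view(series, window_size)
    # The window predicting series[t] is series[t - window_size:t],
    # which starts at index t - window_size
    val_windows = windows[split_time - window_size : len(series) - window_size]
    return predict_fn(val_windows)  # one vectorized call instead of a Python loop

series = np.arange(100, dtype=np.float32)
forecast = batch_forecast(lambda w: w.mean(axis=1), series,
                          window_size=5, split_time=80)
print(forecast.shape)  # (20,): one prediction per validation timestep
```

With a real model you would pass something like lambda w: model.predict(w, verbose=0)[:, 0] as predict_fn.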
Practice Model 2: LSTM for Time Series
When a dense network is not enough, use LSTM. The key difference is reshaping input to 3D: (batch, timesteps, features).
import tensorflow as tf
import numpy as np
# Assume: train_dataset created with windowed_dataset()
# WINDOW_SIZE = 30
# ---- LSTM model ----
# CRITICAL: LSTM expects 3D input: (batch_size, timesteps, features)
# Our windowed data is 2D: (batch_size, window_size)
# We need to add a feature dimension using Lambda or Reshape
model = tf.keras.Sequential([
    # Add feature dimension: (batch, window_size) -> (batch, window_size, 1)
    tf.keras.layers.Lambda(lambda x: tf.expand_dims(x, axis=-1),
                           input_shape=[30]),
    # Bidirectional LSTM stack
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32, return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(16)),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(1)
])
# ---- Learning rate scheduling (important for time series) ----
lr_schedule = tf.keras.callbacks.LearningRateScheduler(
    lambda epoch: 1e-4 * 10**(epoch / 20)  # exponential increase to find the best LR
)
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=1e-4, momentum=0.9),
    loss='mse',
    metrics=['mae']
)
# Step 1: Find the optimal learning rate (run ~100 epochs with lr_schedule)
# history = model.fit(train_dataset, epochs=100, callbacks=[lr_schedule])
# Plot loss vs. learning rate and pick the rate just below the loss minimum
# Step 2: Recompile with the chosen learning rate and train
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=1e-5, momentum=0.9),
    loss='mse',
    metrics=['mae']
)
# history = model.fit(train_dataset, epochs=200)
model.save('timeseries_lstm.h5')  # save only after fit has actually run
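After the Step 1 sweep, you choose the learning rate where the loss curve bottoms out. A minimal sketch of that analysis; the loss curve here is a hypothetical stand-in for history.history['loss'], shaped to bottom out near lr = 1e-2:

```python
import numpy as np

epochs = np.arange(100)
lrs = 1e-4 * 10 ** (epochs / 20)  # same schedule as the LearningRateScheduler above

# Hypothetical sweep losses (stand-in for history.history['loss'])
losses = (np.log10(lrs) + 2.0) ** 2 + 0.1

best_lr = lrs[np.argmin(losses)]
train_lr = best_lr / 10  # rule of thumb: train somewhat below the loss minimum
print(f"loss minimum near lr={best_lr:.0e}; train with lr={train_lr:.0e}")
```

With a real sweep you would plot losses against lrs on a log x-axis and read the minimum off the chart rather than trusting a single noisy epoch.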
Practice Model 3: Conv1D + LSTM Hybrid
A powerful combination that often outperforms pure LSTM on exam tasks. Conv1D extracts local patterns, LSTM captures long-range dependencies.
import tensorflow as tf
# ---- Conv1D + LSTM hybrid model ----
WINDOW_SIZE = 30
model = tf.keras.Sequential([
    # Expand dimensions for Conv1D
    tf.keras.layers.Lambda(lambda x: tf.expand_dims(x, axis=-1),
                           input_shape=[WINDOW_SIZE]),
    # Conv1D layers to extract local patterns
    tf.keras.layers.Conv1D(64, kernel_size=5, strides=1,
                           padding='causal', activation='relu'),
    tf.keras.layers.Conv1D(64, kernel_size=3, strides=1,
                           padding='causal', activation='relu'),
    # LSTM for sequence modeling
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
    # Output
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(1)
])
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss='huber',  # Huber loss is robust to outliers
    metrics=['mae']
)
# history = model.fit(
#     train_dataset,
#     epochs=100,
#     callbacks=[
#         tf.keras.callbacks.EarlyStopping(
#             monitor='loss', patience=10,
#             restore_best_weights=True
#         )
#     ]
# )
model.save('timeseries_conv_lstm.h5')
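What padding='causal' buys you, in one dimension: the output at step t depends only on inputs at or before t, and the sequence length is preserved, so the convolution never "looks ahead" in time. A NumPy sketch for a single channel, mirroring the cross-correlation Conv1D computes (without learned weights):

```python
import numpy as np

def causal_conv1d(x, kernel):
    # Pad only on the left so output[t] never sees x[t+1:]
    k = len(kernel)
    padded = np.concatenate([np.zeros(k - 1, dtype=x.dtype), x])
    return np.array([padded[t : t + k] @ kernel for t in range(len(x))])

x = np.array([1.0, 2.0, 3.0, 4.0])
out = causal_conv1d(x, np.array([0.0, 0.0, 1.0]))  # kernel that copies the newest input
print(out)  # [1. 2. 3. 4.]: same length as x, no lookahead
```

Contrast with padding='same', which pads both sides and would let each output peek at future values, leaking the label into the input.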
Time Series Quick Reference
# Time Series Exam Cheat Sheet
# 1. ALWAYS split time series chronologically (no random shuffle for split)
SPLIT_TIME = int(0.8 * len(series))
train = series[:SPLIT_TIME]
val = series[SPLIT_TIME:]
# 2. Window size selection:
# - Too small: model cannot capture patterns
# - Too large: model trains slowly, may overfit
# - Good starting points: 20-50 for daily data, 7-14 for weekly patterns
# 3. Model architecture decision:
# - Start with Dense (fastest to train, often good enough)
# - Move to LSTM if Dense MAE is too high
# - Try Conv1D + LSTM hybrid for best results
# 4. Loss functions for time series:
# - 'mse': standard, penalizes large errors more
# - 'mae': robust to outliers
# - 'huber': combination of MSE and MAE (often best for exam)
# 5. Common mistakes:
# - Shuffling the train/val split (must be chronological!)
# - Forgetting to expand dims for LSTM/Conv1D input
# - Not using the windowed_dataset function correctly
# - Setting window_size larger than available data
# - Using too high a learning rate (time series models are sensitive)
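A useful complement to the cheat sheet: "MAE is too high" only means something relative to a baseline, so compute the naive last-value forecast your model must beat. A quick sketch on a synthetic series (seeded for reproducibility):

```python
import numpy as np

rng = np.random.default_rng(0)
time = np.arange(1000)
series = (10 + 10 * np.sin(0.1 * time) + 2 * rng.standard_normal(1000)).astype(np.float32)

split = int(0.8 * len(series))
val = series[split:]

# Naive forecast: predict each value as the previous value
naive = series[split - 1 : -1]
naive_mae = float(np.mean(np.abs(naive - val)))
print(f"Naive MAE: {naive_mae:.3f}")  # your model's validation MAE should come in below this
```

If a Dense or LSTM model cannot beat this one-liner, the problem is usually the windowing or the learning rate, not the architecture.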
Key Takeaways
- Memorize the windowed_dataset() function: it is the foundation of every time series exam task
- Always split time series chronologically, never randomly
- Start with a Dense model first; if MAE is too high, move to LSTM
- LSTM requires 3D input: use Lambda(lambda x: tf.expand_dims(x, axis=-1))
- Huber loss often works better than MSE for time series
- Learning rate scheduling can significantly improve time series model performance
- A Conv1D + LSTM hybrid is often the strongest architecture for exam tasks
Lilly Tech Systems