Intermediate
Predictive Maintenance
Build machine learning models that predict equipment failures before they occur, reducing downtime, maintenance costs, and unexpected production losses.
Maintenance Strategies Compared
| Strategy | Approach | Relative cost | Downtime |
|---|---|---|---|
| Reactive | Fix when it breaks | Highest (emergency repairs) | Unpredictable, high |
| Preventive | Schedule-based maintenance | Medium (over-maintenance) | Planned but frequent |
| Predictive | AI predicts optimal timing | Lowest (targeted repairs) | Minimal, planned |
| Prescriptive | AI recommends specific actions | Lowest (targeted repairs, optimized actions) | Minimal and planned, with the root cause identified |
Data Sources for Predictive Maintenance
Vibration
Accelerometers detect bearing wear, imbalance, misalignment, and looseness. Vibration is the most widely used predictive maintenance signal for rotating equipment.
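As a rough illustration of how raw accelerometer samples become model inputs, the sketch below computes a few condition indicators commonly used in vibration analysis (RMS, peak, crest factor, kurtosis) plus the dominant spectral frequency. The helper name `vibration_features` and its `sampling_rate_hz` parameter are assumptions made for this example, not part of any sensor API.

```python
import numpy as np

def vibration_features(signal, sampling_rate_hz):
    """Summarize one window of accelerometer samples (hypothetical helper)."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                       # remove the DC offset
    rms = np.sqrt(np.mean(x ** 2))         # overall vibration energy
    peak = np.max(np.abs(x))               # largest excursion in the window
    crest_factor = peak / rms if rms > 0 else 0.0
    # Non-excess kurtosis: about 3 for Gaussian noise, higher for impulsive signals
    kurtosis = np.mean(x ** 4) / rms ** 4 if rms > 0 else 0.0
    # Magnitude spectrum of the real-valued signal and its frequency axis
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sampling_rate_hz)
    dominant_hz = freqs[np.argmax(spectrum)]
    return {
        "rms": rms,
        "peak": peak,
        "crest_factor": crest_factor,
        "kurtosis": kurtosis,
        "dominant_hz": dominant_hz,
    }

# Synthetic 50 Hz vibration with a constant offset, sampled at 1 kHz for 1 second
t = np.arange(1000) / 1000.0
example = vibration_features(2.0 * np.sin(2 * np.pi * 50 * t) + 0.5, 1000)
```

Impulsive faults such as early bearing damage tend to raise crest factor and kurtosis before they change RMS much, which is one reason to track several indicators rather than a single amplitude value.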
Temperature
Thermocouples and IR sensors detect overheating from friction, electrical faults, and cooling system failures.
Current/Power
Motor current signature analysis (MCSA) detects rotor bar breaks, stator faults, and load anomalies.
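To make the MCSA idea concrete: broken rotor bars in an induction motor are classically associated with sidebands around the supply frequency at f_s(1 ± 2ks), where s is the slip and k = 1, 2, and so on. The sketch below computes where to look in the current spectrum. The function name and the example motor values are illustrative assumptions.

```python
def rotor_bar_sidebands(supply_hz, pole_count, rotor_rpm, harmonics=2):
    """Return (lower, upper) sideband frequencies in Hz for k = 1..harmonics.

    Hypothetical helper: assumes an induction motor fed at a fixed supply
    frequency, with the rotor speed measured in RPM.
    """
    sync_rpm = 120.0 * supply_hz / pole_count      # synchronous speed
    slip = (sync_rpm - rotor_rpm) / sync_rpm       # per-unit slip
    return [
        (supply_hz * (1 - 2 * k * slip), supply_hz * (1 + 2 * k * slip))
        for k in range(1, harmonics + 1)
    ]

# Example values (assumed): 50 Hz supply, 4-pole motor running at 1470 RPM
sidebands = rotor_bar_sidebands(50.0, 4, 1470.0)
```

Because the slip is small at normal load, these sidebands sit very close to the supply frequency, so the current spectrum needs fine frequency resolution (a long enough capture window) to separate them.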
Acoustic
Ultrasonic and audio sensors detect leaks, bearing defects, and electrical discharge that other sensors can miss.
Building a Predictive Maintenance Model
import pandas as pd
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
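# Load sensor data with failure labels
# (the 1h windows below assume one row per minute from a single machine)
data = pd.read_csv('machine_sensors.csv', parse_dates=['timestamp'])
data = data.sort_values('timestamp').reset_index(drop=True)
# Feature engineering: rolling statistics over the last 60 rows
for col in ['vibration', 'temperature', 'pressure']:
    data[f'{col}_mean_1h'] = data[col].rolling(60).mean()
    data[f'{col}_std_1h'] = data[col].rolling(60).std()
    data[f'{col}_max_1h'] = data[col].rolling(60).max()
    # Change over the last 30 rows, used as a simple trend proxy
    data[f'{col}_trend'] = data[col].diff(periods=30)
# Drop the warm-up rows where the rolling windows are not yet full
data = data.dropna()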
# Prepare features and labels
feature_cols = [c for c in data.columns if c not in ['timestamp', 'failure']]
X = data[feature_cols]
y = data['failure'] # 0=normal, 1=failure within 24h
# Train/test split (temporal): shuffle=False keeps the last 20% of rows as the test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)
# Train model (class_weight='balanced' compensates for rare failure rows)
model = RandomForestClassifier(n_estimators=100, class_weight='balanced', random_state=42)
model.fit(X_train, y_train)
# Evaluate
predictions = model.predict(X_test)
print(classification_report(y_test, predictions))
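For a binary random forest, `model.predict` effectively applies a 0.5 probability cutoff, which is rarely the right operating point when failures are rare and a missed failure costs more than a false alarm. One option, sketched below, is to pick the cutoff from the precision-recall curve: the highest-precision threshold that still meets a target recall. The helper name `pick_threshold` and the toy scores are assumptions. In the pipeline above the scores would come from `model.predict_proba(X_test)[:, 1]`, ideally computed on a separate validation split rather than the final test set.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

def pick_threshold(y_true, scores, min_recall=0.8):
    """Highest-precision threshold whose recall is at least min_recall (hypothetical helper)."""
    precision, recall, thresholds = precision_recall_curve(y_true, scores)
    # The final precision/recall pair has no matching threshold, so drop it
    precision, recall = precision[:-1], recall[:-1]
    feasible = recall >= min_recall
    if not feasible.any():
        raise ValueError("no threshold reaches the requested recall")
    # Mask out infeasible thresholds, then take the best remaining precision
    best = int(np.argmax(np.where(feasible, precision, -1.0)))
    return thresholds[best], precision[best], recall[best]

# Toy labels and failure probabilities, for illustration only
y_true = np.array([0, 0, 1, 0, 1, 1])
scores = np.array([0.1, 0.2, 0.35, 0.4, 0.8, 0.9])

threshold, precision_at_t, recall_at_t = pick_threshold(y_true, scores, min_recall=0.6)
alerts = scores >= threshold   # rows that would trigger a maintenance alert
```

The target recall is a business decision: it encodes how many impending failures the plant is willing to miss in exchange for fewer unnecessary inspections.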
Remaining Useful Life (RUL) Estimation
Instead of binary failure prediction, RUL models estimate how many hours/cycles of useful life remain:
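from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, LSTM, Dense, Dropout
# Assumes X_train_seq has shape (n_samples, sequence_length, n_features)
# and y_train_rul has shape (n_samples,), both prepared beforehand
sequence_length, n_features = X_train_seq.shape[1:]
# LSTM model for RUL prediction
model = Sequential([
    Input(shape=(sequence_length, n_features)),
    LSTM(64, return_sequences=True),  # pass the full sequence to the next LSTM
    Dropout(0.2),
    LSTM(32),  # keep only the final hidden state
    Dropout(0.2),
    Dense(16, activation='relu'),
    Dense(1)  # Predicted remaining cycles
])
model.compile(optimizer='adam', loss='mse', metrics=['mae'])
# validation_split holds out the last 20% of samples (taken before shuffling)
model.fit(X_train_seq, y_train_rul, epochs=50,
          batch_size=32, validation_split=0.2)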
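The LSTM expects `X_train_seq` as a 3-D array of shape (samples, sequence_length, n_features) and `y_train_rul` as one target per sample. A minimal sketch of building both from a single run-to-failure history follows. It assumes one row per cycle with the last row being the failure cycle. The helper name `make_rul_windows` and the optional `max_rul` cap (a common convention that flattens the target while the machine is still healthy) are assumptions, not part of the original pipeline.

```python
import numpy as np

def make_rul_windows(unit_features, sequence_length, max_rul=None):
    """Slice one unit's history into overlapping windows with RUL targets.

    unit_features: array of shape (n_cycles, n_features), ordered in time,
    where the final row is assumed to be the failure cycle.
    """
    unit_features = np.asarray(unit_features, dtype=float)
    n_cycles = unit_features.shape[0]
    if n_cycles < sequence_length:
        raise ValueError("history is shorter than sequence_length")
    # Exclusive end index of every window that fits inside the history
    ends = np.arange(sequence_length, n_cycles + 1)
    X = np.stack([unit_features[end - sequence_length:end] for end in ends])
    # Cycles remaining after the last row of each window
    y = (n_cycles - ends).astype(float)
    if max_rul is not None:
        y = np.minimum(y, max_rul)   # optional cap on the target
    return X, y

# Toy history: 10 cycles, 2 features
history = np.arange(20).reshape(10, 2)
X_seq, y_rul = make_rul_windows(history, sequence_length=4)
X_cap, y_cap = make_rul_windows(history, sequence_length=4, max_rul=3)
```

With several machines, windows would be built per machine and then concatenated, so that no window spans two different units.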
ML Approaches for Predictive Maintenance (PdM)
| Approach | Method | Best For |
|---|---|---|
| Classification | Random Forest, XGBoost | Failure/no-failure prediction |
| Anomaly detection | Isolation Forest, Autoencoders | Detecting unknown failure modes |
| RUL estimation | LSTM, Transformer | Predicting remaining useful life |
| Survival analysis | Cox proportional hazards, Random Survival Forest (RSF) | Time-to-event modeling |
Key takeaway: Start with anomaly detection (unsupervised) if you don't have labeled failure data. Once you collect enough failure events, train supervised classification or RUL models. Feature engineering on rolling statistics is often more important than model complexity.
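As a starting point for the unlabeled case, the sketch below fits scikit-learn's `IsolationForest` on readings assumed to come from normal operation and then scores new readings. The data is synthetic and the two columns merely stand in for engineered features such as rolling means. The `contamination` value is an assumption that would need tuning per asset.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Synthetic stand-in for features collected during normal operation
normal_train = rng.normal(loc=0.0, scale=1.0, size=(1000, 2))

# contamination is the expected share of anomalies in the training data (assumed here)
detector = IsolationForest(n_estimators=100, contamination=0.01, random_state=0)
detector.fit(normal_train)

# Five ordinary readings followed by two far outside the training range
new_readings = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(5, 2)),
    [[8.0, 8.0], [-9.0, 7.5]],
])

labels = detector.predict(new_readings)         # 1 = inlier, -1 = anomaly
scores = detector.score_samples(new_readings)   # lower means more abnormal
```

Logging the flagged windows together with the eventual maintenance outcome is what gradually produces the labeled failure events needed for the supervised models.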