Intermediate

Predictive Maintenance

Build machine learning models that predict equipment failures before they occur, reducing downtime, maintenance costs, and unexpected production losses.

Maintenance Strategies Compared

| Strategy | Approach | Cost | Downtime |
| --- | --- | --- | --- |
| Reactive | Fix when it breaks | Highest (emergency repairs) | Unpredictable, high |
| Preventive | Schedule-based maintenance | Medium (over-maintenance) | Planned but frequent |
| Predictive | AI predicts optimal timing | Lowest (targeted repairs) | Minimal, planned |
| Prescriptive | AI recommends specific actions | Lowest + optimized | Minimal + root cause |

Data Sources for Predictive Maintenance

Vibration

Accelerometers detect bearing wear, imbalance, misalignment, and looseness. Vibration is the most common predictive maintenance signal.
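As a rough illustration of how a raw accelerometer window becomes model features, the sketch below computes a few commonly used time-domain and frequency-domain summaries. The function name `vibration_features`, the choice of features, and the synthetic 50 Hz test signal are assumptions made for this example, not part of the original pipeline.

```python
import numpy as np

def vibration_features(signal, sample_rate_hz):
    """Summarize one window of accelerometer samples (hypothetical helper)."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # remove the DC offset before computing statistics
    rms = np.sqrt(np.mean(x ** 2))
    peak = np.max(np.abs(x))
    std = x.std()
    # Kurtosis rises when a signal becomes impulsive, as with bearing defects
    kurtosis = np.mean(x ** 4) / std ** 4 if std > 0 else 0.0
    # One-sided amplitude spectrum and the matching frequency axis
    spectrum = np.abs(np.fft.rfft(x)) / len(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate_hz)
    dominant_hz = freqs[np.argmax(spectrum[1:]) + 1]  # skip the 0 Hz bin
    return {
        "rms": rms,
        "peak": peak,
        "crest_factor": peak / rms if rms > 0 else 0.0,
        "kurtosis": kurtosis,
        "dominant_hz": dominant_hz,
    }

# Synthetic example: a pure 50 Hz tone sampled at 1 kHz for one second
t = np.arange(1000) / 1000.0
features = vibration_features(np.sin(2 * np.pi * 50 * t), sample_rate_hz=1000)
```

In practice these values would be computed per window and tracked over time, since a drift in RMS or kurtosis is usually more informative than any single reading.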

Temperature

Thermocouples and IR sensors detect overheating from friction, electrical faults, and cooling system failures.

Current/Power

Motor current signature analysis (MCSA) detects rotor bar breaks, stator faults, and load anomalies.
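To make the MCSA idea concrete: broken rotor bars in an induction motor are commonly associated with sidebands around the supply frequency at (1 ± 2ks)·f, where s is the slip and k a small integer. The sketch below only computes where to look in the current spectrum. The helper names and the 4-pole, 50 Hz motor figures are assumptions for illustration.

```python
def slip(sync_speed_rpm, rotor_speed_rpm):
    """Per-unit slip of an induction motor."""
    return (sync_speed_rpm - rotor_speed_rpm) / sync_speed_rpm

def rotor_bar_sidebands(supply_hz, slip_value, k=1):
    """Lower and upper sideband frequencies (1 ± 2ks)·f in Hz."""
    offset = 2 * k * slip_value * supply_hz
    return supply_hz - offset, supply_hz + offset

# Assumed example: 4-pole motor on a 50 Hz supply (1500 rpm synchronous)
# running at 1470 rpm, which gives a slip of 0.02
s = slip(1500, 1470)
lower_hz, upper_hz = rotor_bar_sidebands(50.0, s)  # roughly 48 Hz and 52 Hz
```

The amplitude of the current spectrum at these frequencies, relative to the supply-frequency peak, can then serve as a feature for the models discussed later.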

Acoustic

Ultrasonic and audio sensors detect leaks, bearing defects, and electrical discharge not visible to other sensors.

Building a Predictive Maintenance Model

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Load sensor data with failure labels, ordered in time so that the
# rolling windows and the temporal split below are meaningful
data = pd.read_csv('machine_sensors.csv', parse_dates=['timestamp'])
data = data.sort_values('timestamp').reset_index(drop=True)

# Feature engineering: rolling statistics
# Assumes one row per minute from a single machine, so 60 rows = 1 hour
for col in ['vibration', 'temperature', 'pressure']:
    data[f'{col}_mean_1h'] = data[col].rolling(60).mean()
    data[f'{col}_std_1h'] = data[col].rolling(60).std()
    data[f'{col}_max_1h'] = data[col].rolling(60).max()
    data[f'{col}_trend'] = data[col].diff(periods=30)  # change over 30 rows

# Drop the warm-up rows where the rolling windows are not yet full
data = data.dropna()

# Prepare features and labels
feature_cols = [c for c in data.columns if c not in ['timestamp', 'failure']]
X = data[feature_cols]
y = data['failure']  # 0=normal, 1=failure within 24h

# Train/test split (temporal): no shuffling, so the test set is the
# most recent 20% of the rows
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False
)

# Train model; class_weight='balanced' compensates for rare failure rows
model = RandomForestClassifier(
    n_estimators=100, class_weight='balanced', random_state=42
)
model.fit(X_train, y_train)

# Evaluate: with rare failures, read precision and recall for class 1
# rather than overall accuracy
predictions = model.predict(X_test)
print(classification_report(y_test, predictions))
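The example above assumes the `failure` column already means "failure within 24h". If the raw data only records when failures happened, that label has to be built first. A minimal sketch, assuming a list of failure timestamps is available; the helper `label_failure_within` and the hourly example data are hypothetical:

```python
import numpy as np
import pandas as pd

def label_failure_within(timestamps, failure_times, horizon="24h"):
    """Return 1 for rows where a failure occurs within `horizon` after the row."""
    ts = pd.Series(pd.to_datetime(timestamps)).reset_index(drop=True)
    window = pd.Timedelta(horizon)
    labels = pd.Series(0, index=ts.index, name="failure")
    for failure_time in pd.to_datetime(failure_times):
        # Mark every reading taken in the `horizon` leading up to the failure
        in_window = (ts > failure_time - window) & (ts <= failure_time)
        labels.loc[in_window] = 1
    return labels

# Assumed example: three days of hourly readings, one failure at the start of day 3
timestamps = pd.Timestamp("2024-01-01") + pd.to_timedelta(np.arange(72), unit="h")
labels = label_failure_within(timestamps, ["2024-01-03 00:00"])
```

The horizon is a design choice: a longer window gives the maintenance team more lead time but makes the positive class noisier. Rows recorded while the machine is down for repair may also need to be excluded, which this sketch does not do.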

Remaining Useful Life (RUL) Estimation

Instead of binary failure prediction, RUL models estimate how many hours/cycles of useful life remain:

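from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, LSTM, Dense, Dropout

# X_train_seq: array of shape (n_samples, sequence_length, n_features)
# y_train_rul: array of shape (n_samples,), remaining cycles at the end
# of each window. Both are assumed to have been prepared beforehand
# from run-to-failure data.
sequence_length, n_features = X_train_seq.shape[1:]

# LSTM model for RUL prediction
model = Sequential([
    Input(shape=(sequence_length, n_features)),
    LSTM(64, return_sequences=True),  # pass the full sequence to the next LSTM
    Dropout(0.2),
    LSTM(32),
    Dropout(0.2),
    Dense(16, activation='relu'),
    Dense(1)  # Predicted remaining cycles
])

model.compile(optimizer='adam', loss='mse', metrics=['mae'])
# validation_split holds out the last 20% of the samples, taken before shuffling
model.fit(X_train_seq, y_train_rul, epochs=50,
          batch_size=32, validation_split=0.2)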
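The LSTM above takes `X_train_seq` and `y_train_rul` as given. One way to build them from a single run-to-failure recording is a sliding window, sketched below. The helper `make_rul_windows`, the optional `max_rul` cap, and the random stand-in data are assumptions for illustration only.

```python
import numpy as np

def make_rul_windows(run, sequence_length, max_rul=None):
    """Turn one run-to-failure array (n_cycles, n_features) into LSTM inputs.

    The last row of `run` is taken to be the final cycle before failure.
    """
    run = np.asarray(run, dtype=float)
    n_cycles = run.shape[0]
    X, y = [], []
    for end in range(sequence_length, n_cycles + 1):
        X.append(run[end - sequence_length:end])
        rul = n_cycles - end  # cycles left after the window's last row
        # Optionally cap the target so early, healthy cycles share one value
        y.append(min(rul, max_rul) if max_rul is not None else rul)
    return np.stack(X), np.array(y, dtype=float)

# Stand-in data: 100 cycles of 3 sensor channels (random, for shape checking only)
run = np.random.default_rng(0).normal(size=(100, 3))
X_train_seq, y_train_rul = make_rul_windows(run, sequence_length=30, max_rul=50)
```

With several machines, the windows would be built per run and then concatenated, so that no window spans two different units. The function expects at least `sequence_length` cycles in each run.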

ML Approaches for PdM

| Approach | Method | Best For |
| --- | --- | --- |
| Classification | Random Forest, XGBoost | Failure/no-failure prediction |
| Anomaly detection | Isolation Forest, Autoencoders | Detecting unknown failure modes |
| RUL estimation | LSTM, Transformer | Predicting remaining useful life |
| Survival analysis | Cox PH, RSF | Time-to-event modeling |

Key takeaway: Start with anomaly detection (unsupervised) if you don't have labeled failure data. Once you collect enough failure events, train supervised classification or RUL models. Feature engineering on rolling statistics is often more important than model complexity.
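As a starting point for the unlabeled case, the sketch below fits an Isolation Forest on readings assumed to be healthy and scores new readings against them. The two synthetic features (vibration RMS and temperature), their distributions, and the 1% alert quantile are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)
# Synthetic stand-ins: columns are [vibration RMS, temperature in °C]
normal = rng.normal(loc=[0.5, 60.0], scale=[0.05, 2.0], size=(500, 2))
faulty = rng.normal(loc=[1.5, 90.0], scale=[0.05, 2.0], size=(5, 2))

# Fit on data assumed to come from healthy operation only
detector = IsolationForest(n_estimators=200, random_state=42)
detector.fit(normal)

# score_samples: lower values mean more anomalous
normal_scores = detector.score_samples(normal)
faulty_scores = detector.score_samples(faulty)

# Flag readings that score below a chosen quantile of the healthy scores
threshold = np.quantile(normal_scores, 0.01)
alerts = faulty_scores < threshold
```

The alert threshold trades false alarms against missed faults and would normally be tuned against whatever maintenance records exist.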