Other Algorithms & Master Comparison

Anomaly detection, sequence labeling, association rules, and more

This final section covers important algorithms that do not fit neatly into the previous categories: probabilistic graphical models, anomaly detection methods, association rule mining, and self-organizing maps. The section concludes with the master comparison table of all 100+ algorithms.

1. Hidden Markov Model (HMM)

Description: A probabilistic model for sequential data with hidden (latent) states. Assumes the system transitions between hidden states according to a Markov chain (transition probabilities), and each state emits an observable output (emission probabilities). Three key problems: evaluation (forward algorithm), decoding (Viterbi), and learning (Baum-Welch / EM).

Use Cases: Speech recognition, part-of-speech tagging, gene sequence analysis, handwriting recognition, financial regime detection.

from hmmlearn.hmm import GaussianHMM
import numpy as np

# Generate sequential data
np.random.seed(42)
n_samples = 300
# Simulate two hidden regimes (e.g., bull/bear market); for simplicity the
# states are drawn i.i.d. here rather than from a true Markov chain
state_seq = np.random.choice([0, 1], size=n_samples, p=[0.6, 0.4])
observations = np.where(state_seq == 0,
    np.random.normal(0.05, 0.02, n_samples),
    np.random.normal(-0.03, 0.04, n_samples)
).reshape(-1, 1)

model = GaussianHMM(
    n_components=2,           # Number of hidden states
    covariance_type='full',
    n_iter=100,
    random_state=42
)
model.fit(observations)

# Decode: find most likely hidden state sequence
hidden_states = model.predict(observations)
print(f"Transition matrix:\n{model.transmat_.round(3)}")
print(f"Means: {model.means_.flatten().round(4)}")
print(f"Score (log-likelihood): {model.score(observations):.2f}")
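The Viterbi decoding that `model.predict` performs can be sketched directly. This is a minimal NumPy implementation for a toy two-state HMM with discrete emissions; the parameters are made up for illustration and are not the fitted model above:

```python
import numpy as np

# Toy 2-state HMM with discrete emissions (illustrative parameters)
start = np.log(np.array([0.6, 0.4]))          # initial state log-probs
trans = np.log(np.array([[0.7, 0.3],
                         [0.4, 0.6]]))        # transition log-probs
emit = np.log(np.array([[0.9, 0.1],           # P(symbol | state)
                        [0.2, 0.8]]))

def viterbi(obs, start, trans, emit):
    """Most likely hidden state sequence for an observed symbol sequence."""
    n_states, T = len(start), len(obs)
    delta = np.zeros((T, n_states))           # best log-prob ending in state
    psi = np.zeros((T, n_states), dtype=int)  # backpointers
    delta[0] = start + emit[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + trans  # shape (from_state, to_state)
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + emit[:, obs[t]]
    # Backtrack from the best final state
    path = [delta[-1].argmax()]
    for t in range(T - 1, 0, -1):
        path.append(psi[t][path[-1]])
    return path[::-1]

print(viterbi([0, 0, 1, 1, 0], start, trans, emit))
```

The same dynamic-programming table with `max` replaced by `sum` gives the forward algorithm for the evaluation problem.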

2. Conditional Random Field (CRF)

Description: A discriminative probabilistic model for labeling sequential data. Unlike HMMs (generative), CRFs directly model the conditional probability P(labels | observations), allowing them to use arbitrary features of the input. CRFs define a global normalization factor, avoiding the label bias problem of MEMMs.

Use Cases: Named entity recognition (NER), POS tagging, information extraction, image segmentation.

import sklearn_crfsuite
from sklearn_crfsuite import metrics

# CRF for sequence labeling (e.g., NER)
def word_to_features(sentence, i):
    word = sentence[i]
    features = {
        'word.lower()': word.lower(),
        'word[-3:]': word[-3:],
        'word[-2:]': word[-2:],
        'word.isupper()': word.isupper(),
        'word.istitle()': word.istitle(),
        'word.isdigit()': word.isdigit(),
    }
    if i > 0:
        features['prev_word'] = sentence[i-1].lower()
    if i < len(sentence) - 1:
        features['next_word'] = sentence[i+1].lower()
    return features

# Example training data
X_train = [[word_to_features(s, i) for i in range(len(s))]
           for s in [["John", "lives", "in", "New", "York"]]]
y_train = [["B-PER", "O", "O", "B-LOC", "I-LOC"]]

crf = sklearn_crfsuite.CRF(
    algorithm='lbfgs',
    c1=0.1,                   # L1 regularization
    c2=0.1,                   # L2 regularization
    max_iterations=100
)
crf.fit(X_train, y_train)
print(f"Labels: {crf.classes_}")
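The global normalization mentioned above can be made concrete for a linear-chain CRF: a label sequence's score is a sum of emission and transition scores, and the partition function Z sums exp-scores over every possible label sequence. This toy NumPy sketch (arbitrary random scores, not sklearn-crfsuite internals) checks the brute-force Z against the O(T·L²) forward recursion:

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
T, L = 3, 2
emission = rng.normal(size=(T, L))     # score of label l at position t
transition = rng.normal(size=(L, L))   # score of label i -> label j

def sequence_score(labels):
    s = emission[0, labels[0]]
    for t in range(1, T):
        s += transition[labels[t - 1], labels[t]] + emission[t, labels[t]]
    return s

# Brute force: sum exp-scores over all L**T label sequences
Z = sum(np.exp(sequence_score(seq)) for seq in product(range(L), repeat=T))

# Forward recursion computes the same partition function in O(T * L^2)
alpha = np.exp(emission[0])
for t in range(1, T):
    alpha = np.exp(emission[t]) * (alpha @ np.exp(transition))
Z_forward = alpha.sum()

p = np.exp(sequence_score((0, 1, 0))) / Z  # a globally normalized probability
print(Z, Z_forward, p)
```

Because Z is computed over whole sequences rather than per position, no state can hog probability mass locally, which is exactly how CRFs avoid the label bias problem.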

3. Isolation Forest

Description: An anomaly detection algorithm based on the principle that anomalies are "few and different" and therefore easier to isolate. Randomly selects a feature and split value to partition data; anomalies require fewer splits to be isolated (shorter path length in the tree).

Use Cases: Fraud detection, network intrusion detection, manufacturing defect detection, any unsupervised anomaly detection task.

from sklearn.ensemble import IsolationForest
import numpy as np

# Normal data + anomalies
np.random.seed(42)
X_normal = np.random.randn(200, 2)
X_anomaly = np.random.uniform(-4, 4, (20, 2))
X = np.vstack([X_normal, X_anomaly])

model = IsolationForest(
    n_estimators=100,
    contamination=0.1,        # Expected proportion of anomalies
    max_features=1.0,
    random_state=42
)
predictions = model.fit_predict(X)

n_anomalies = (predictions == -1).sum()
print(f"Detected anomalies: {n_anomalies} / {len(X)}")
print(f"Anomaly scores (first 5 normal): {model.score_samples(X[:5]).round(3)}")
print(f"Anomaly scores (first 5 anomaly): {model.score_samples(X[200:205]).round(3)}")
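The path-length principle turns into a score as follows: an average path length E[h(x)] over the trees is normalized by c(n), the expected path length of an unsuccessful BST search over n points, giving s(x, n) = 2^(−E[h(x)]/c(n)). A short sketch of that normalization (sklearn's `score_samples` reports a negated variant of this quantity):

```python
import numpy as np

def c(n):
    """Expected path length of an unsuccessful BST search over n points."""
    if n <= 1:
        return 0.0
    harmonic = np.log(n - 1) + 0.5772156649  # H(n-1) via Euler-Mascheroni
    return 2.0 * harmonic - 2.0 * (n - 1) / n

def anomaly_score(avg_path_length, n):
    """s(x, n) = 2^(-E[h(x)] / c(n)): near 1 -> anomaly, well below 0.5 -> normal."""
    return 2.0 ** (-avg_path_length / c(n))

n = 256  # typical subsample size per tree
print(round(anomaly_score(4.0, n), 3))   # short average path -> anomalous
print(round(anomaly_score(12.0, n), 3))  # long average path -> normal
```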

4. Local Outlier Factor (LOF)

Description: Detects anomalies by measuring the local density deviation of a point relative to its neighbors. A point with substantially lower density than its neighbors is considered an outlier. The LOF score quantifies how much more (or less) dense a point's neighborhood is compared to its neighbors' neighborhoods.

Use Cases: Outlier detection in datasets with varying densities, fraud detection, sensor data cleaning.

from sklearn.neighbors import LocalOutlierFactor

model = LocalOutlierFactor(
    n_neighbors=20,
    contamination=0.1,
    metric='euclidean',
    novelty=False             # True for prediction on new data
)
predictions = model.fit_predict(X)  # X from the Isolation Forest example above

n_outliers = (predictions == -1).sum()
print(f"Outliers detected: {n_outliers}")
print(f"LOF scores (first 5): {-model.negative_outlier_factor_[:5].round(3)}")
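The LOF score itself can be computed from the definition: k-distances, reachability distances reach(a, b) = max(k_dist(b), d(a, b)), local reachability densities, and finally the ratio of neighbor density to own density. A from-scratch sketch on a tiny dataset (a direct, unoptimized reading of the definition, not sklearn's implementation):

```python
import numpy as np

def lof_scores(X, k):
    """Local Outlier Factor computed directly from its definition."""
    n = len(X)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # k nearest neighbors of each point (excluding itself)
    knn = np.argsort(D, axis=1)[:, 1:k + 1]
    k_dist = D[np.arange(n), knn[:, -1]]      # distance to k-th neighbor
    # Local reachability density: inverse mean reachability distance,
    # where reach(a, b) = max(k_dist(b), d(a, b))
    lrd = np.empty(n)
    for a in range(n):
        reach = np.maximum(k_dist[knn[a]], D[a, knn[a]])
        lrd[a] = 1.0 / reach.mean()
    # LOF: average neighbor density relative to the point's own density
    return np.array([lrd[knn[a]].mean() / lrd[a] for a in range(n)])

X = np.array([[0.0], [0.1], [0.2], [0.3], [5.0]])  # last point is isolated
scores = lof_scores(X, k=2)
print(scores.round(2))
```

Points inside the tight cluster score close to 1 (their density matches their neighbors'), while the isolated point scores far above 1.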

5. One-Class SVM

Description: An unsupervised anomaly detection variant of SVM. Learns a decision boundary that encloses most of the training data in feature space. Points outside this boundary are classified as anomalies. Works well in high-dimensional spaces.

Use Cases: Novelty detection, when only "normal" data is available for training, high-dimensional anomaly detection.

from sklearn.svm import OneClassSVM
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

model = make_pipeline(
    StandardScaler(),
    OneClassSVM(
        kernel='rbf',
        gamma='scale',
        nu=0.1                # Upper bound on fraction of outliers
    )
)
model.fit(X_normal)  # Train on normal data only

# Predict on all data
predictions = model.predict(X)
n_anomalies = (predictions == -1).sum()
print(f"Anomalies detected: {n_anomalies} / {len(X)}")

6. Self-Organizing Map (SOM)

Description: An unsupervised neural network that produces a low-dimensional (typically 2D) discretized representation of the input space. Neurons are arranged in a grid, and each neuron has a weight vector. Training uses competitive learning: the best-matching neuron and its neighbors update their weights towards the input, preserving topological relationships.

Use Cases: Visualization of high-dimensional data, customer segmentation, document organization, exploratory data analysis.

from minisom import MiniSom
import numpy as np

# Create and train SOM
np.random.seed(42)
data = np.random.rand(500, 4)  # 500 samples, 4 features

som = MiniSom(
    x=10, y=10,               # 10x10 grid
    input_len=4,
    sigma=1.0,                 # Neighborhood radius
    learning_rate=0.5,
    random_seed=42
)
som.random_weights_init(data)
som.train_random(data, num_iteration=1000)

# Find best matching unit for each sample
winners = [som.winner(d) for d in data[:5]]
print(f"Grid size: 10x10 = 100 neurons")
print(f"First 5 sample mappings: {winners}")
print(f"Quantization error: {som.quantization_error(data):.4f}")

7. Restricted Boltzmann Machine (RBM)

Description: A two-layer stochastic neural network with visible and hidden units, with no connections within a layer (the "restricted" part). Learns a probability distribution over the input using contrastive divergence. Can be stacked to form Deep Belief Networks.

Use Cases: Feature learning, dimensionality reduction, collaborative filtering, pretraining deep networks (historically).

from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X = X / 16.0  # Scale to [0, 1]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# RBM for feature extraction + Logistic Regression for classification
rbm = BernoulliRBM(
    n_components=100,
    learning_rate=0.01,
    n_iter=20,
    random_state=42,
    verbose=0
)
rbm.fit(X_train)

# Transform features
X_train_rbm = rbm.transform(X_train)
X_test_rbm = rbm.transform(X_test)
print(f"Original features: {X_train.shape[1]}")
print(f"RBM features: {X_train_rbm.shape[1]}")
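The contrastive divergence training mentioned above can be sketched as a CD-1 step: a positive phase on the data, one Gibbs step to get a reconstruction, and a gradient update from the difference of the two correlation terms. A toy NumPy version on random binary data (an illustrative sketch, not `BernoulliRBM`'s internals):

```python
import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden = 64, 100
W = rng.normal(0.0, 0.01, (n_visible, n_hidden))
b_v = np.zeros(n_visible)           # visible-unit biases
b_h = np.zeros(n_hidden)            # hidden-unit biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b_v, b_h, lr=0.05):
    """One CD-1 update (in place) on a batch; returns reconstruction MSE."""
    # Positive phase: hidden probabilities and a binary sample given the data
    p_h0 = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # Negative phase: one Gibbs step down to the visible layer and back up
    p_v1 = sigmoid(h0 @ W.T + b_v)
    p_h1 = sigmoid(p_v1 @ W + b_h)
    # Gradient approximation: <v h>_data - <v h>_reconstruction
    batch = len(v0)
    W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / batch
    b_v += lr * (v0 - p_v1).mean(axis=0)
    b_h += lr * (p_h0 - p_h1).mean(axis=0)
    return np.mean((v0 - p_v1) ** 2)

v = (rng.random((200, n_visible)) < 0.3).astype(float)  # toy binary data
errs = [cd1_step(v, W, b_v, b_h) for _ in range(50)]
print(f"Reconstruction error: {errs[0]:.4f} -> {errs[-1]:.4f}")
```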

8. K-Medoids (PAM)

Description: Similar to K-Means but uses actual data points (medoids) as cluster centers instead of means. This makes it more robust to outliers and lets it work with any distance metric (not just Euclidean). The classic implementation is the Partitioning Around Medoids (PAM) algorithm.

Use Cases: When cluster centers should be actual data points, when using non-Euclidean distances, outlier-robust clustering.

from sklearn_extra.cluster import KMedoids
from sklearn.metrics import silhouette_score

model = KMedoids(
    n_clusters=4,
    metric='euclidean',       # Works with any metric
    method='pam',             # 'pam' or 'alternate'
    init='k-medoids++',
    random_state=42
)
labels = model.fit_predict(X_normal)  # X_normal from the anomaly examples above

print(f"Medoid indices: {model.medoid_indices_}")
print(f"Inertia: {model.inertia_:.2f}")
print(f"Silhouette: {silhouette_score(X_normal, labels):.4f}")
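The simpler 'alternate' variant can be sketched from scratch: assign each point to its nearest medoid, then re-pick each medoid as the cluster member minimizing total within-cluster distance (full PAM instead evaluates swaps between medoids and non-medoids). An illustrative NumPy sketch on two synthetic blobs:

```python
import numpy as np

def k_medoids(X, k, n_iter=10, seed=0):
    """Alternating K-Medoids: assign to nearest medoid, then re-pick medoids."""
    rng = np.random.default_rng(seed)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # pairwise distances
    medoids = rng.choice(len(X), size=k, replace=False)
    for _ in range(n_iter):
        labels = D[:, medoids].argmin(axis=1)        # assignment step
        for j in range(k):
            members = np.where(labels == j)[0]
            if len(members):
                # Update step: member minimizing total distance to its cluster
                within = D[np.ix_(members, members)].sum(axis=1)
                medoids[j] = members[within.argmin()]
    return medoids, D[:, medoids].argmin(axis=1)

# Two well-separated blobs; one medoid should land in each
X = np.vstack([np.random.default_rng(1).normal(0.0, 0.3, (30, 2)),
               np.random.default_rng(2).normal(5.0, 0.3, (30, 2))])
medoids, labels = k_medoids(X, k=2)
print(f"Medoid indices: {medoids}, cluster sizes: {np.bincount(labels)}")
```

Because only the precomputed distance matrix D is used, swapping in any other metric (Manhattan, cosine, edit distance) requires no change to the algorithm itself.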

9. Apriori Algorithm

Description: An association rule mining algorithm that finds frequent itemsets in transactional databases. Uses a bottom-up approach: first finds frequent individual items, then extends to pairs, triples, etc. The key principle: any subset of a frequent itemset must also be frequent (anti-monotone property), enabling efficient pruning.

Use Cases: Market basket analysis ("customers who bought X also bought Y"), cross-selling, web usage mining, medical diagnosis co-occurrences.

from mlxtend.frequent_patterns import apriori, association_rules
import pandas as pd

# Transaction data (one-hot encoded)
data = {
    'bread':  [1, 1, 0, 1, 1, 0, 1, 1],
    'butter': [0, 1, 1, 1, 1, 0, 1, 0],
    'milk':   [1, 0, 1, 1, 0, 1, 1, 1],
    'eggs':   [0, 1, 0, 0, 1, 1, 1, 0],
    'cheese': [0, 0, 1, 1, 0, 1, 0, 1],
}
df = pd.DataFrame(data).astype(bool)  # mlxtend expects boolean one-hot data

# Find frequent itemsets (min support = 40%)
frequent = apriori(df, min_support=0.4, use_colnames=True)
print("Frequent itemsets:")
print(frequent)

# Generate association rules
rules = association_rules(frequent, metric='confidence', min_threshold=0.6)
print("\nAssociation rules:")
print(rules[['antecedents', 'consequents', 'support', 'confidence', 'lift']])
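The level-wise search with anti-monotone pruning can be written out in pure Python. The transactions below are the same eight baskets as the one-hot table above, expressed as sets; this is an illustrative, unoptimized sketch rather than mlxtend's implementation:

```python
from itertools import combinations

transactions = [
    {'bread', 'milk'}, {'bread', 'butter', 'eggs'},
    {'butter', 'milk', 'cheese'}, {'bread', 'butter', 'milk', 'cheese'},
    {'bread', 'butter', 'eggs'}, {'milk', 'eggs', 'cheese'},
    {'bread', 'butter', 'milk', 'eggs'}, {'bread', 'milk', 'cheese'},
]

def apriori_sets(transactions, min_support):
    n = len(transactions)

    def support(items):
        return sum(items <= t for t in transactions) / n

    # Level 1: frequent individual items
    singles = {i for t in transactions for i in t}
    levels = [{frozenset([i]) for i in singles
               if support(frozenset([i])) >= min_support}]
    k = 2
    while levels[-1]:
        # Candidate generation: unions of frequent (k-1)-itemsets ...
        candidates = {a | b for a in levels[-1] for b in levels[-1]
                      if len(a | b) == k}
        # ... pruned by the anti-monotone property: every (k-1)-subset
        # of a frequent k-itemset must itself be frequent
        candidates = {c for c in candidates
                      if all(frozenset(s) in levels[-1]
                             for s in combinations(c, k - 1))}
        levels.append({c for c in candidates if support(c) >= min_support})
        k += 1
    return [s for level in levels for s in level]

frequent = apriori_sets(transactions, min_support=0.4)
print(sorted(map(sorted, frequent)))
```

Note how pruning pays off: {bread, butter, milk} is never counted against the database because its subset {butter, milk} already failed the support threshold.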

10. FP-Growth

Description: An improved alternative to Apriori that avoids the expensive candidate generation step. Compresses the database into a Frequent Pattern tree (FP-tree), then extracts frequent itemsets directly from this compact structure. Significantly faster than Apriori for large datasets.

Use Cases: Same as Apriori but for larger datasets, real-time association mining, when Apriori is too slow.

from mlxtend.frequent_patterns import fpgrowth, association_rules

# FP-Growth (same interface as Apriori but faster)
frequent_fp = fpgrowth(df, min_support=0.4, use_colnames=True)
print("FP-Growth frequent itemsets:")
print(frequent_fp)

# Generate rules
rules_fp = association_rules(frequent_fp, metric='lift', min_threshold=1.0)
print(f"\nRules with lift > 1.0: {len(rules_fp)}")
for _, row in rules_fp.iterrows():
    print(f"  {set(row['antecedents'])} => {set(row['consequents'])} "
          f"(conf={row['confidence']:.2f}, lift={row['lift']:.2f})")
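The confidence and lift columns printed by `association_rules` reduce to simple support ratios: confidence = supp(A ∪ B) / supp(A) and lift = confidence / supp(B). A pure-Python sketch on the same eight baskets as the Apriori example:

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, and lift for the rule antecedent => consequent."""
    n = len(transactions)

    def supp(items):
        return sum(items <= t for t in transactions) / n

    both = supp(antecedent | consequent)
    confidence = both / supp(antecedent)   # P(consequent | antecedent)
    lift = confidence / supp(consequent)   # > 1 means positive association
    return both, confidence, lift

transactions = [
    {'bread', 'milk'}, {'bread', 'butter', 'eggs'},
    {'butter', 'milk', 'cheese'}, {'bread', 'butter', 'milk', 'cheese'},
    {'bread', 'butter', 'eggs'}, {'milk', 'eggs', 'cheese'},
    {'bread', 'butter', 'milk', 'eggs'}, {'bread', 'milk', 'cheese'},
]
s, c, l = rule_metrics(transactions, {'milk'}, {'cheese'})
print(f"support={s:.3f}, confidence={c:.3f}, lift={l:.3f}")
# -> support=0.500, confidence=0.667, lift=1.333
```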

Master Comparison Table: All 100+ Algorithms

The complete reference table of every algorithm in this directory, organized by category.

Regression (15)

| # | Algorithm | Type | Interpretability | Scalability | Key Library |
|---|---|---|---|---|---|
| 1 | Linear Regression | Supervised | High | High | sklearn |
| 2 | Polynomial Regression | Supervised | Medium | Medium | sklearn |
| 3 | Ridge Regression | Supervised | High | High | sklearn |
| 4 | Lasso Regression | Supervised | High | High | sklearn |
| 5 | Elastic Net | Supervised | High | High | sklearn |
| 6 | Bayesian Linear Regression | Supervised | High | Medium | sklearn |
| 7 | SVR | Supervised | Low | Low | sklearn |
| 8 | Decision Tree Regression | Supervised | High | Medium | sklearn |
| 9 | Random Forest Regression | Supervised | Medium | High | sklearn |
| 10 | Gradient Boosting Regression | Supervised | Low | Medium | sklearn |
| 11 | XGBoost Regression | Supervised | Low | High | xgboost |
| 12 | LightGBM Regression | Supervised | Low | Very High | lightgbm |
| 13 | CatBoost Regression | Supervised | Low | High | catboost |
| 14 | Quantile Regression | Supervised | High | Medium | sklearn |
| 15 | Poisson Regression | Supervised | High | High | sklearn |

Classification (17)

| # | Algorithm | Type | Interpretability | Scalability | Key Library |
|---|---|---|---|---|---|
| 16 | Logistic Regression | Supervised | High | High | sklearn |
| 17 | KNN | Supervised | Medium | Low | sklearn |
| 18 | SVM | Supervised | Low | Low | sklearn |
| 19 | Decision Tree Classifier | Supervised | High | Medium | sklearn |
| 20 | Random Forest Classifier | Supervised | Medium | High | sklearn |
| 21 | Gaussian Naive Bayes | Supervised | High | Very High | sklearn |
| 22 | Bernoulli Naive Bayes | Supervised | High | Very High | sklearn |
| 23 | Multinomial Naive Bayes | Supervised | High | Very High | sklearn |
| 24 | Gradient Boosting Classifier | Supervised | Low | Medium | sklearn |
| 25 | AdaBoost | Supervised | Medium | Medium | sklearn |
| 26 | XGBoost Classifier | Supervised | Low | High | xgboost |
| 27 | LightGBM Classifier | Supervised | Low | Very High | lightgbm |
| 28 | CatBoost Classifier | Supervised | Low | High | catboost |
| 29 | SGD Classifier | Supervised | High | Very High | sklearn |
| 30 | Perceptron | Supervised | High | Very High | sklearn |
| 31 | Passive Aggressive | Supervised | Medium | Very High | sklearn |
| 32 | Naive Bayes (General) | Supervised | High | Very High | sklearn |

Clustering (11)

| # | Algorithm | Type | Requires k? | Scalability | Key Library |
|---|---|---|---|---|---|
| 33 | K-Means | Unsupervised | Yes | High | sklearn |
| 34 | Mini-Batch K-Means | Unsupervised | Yes | Very High | sklearn |
| 35 | Hierarchical Clustering | Unsupervised | Optional | Low | scipy |
| 36 | Agglomerative Clustering | Unsupervised | Yes | Medium | sklearn |
| 37 | DBSCAN | Unsupervised | No | Medium | sklearn |
| 38 | OPTICS | Unsupervised | No | Medium | sklearn |
| 39 | Mean Shift | Unsupervised | No | Low | sklearn |
| 40 | Spectral Clustering | Unsupervised | Yes | Low | sklearn |
| 41 | GMM | Unsupervised | Yes | Medium | sklearn |
| 42 | BIRCH | Unsupervised | Yes | Very High | sklearn |
| 43 | Affinity Propagation | Unsupervised | No | Low | sklearn |

Dimensionality Reduction (10)

| # | Algorithm | Type | Linear? | Scalability | Key Library |
|---|---|---|---|---|---|
| 44 | PCA | Unsupervised | Yes | High | sklearn |
| 45 | Kernel PCA | Unsupervised | No | Medium | sklearn |
| 46 | LDA | Supervised | Yes | High | sklearn |
| 47 | t-SNE | Unsupervised | No | Low | sklearn |
| 48 | UMAP | Unsupervised | No | High | umap-learn |
| 49 | ICA | Unsupervised | Yes | Medium | sklearn |
| 50 | Factor Analysis | Unsupervised | Yes | Medium | sklearn |
| 51 | NMF | Unsupervised | Yes | Medium | sklearn |
| 52 | Isomap | Unsupervised | No | Low | sklearn |
| 53 | LLE | Unsupervised | No | Low | sklearn |

Ensemble Methods (7)

| # | Algorithm | Strategy | Reduces | Scalability | Key Library |
|---|---|---|---|---|---|
| 54 | Bagging | Parallel | Variance | High | sklearn |
| 55 | Boosting | Sequential | Bias | Medium | sklearn |
| 56 | Random Forest | Bagging + feature random | Variance | High | sklearn |
| 57 | Gradient Boosting | Sequential | Both | Medium | sklearn |
| 58 | AdaBoost | Sequential | Bias | Medium | sklearn |
| 59 | Stacking | Meta-learning | Both | Low | sklearn |
| 60 | Voting | Aggregation | Variance | Medium | sklearn |

Reinforcement Learning (14)

| # | Algorithm | Category | Action Space | On/Off Policy | Key Library |
|---|---|---|---|---|---|
| 61 | Q-Learning | Value-based | Discrete | Off | Custom |
| 62 | SARSA | Value-based | Discrete | On | Custom |
| 63 | DQN | Value-based | Discrete | Off | PyTorch/TF |
| 64 | Double DQN | Value-based | Discrete | Off | PyTorch/TF |
| 65 | Dueling DQN | Value-based | Discrete | Off | PyTorch/TF |
| 66 | Policy Gradient | Policy-based | Both | On | PyTorch/TF |
| 67 | REINFORCE | Policy-based | Both | On | PyTorch/TF |
| 68 | Actor-Critic | Actor-Critic | Both | Both | PyTorch/TF |
| 69 | A3C | Actor-Critic | Both | On | PyTorch/TF |
| 70 | PPO | Actor-Critic | Both | On | stable-baselines3 |
| 71 | TRPO | Actor-Critic | Both | On | sb3-contrib |
| 72 | DDPG | Actor-Critic | Continuous | Off | stable-baselines3 |
| 73 | TD3 | Actor-Critic | Continuous | Off | stable-baselines3 |
| 74 | SAC | Actor-Critic | Continuous | Off | stable-baselines3 |

Neural Networks & Deep Learning (14)

| # | Architecture | Data Type | Year | Key Library |
|---|---|---|---|---|
| 75 | ANN | Tabular | 1943 | PyTorch/TF |
| 76 | Feedforward NN | Tabular | 1986 | sklearn/PyTorch |
| 77 | MLP | Tabular | 1986 | sklearn/PyTorch |
| 78 | CNN | Images | 1989 | PyTorch/TF |
| 79 | RNN | Sequential | 1986 | PyTorch/TF |
| 80 | LSTM | Sequential | 1997 | PyTorch/TF |
| 81 | GRU | Sequential | 2014 | PyTorch/TF |
| 82 | Transformer | Sequential/Any | 2017 | PyTorch/TF |
| 83 | GNN | Graphs | 2009 | PyG/DGL |
| 84 | GCN | Graphs | 2017 | PyG/DGL |
| 85 | GAT | Graphs | 2018 | PyG/DGL |
| 86 | Autoencoder | Any | 1986 | PyTorch/TF |
| 87 | VAE | Any | 2013 | PyTorch/TF |
| 88 | GAN | Any | 2014 | PyTorch/TF |

Time Series (5) & Recommendation (4)

| # | Algorithm | Category | Scalability | Key Library |
|---|---|---|---|---|
| 89 | ARIMA | Time Series | Medium | statsmodels |
| 90 | SARIMA | Time Series | Medium | statsmodels |
| 91 | Prophet | Time Series | High | prophet |
| 92 | Holt-Winters | Time Series | High | statsmodels |
| 93 | State Space Models | Time Series | Medium | statsmodels |
| 94 | Collaborative Filtering | Recommendation | Medium | surprise |
| 95 | Content-Based Filtering | Recommendation | High | sklearn |
| 96 | Matrix Factorization | Recommendation | High | surprise/custom |
| 97 | Factorization Machines | Recommendation | High | PyTorch/xlearn |

Other Algorithms (10)

| # | Algorithm | Category | Scalability | Key Library |
|---|---|---|---|---|
| 98 | HMM | Probabilistic / Sequence | Medium | hmmlearn |
| 99 | CRF | Probabilistic / Sequence | Medium | sklearn-crfsuite |
| 100 | Isolation Forest | Anomaly Detection | High | sklearn |
| 101 | Local Outlier Factor | Anomaly Detection | Medium | sklearn |
| 102 | One-Class SVM | Anomaly Detection | Low | sklearn |
| 103 | Self-Organizing Map | Unsupervised / Visualization | Medium | minisom |
| 104 | RBM | Unsupervised / Generative | Medium | sklearn |
| 105 | K-Medoids | Clustering | Medium | sklearn-extra |
| 106 | Apriori | Association Rules | Low | mlxtend |
| 107 | FP-Growth | Association Rules | Medium | mlxtend |

Total: 107 algorithms across 10 categories. This directory provides a comprehensive reference for selecting, understanding, and implementing machine learning algorithms. Use the sidebar to navigate to any category, and refer to the Overview page for the selection guide.