Advanced

caret & mlr3

Explore alternative ML frameworks in R: caret's unified interface and mlr3's modern object-oriented approach.

caret Package Overview

caret (Classification And REgression Training) was the dominant ML framework in R for over a decade. It provides a unified interface to 200+ models.
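As a rough illustration of what the unified interface means in practice, the sketch below looks up the tuning parameters behind a few method strings. It assumes only that caret is installed; the three method codes are the ones used later on this page, and the model count depends on the installed caret version.

```r
library(caret)

# Each method string maps to a model definition with its own tuning parameters
modelLookup("rf")         # random forest: mtry
modelLookup("knn")        # k-nearest neighbours: k
modelLookup("svmRadial")  # radial-kernel SVM: sigma and C

# Number of model codes bundled with the installed caret version
n_models <- length(getModelInfo())
n_models
```

Switching algorithms in train() then comes down to changing the method string, although each model may still need its own backing package installed.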

💡

Note: caret is in maintenance mode. For new projects, tidymodels is recommended. However, you will encounter caret in existing codebases and tutorials, so it is worth understanding.

The train() Function

library(caret)

# Train a random forest with cross-validation
set.seed(42)
ctrl <- trainControl(method = "cv", number = 10)

model <- train(
  Species ~ .,
  data = iris,
  method = "rf",          # Random forest
  trControl = ctrl,
  tuneLength = 5          # Up to 5 candidate values per tuning parameter (mtry for rf)
)

print(model)
plot(model)

# Predictions on the training data (accuracy here is optimistic;
# use held-out data for an honest estimate)
preds <- predict(model, newdata = iris)
confusionMatrix(preds, iris$Species)
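To see what the tuning step actually tried, the fitted object can be inspected directly. The following is a minimal sketch, not part of the original example: it swaps in method = "knn" and 5-fold cross-validation (both assumptions, chosen so that no extra modelling package is needed), and the object name fit is illustrative.

```r
library(caret)

set.seed(42)
ctrl <- trainControl(method = "cv", number = 5)

# tuneLength = 4 asks caret for four candidate values of k
fit <- train(
  Species ~ .,
  data = iris,
  method = "knn",
  trControl = ctrl,
  tuneLength = 4
)

fit$results    # one row per candidate k, with resampled Accuracy and Kappa
fit$bestTune   # the value of k that train() selected
```

The final model stored in the object is refit on the full data set using the selected value.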

Preprocessing with caret

# preProcess for data transformation
pp <- preProcess(iris[, 1:4], method = c("center", "scale"))
scaled <- predict(pp, iris[, 1:4])

# Or include preprocessing in train()
model <- train(
  Species ~ .,
  data = iris,
  method = "knn",
  preProcess = c("center", "scale"),
  trControl = ctrl,
  tuneGrid = data.frame(k = c(3, 5, 7, 9, 11))
)
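When preProcess() is used on its own rather than inside train(), it is usually safer to estimate the transformation on the training rows only and then apply it to the held-out rows. A minimal sketch, assuming an 80/20 split (the split ratio and the variable names are illustrative):

```r
library(caret)

set.seed(42)
# Hold out 20% of the rows, stratified by class
idx <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
train_x <- iris[idx, 1:4]
test_x  <- iris[-idx, 1:4]

# Estimate the centering and scaling parameters from the training rows only
pp <- preProcess(train_x, method = c("center", "scale"))

# Apply the same transformation to both splits
train_scaled <- predict(pp, train_x)
test_scaled  <- predict(pp, test_x)

colMeans(train_scaled)  # approximately 0 for every column
colMeans(test_scaled)   # close to, but not exactly, 0
```

Passing preProcess to train() handles this automatically within each resample, which is one reason the integrated form is often preferred.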

Comparing Models in caret

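# Train multiple models
# Reset the seed before each call so every model is evaluated on the same CV folds;
# resamples() assumes the resampling indices match across models
set.seed(42)
rf_model <- train(Species ~ ., data = iris, method = "rf", trControl = ctrl)
set.seed(42)
svm_model <- train(Species ~ ., data = iris, method = "svmRadial", trControl = ctrl)
set.seed(42)
knn_model <- train(Species ~ ., data = iris, method = "knn", trControl = ctrl)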

# Compare
results <- resamples(list(RF = rf_model, SVM = svm_model, KNN = knn_model))
summary(results)
bwplot(results)

mlr3 Framework

mlr3 is a modern, object-oriented ML framework for R. It uses R6 classes and a task/learner/resampling paradigm.

library(mlr3)
library(mlr3learners)

# Define a task
task <- TaskClassif$new(id = "iris", backend = iris, target = "Species")

# Define a learner
learner <- lrn("classif.ranger", num.trees = 500)

# Define resampling
resampling <- rsmp("cv", folds = 10)

# Run the experiment
rr <- resample(task, learner, resampling)
rr$aggregate(msr("classif.acc"))
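The same task/learner/resampling pieces extend to comparing several learners at once, roughly the mlr3 counterpart of caret's resamples(). The sketch below is an illustration under assumptions: it uses the built-in tsk("iris") task and the classif.rpart and classif.featureless learners (chosen because they need no extra learner packages) rather than the ranger learner above.

```r
library(mlr3)

task <- tsk("iris")

# A decision tree and a majority-class baseline
learners <- list(
  lrn("classif.rpart"),
  lrn("classif.featureless")
)

set.seed(42)
# Every learner is evaluated on the same 5 cross-validation folds
design <- benchmark_grid(task, learners, rsmp("cv", folds = 5))
bmr <- benchmark(design)

# One row per learner with its mean accuracy across folds
agg <- bmr$aggregate(msr("classif.acc"))
agg[, c("learner_id", "classif.acc")]
```

Because benchmark_grid() instantiates the resampling once per task, the folds are shared across learners without any manual seed handling.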

mlr3 Pipelines

library(mlr3pipelines)

# Build a pipeline: scale -> PCA -> random forest
graph <- po("scale") %>>%
  po("pca") %>>%
  po("learner", lrn("classif.ranger"))

# Convert to a learner
graph_learner <- as_learner(graph)
graph_learner$train(task)
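Once converted, a graph learner can be passed to resample() like any other learner, and the preprocessing steps are then refit on the training rows of each fold. A minimal sketch, assuming classif.rpart in place of ranger and 5 folds so that it runs without extra learner packages:

```r
library(mlr3)
library(mlr3pipelines)

task <- tsk("iris")

# Same pipeline shape as above: scale -> PCA -> learner
graph <- po("scale") %>>%
  po("pca") %>>%
  po("learner", lrn("classif.rpart"))
graph_learner <- as_learner(graph)

set.seed(42)
# Scaling and PCA are re-estimated inside every fold
rr <- resample(task, graph_learner, rsmp("cv", folds = 5))
acc <- rr$aggregate(msr("classif.acc"))
acc
```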

When to Use Which Framework

Framework	Best For	Strengths
tidymodels	New projects, tidyverse users	Tidy syntax, active development, great documentation
caret	Quick prototyping, legacy code	200+ models, simple API, one function does it all
mlr3	Complex experiments, benchmarking	Flexible pipelines, R6 classes, extensible

✅

Recommendation: Start with tidymodels for new projects. It has the best documentation, integrates with the tidyverse, and is under active development by the Posit team. Learn caret and mlr3 when you encounter them in existing codebases.
