Advanced
caret & mlr3
Explore alternative ML frameworks in R: caret's unified interface and mlr3's modern object-oriented approach.
caret Package Overview
caret (Classification And REgression Training) was the dominant ML framework in R for over a decade. It provides a unified interface to 200+ models.
Note: caret is in maintenance mode. For new projects, tidymodels is recommended. However, you will encounter caret in existing codebases and tutorials, so it is worth understanding.
The train() Function
```r
library(caret)

# Train a random forest with 10-fold cross-validation
# (method = "rf" requires the randomForest package)
set.seed(42)
ctrl <- trainControl(method = "cv", number = 10)

model <- train(
  Species ~ .,
  data = iris,
  method = "rf",        # Random forest
  trControl = ctrl,
  tuneLength = 5        # Try up to 5 candidate values of mtry
)

print(model)
plot(model)

# Predictions (made on the training data here, so accuracy is optimistic)
preds <- predict(model, newdata = iris)
confusionMatrix(preds, iris$Species)
```
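The confusion matrix above is computed on the same rows the model was trained on, so it overstates accuracy. A minimal sketch of a hold-out evaluation follows, assuming an 80/20 stratified split made with caret's createDataPartition(). The split ratio and the names train_set, test_set, and fit are illustrative choices, not part of the original example.

```r
library(caret)

set.seed(42)

# Stratified 80/20 split (the ratio is an illustrative assumption)
idx <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
train_set <- iris[idx, ]
test_set <- iris[-idx, ]

# Tune with cross-validation on the training rows only
ctrl <- trainControl(method = "cv", number = 10)
fit <- train(Species ~ ., data = train_set, method = "rf", trControl = ctrl)

# Evaluate on rows the model has never seen
test_preds <- predict(fit, newdata = test_set)
cm <- confusionMatrix(test_preds, test_set$Species)
cm$overall["Accuracy"]
```

Cross-validation inside train() is used to pick tuning parameters, while the held-out rows give a separate estimate of how the final model performs.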
Preprocessing with caret
```r
# preProcess for data transformation
pp <- preProcess(iris[, 1:4], method = c("center", "scale"))
scaled <- predict(pp, iris[, 1:4])

# Or include preprocessing in train()
model <- train(
  Species ~ .,
  data = iris,
  method = "knn",
  preProcess = c("center", "scale"),
  trControl = ctrl,
  tuneGrid = data.frame(k = c(3, 5, 7, 9, 11))
)
```
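Centering and scaling subtract each column's mean and divide by its standard deviation. Both statistics are learned from the data passed to preProcess() and then reused on new data. The base-R sketch below mirrors that idea without caret. The 100-row split and the variable names are assumptions made for illustration.

```r
set.seed(42)

# Illustrative split: 100 rows to learn the statistics, 50 rows as "new" data
train_idx <- sample(nrow(iris), 100)
x_train <- iris[train_idx, 1:4]
x_new <- iris[-train_idx, 1:4]

# Learn the centering and scaling statistics from the training rows only
centers <- colMeans(x_train)
scales <- apply(x_train, 2, sd)

# Apply the same statistics to both sets
x_train_scaled <- scale(x_train, center = centers, scale = scales)
x_new_scaled <- scale(x_new, center = centers, scale = scales)

# Training columns now have mean 0 and sd 1; new data is only approximately so
round(colMeans(x_train_scaled), 10)
apply(x_train_scaled, 2, sd)
colMeans(x_new_scaled)
```

When preProcess is passed to train(), caret re-estimates these statistics within each resample, which keeps information from the held-out rows out of the transformation.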
Comparing Models in caret
```r
# Train multiple models
# Resetting the seed before each call gives every model the same CV folds,
# which resamples() needs for a fair, paired comparison
set.seed(42)
rf_model <- train(Species ~ ., data = iris, method = "rf", trControl = ctrl)

set.seed(42)
svm_model <- train(Species ~ ., data = iris, method = "svmRadial", trControl = ctrl)

set.seed(42)
knn_model <- train(Species ~ ., data = iris, method = "knn", trControl = ctrl)

# Compare
results <- resamples(list(RF = rf_model, SVM = svm_model, KNN = knn_model))
summary(results)
bwplot(results)
```
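Comparisons through resamples() are most meaningful when every model is evaluated on identical folds. One way to guarantee that, sketched below, is to create the folds once with createFolds() and pass them to trainControl() through its index argument. The models chosen here (knn and lda, the latter needing the MASS package) and the names shared_ctrl, knn_fit, and lda_fit are assumptions for illustration.

```r
library(caret)

# Fix the resampling folds once so both models see identical splits
set.seed(42)
folds <- createFolds(iris$Species, k = 10, returnTrain = TRUE)
shared_ctrl <- trainControl(method = "cv", index = folds)

knn_fit <- train(Species ~ ., data = iris, method = "knn", trControl = shared_ctrl)
lda_fit <- train(Species ~ ., data = iris, method = "lda", trControl = shared_ctrl)

paired <- resamples(list(KNN = knn_fit, LDA = lda_fit))

# Paired differences in the performance metrics across the shared folds
differences <- diff(paired)
summary(differences)
```

Because both models are scored on the same folds, the differences are paired fold by fold rather than confounded with how the data happened to be split.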
mlr3 Framework
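mlr3 is a modern, object-oriented ML framework for R built on R6 classes. Work is organized around three kinds of objects: a task (the data and its target), a learner (the algorithm and its hyperparameters), and a resampling (how the data is split for evaluation).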
```r
library(mlr3)
library(mlr3learners)  # Provides classif.ranger (requires the ranger package)

# Define a task
task <- TaskClassif$new(id = "iris", backend = iris, target = "Species")

# Define a learner
learner <- lrn("classif.ranger", num.trees = 500)

# Define resampling
resampling <- rsmp("cv", folds = 10)

# Run the experiment
rr <- resample(task, learner, resampling)
rr$aggregate(msr("classif.acc"))
```
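To compare several learners in mlr3, the usual route is benchmark() on a design built with benchmark_grid(). The sketch below is a rough counterpart to the caret comparison earlier. It assumes the built-in iris task and two learners that ship with mlr3 itself (classif.rpart and classif.featureless), chosen to keep dependencies small rather than taken from the original example.

```r
library(mlr3)

set.seed(42)

# Every combination of task, learner, and resampling becomes one experiment
design <- benchmark_grid(
  tasks = tsk("iris"),
  learners = lrns(c("classif.rpart", "classif.featureless")),
  resamplings = rsmp("cv", folds = 5)
)

bmr <- benchmark(design)

# One row per learner, with accuracy averaged over the folds
scores <- bmr$aggregate(msr("classif.acc"))
scores[, c("learner_id", "classif.acc")]
```

The featureless learner always predicts the most frequent class, so it serves as a baseline that any useful model should beat.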
mlr3 Pipelines
```r
library(mlr3)
library(mlr3learners)
library(mlr3pipelines)

# Build a pipeline: scale -> PCA -> random forest
graph <- po("scale") %>>%
  po("pca") %>>%
  po("learner", lrn("classif.ranger"))

# Convert to a learner
graph_learner <- as_learner(graph)

# Train on the task defined in the previous example
graph_learner$train(task)
```
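Once a graph is wrapped with as_learner(), it behaves like any other learner, so scaling and PCA are fitted on the training rows and then reapplied at prediction time. A minimal sketch of training and scoring on separate rows follows. It assumes mlr3's partition() helper with an 80/20 split and swaps in classif.rpart for ranger to avoid an extra dependency; both are illustrative choices.

```r
library(mlr3)
library(mlr3pipelines)

set.seed(42)
task <- tsk("iris")

# Same pipeline shape as above, with rpart swapped in for ranger
graph <- po("scale") %>>% po("pca") %>>% po("learner", lrn("classif.rpart"))
graph_learner <- as_learner(graph)

# Hold out 20% of the rows for testing (the ratio is an assumption)
split <- partition(task, ratio = 0.8)

# Preprocessing parameters are estimated from the training rows only
graph_learner$train(task, row_ids = split$train)
prediction <- graph_learner$predict(task, row_ids = split$test)

acc <- prediction$score(msr("classif.acc"))
acc
```

The same graph_learner can also be passed to resample() or benchmark(), in which case the whole pipeline is refitted inside each fold.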
When to Use Which Framework
| Framework | Best For | Strengths |
|---|---|---|
| tidymodels | New projects, tidyverse users | Tidy syntax, active development, great documentation |
| caret | Quick prototyping, legacy code | 200+ models, simple API, one function does it all |
| mlr3 | Complex experiments, benchmarking | Flexible pipelines, R6 classes, extensible |
Recommendation: Start with tidymodels for new projects. It has the best documentation, integrates with the tidyverse, and is under active development by the Posit team. Learn caret and mlr3 when you encounter them in existing codebases.
Lilly Tech Systems