# Docker Security for ML Containers
Docker is the foundation of most ML container deployments. Hardening your Dockerfiles and runtime configuration is the first line of defense against container-based attacks.
## Dockerfile Hardening for ML
ML Dockerfiles require special attention because of their large base images and complex dependency chains:
- **Use Official, Pinned Base Images.** Always pin your CUDA and ML framework base images to specific digests rather than mutable tags: use `nvidia/cuda:12.2.0-runtime-ubuntu22.04@sha256:...` instead of `nvidia/cuda:latest`. This prevents supply chain attacks via tag mutation.
- **Multi-Stage Builds.** Separate your build stage (with compilers and build tools) from your runtime stage (see the sketch after this list). This dramatically reduces image size and attack surface. Copy only the compiled artifacts and model files to the final stage.
- **Run as Non-Root.** Create a dedicated user for your ML workload and switch to it with `USER mluser` in your Dockerfile. GPU access does not require root; the NVIDIA Container Toolkit handles device permissions at the runtime level.
- **Read-Only Root Filesystem.** Run containers with `--read-only` and mount writable volumes only where needed (model output, logs, checkpoints). This prevents attackers from modifying system binaries or installing tools.
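The sketch below combines these practices in a pinned, multi-stage build that keeps compilers out of the runtime image and switches to a non-root user. It is a minimal illustration: `requirements.txt`, the `app/` directory, and `serve.py` are assumed project files, and in a real build each `FROM` would be pinned to a digest (`image:tag@sha256:<digest>`), elided here for readability.

```dockerfile
# syntax=docker/dockerfile:1
# Build stage: pip and build tooling live here and never reach the runtime image.
# In a real build, pin each FROM to a digest: image:tag@sha256:<digest>
FROM nvidia/cuda:12.2.0-devel-ubuntu22.04 AS build
RUN apt-get update && apt-get install -y --no-install-recommends python3-pip \
    && rm -rf /var/lib/apt/lists/*
COPY requirements.txt .
RUN pip install --no-cache-dir --target=/deps -r requirements.txt

# Runtime stage: slimmer CUDA runtime image, installed packages, app code,
# and a dedicated non-root user with no login shell.
FROM nvidia/cuda:12.2.0-runtime-ubuntu22.04
RUN apt-get update && apt-get install -y --no-install-recommends python3 \
    && rm -rf /var/lib/apt/lists/* \
    && useradd --create-home --shell /usr/sbin/nologin mluser
COPY --from=build /deps /opt/deps
COPY --chown=mluser:mluser app/ /app/
ENV PYTHONPATH=/opt/deps
USER mluser
WORKDIR /app
ENTRYPOINT ["python3", "serve.py"]
```

At runtime, pair the image with a read-only root filesystem and mount writable volumes only where the workload needs them (the volume and image names here are illustrative):

```bash
docker run --read-only \
  --tmpfs /tmp \
  -v ml_checkpoints:/app/checkpoints \
  ml-serving:1.0
```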
## Secrets Management for ML Pipelines
ML workloads commonly need credentials for data stores, model registries, and cloud APIs. Never embed these in your Docker images:
Do not use `ENV`, `ARG`, or `COPY` to embed API keys, database passwords, or cloud credentials in Docker images. Every layer is inspectable with `docker history` and can be extracted by anyone with access to the image.

| Method | Use Case | Security Level |
|---|---|---|
| Docker Secrets | Swarm mode deployments, simple key-value secrets | Good |
| Kubernetes Secrets | K8s deployments with encrypted etcd, external secret operators | Good |
| HashiCorp Vault | Dynamic secrets, rotation, fine-grained access control | Excellent |
| Cloud KMS | AWS KMS, GCP KMS, Azure Key Vault for cloud-native ML pipelines | Excellent |
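For build-time credentials (for example, a token needed to pull model weights during the build), BuildKit secret mounts keep the value out of every image layer, unlike `ENV` or `COPY`. A minimal sketch; the `hf_token` secret id and the `download_model.py` script are illustrative:

```dockerfile
# syntax=docker/dockerfile:1
FROM python:3.11-slim
COPY download_model.py .
# The secret is mounted only for the duration of this RUN step and is
# never written into an image layer.
RUN --mount=type=secret,id=hf_token \
    HF_TOKEN="$(cat /run/secrets/hf_token)" python download_model.py
```

The secret is supplied from outside the build context at build time:

```bash
docker build --secret id=hf_token,src=./hf_token.txt -t model-image .
```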
## GPU Passthrough Security
Configuring GPU access securely requires balancing performance with isolation:
- **Limit GPU visibility:** Use `NVIDIA_VISIBLE_DEVICES` to expose only the specific GPUs a container needs, rather than all available devices
- **Enable MIG partitioning:** On A100 and H100 GPUs, use Multi-Instance GPU to create hardware-isolated GPU partitions for different workloads
- **Restrict capabilities:** Drop all Linux capabilities and add back only what is needed; ML inference typically needs no special capabilities
- **Disable inter-process communication:** Use `--ipc=none` unless shared memory is specifically required for multi-GPU training (see the combined `docker run` sketch below)
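Put together, a locked-down inference run might look like the following. This assumes the NVIDIA Container Toolkit is installed and registered as the `nvidia` runtime; `ml-inference:1.0` is a placeholder image name, and newer Docker releases can use the `--gpus` flag instead of the environment variable:

```bash
# Expose only GPU 0, drop all Linux capabilities, forbid privilege
# escalation, disable IPC sharing, and keep the root filesystem read-only.
docker run -d \
  --runtime=nvidia \
  -e NVIDIA_VISIBLE_DEVICES=0 \
  --cap-drop=ALL \
  --security-opt no-new-privileges \
  --ipc=none \
  --read-only --tmpfs /tmp \
  ml-inference:1.0
```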
## Docker Compose Security for ML
When using Docker Compose for multi-container ML applications (API server, model server, data preprocessor), apply these practices:
### Network Isolation
Create separate networks for frontend and backend services. The model serving container should not be directly accessible from the internet.
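A minimal sketch, assuming an `api` service that faces clients and a `model-server` service that should only be reachable internally (service and image names are illustrative):

```yaml
networks:
  frontend:
  backend:
    internal: true               # no route to the outside world

services:
  api:
    image: ml-api:1.0            # hypothetical public-facing API
    ports:
      - "8080:8080"
    networks: [frontend, backend]
  model-server:
    image: model-server:1.0      # hypothetical; never exposed directly
    networks: [backend]
```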
### Resource Limits

Set memory and CPU limits for each service. For GPU services, use the `deploy.resources.reservations.devices` section to control GPU allocation.
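A sketch of per-service limits using the Compose `deploy` section; the numbers are placeholders to adapt to your hardware:

```yaml
services:
  model-server:
    image: model-server:1.0      # hypothetical
    deploy:
      resources:
        limits:
          cpus: "4.0"
          memory: 8G
        reservations:
          devices:
            - driver: nvidia
              count: 1           # allocate exactly one GPU
              capabilities: [gpu]
```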
### Health Checks
Implement health checks for all services. A compromised container that stops responding to health checks can be automatically restarted.
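A sketch assuming the model server exposes a `/health` endpoint on port 8000 and has `curl` available in the image. Note that plain Docker records health status but does not restart unhealthy containers by itself; an orchestrator (Swarm, Kubernetes) or a watchdog acts on it:

```yaml
services:
  model-server:
    image: model-server:1.0      # hypothetical
    restart: unless-stopped      # restart if the process exits
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 5s
      retries: 3
```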
### Logging Configuration
Configure centralized logging with size limits. ML training logs can grow very large and should be rotated to prevent disk exhaustion attacks.
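A sketch using the built-in `json-file` driver with rotation; the size thresholds and service name are placeholders:

```yaml
services:
  trainer:
    image: ml-trainer:1.0        # hypothetical
    logging:
      driver: json-file
      options:
        max-size: "50m"          # rotate each log file at 50 MB
        max-file: "5"            # keep at most five rotated files
```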