Deploying Docker Containers to Production: A Complete Guide
To deploy Docker containers to production you move a container through four stages, in
order: build a small, reproducible image; push it to a registry with an immutable tag; run
it under an orchestrator that keeps it alive and healthy; and roll out new versions without
dropping a single request. Skip or rush any stage and the failure shows up later — as a
1.2 GB image that takes minutes to pull, a :latest tag nobody can trace, a crash loop at
3 a.m., or a deploy that serves 502s for four seconds while the dashboard stays green. This
guide is the hub for the Devgains DevOps cluster: it gives you the end-to-end mental model,
then points to the deep dives for each stage.
The gap between "it runs on my laptop" and "it runs in production" is almost entirely operational. The application code rarely changes. What changes is everything around it — how the image is built, how it's shipped, how the platform decides it's healthy, and how one version is swapped for the next. Get those four things right and most deployment incidents simply never happen.
Quick answer: how do you deploy a Docker container to production?
To deploy a Docker container to production, run these five steps:
- Build a lean image. Use a multi-stage build so the final image contains only your app and its runtime — no compilers or dev tooling.
- Tag it immutably. Tag by Git SHA or a semantic version, never :latest. You must be able to point at the exact bytes running in production.
- Push to a registry. Send the image to a registry (Docker Hub, GitHub Container Registry, ECR, ACR) that your cluster can pull from.
- Run it under an orchestrator. Deploy to Kubernetes (or a managed equivalent) with liveness and readiness probes, resource limits, and a non-root user.
- Roll out safely. Ship new versions with a rolling update so traffic only reaches pods that are actually ready.
Everything below expands those five steps.
Why it matters
A container that runs locally has already cleared the hard part of packaging. What it has not cleared is any of the failure modes that only appear under real traffic: a node dies mid-request, a deploy overlaps with in-flight connections, an image ships a known CVE because it bundles a compiler it never uses, or a health check lies and the platform sends traffic to a process that isn't ready.
Production is defined by those failure modes, not by the happy path. The cost of getting deployment wrong is measured in pager alerts and lost requests, and — because container platforms, registries, and observability tooling are some of the highest-CPM topics in developer advertising — it's also one of the most heavily documented problems in the industry. That's good news: nearly every step here has an authoritative reference.
Architecture: the path an image takes to production
Every production deployment is the same pipeline, whatever tools fill the boxes:
source → CI build → image → registry → orchestrator → running pods → rollout
- CI build compiles the app and produces a container image. Its speed depends almost entirely on cache hit rate, not raw compute.
- Image is the immutable artifact. Built once, it moves unchanged through every environment — this is the property that makes containers reproducible.
- Registry stores tagged images. Immutable tags are what let you roll back by redeploying an old SHA.
- Orchestrator (Kubernetes, ECS, Nomad) schedules the image onto nodes, restarts it on failure, and manages rollouts.
- Rollout replaces old pods with new ones under a strategy that keeps capacity up throughout.
The key insight is that the image is the contract. Everything before it exists to produce a trustworthy artifact; everything after it exists to run that exact artifact reliably.
Step-by-step tutorial
1. Build a production image
Start from a modern, multi-stage Dockerfile. The build stage has the toolchain; the final
stage has only the compiled app, runs as a non-root user, and declares how the platform
should check its health.
```dockerfile
# ---- build stage ----
FROM node:22-slim AS build
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# drop devDependencies so the runtime stage copies only production modules
RUN npm prune --omit=dev

# ---- runtime stage ----
FROM node:22-slim AS runtime
WORKDIR /app
ENV NODE_ENV=production
# copy only what production needs
COPY --from=build /app/node_modules ./node_modules
COPY --from=build /app/dist ./dist
# never run as root in production
USER node
EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=3s CMD node dist/healthcheck.js
CMD ["node", "dist/server.js"]
```
Two stages, one artifact. The build stage installs dev dependencies, compiles, and then prunes the dev dependencies; the runtime stage copies only the results. The USER node line drops root, and the ordering (copying package*.json before the source) means the dependency layer stays cached until package.json or your lockfile changes. That single ordering choice is often the difference between a 20-second and a 4-minute CI build.
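The HEALTHCHECK line above runs dist/healthcheck.js, which the guide does not show. A minimal sketch of what such a script could contain follows, assuming the app answers on /healthz at port 3000 as in the rest of this guide; the function name checkHealth and the 2-second default timeout are hypothetical choices, not part of any library.

```javascript
// healthcheck.js: a hypothetical sketch of the script the HEALTHCHECK line runs.
const http = require("node:http");

// Resolves true when GET <url> answers 2xx within timeoutMs, false otherwise.
function checkHealth(url, timeoutMs = 2000) {
  return new Promise((resolve) => {
    // agent: false uses a one-off connection instead of the keep-alive pool.
    const req = http.get(url, { timeout: timeoutMs, agent: false }, (res) => {
      res.resume(); // discard the body so the socket is released
      resolve(res.statusCode >= 200 && res.statusCode < 300);
    });
    // The timeout event only signals inactivity; the request must be destroyed by hand.
    req.on("timeout", () => {
      req.destroy();
      resolve(false);
    });
    // Connection refused, reset, or any other transport failure counts as unhealthy.
    req.on("error", () => resolve(false));
  });
}

module.exports = { checkHealth };
```

An entry point would then call something like `checkHealth("http://127.0.0.1:3000/healthz").then((ok) => process.exit(ok ? 0 : 1))`, since Docker treats exit code 0 as healthy and 1 as unhealthy. Keeping the timeout below the `--timeout=3s` in the Dockerfile lets the script report failure itself rather than being cut off.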
2. Tag and push to a registry
Tag with something that traces back to a commit. :latest is un-rollback-able because two
different builds can share it.
```shell
# tag by the exact commit that produced the image
SHA=$(git rev-parse --short HEAD)
docker build -t "ghcr.io/acme/api:$SHA" .
# push to the registry the cluster pulls from
docker push "ghcr.io/acme/api:$SHA"
```
Now ghcr.io/acme/api:$SHA names one set of bytes, provided nobody re-pushes that tag; where the registry offers tag immutability, turn it on. To roll back, you redeploy an older SHA: no rebuild, no guesswork about what's running.
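One way to enforce the tagging rule is a small check in CI that rejects image references which cannot be traced to a single build. The sketch below is illustrative only: isTraceableImageRef is a hypothetical helper, and its rules (accept a digest, a 7 to 40 character hex Git SHA, or a plain semantic version; reject everything else) are an assumption about what a team might want, not a registry feature.

```javascript
// Hypothetical CI guard: accept only image references that pin one specific build.
function isTraceableImageRef(ref) {
  // A digest reference (repo@sha256:...) always pins exact bytes.
  if (/@sha256:[0-9a-f]{64}$/.test(ref)) return true;

  // Look for the tag after the last "/" so a registry port is not mistaken for one.
  const lastSegment = ref.slice(ref.lastIndexOf("/") + 1);
  const colon = lastSegment.indexOf(":");
  if (colon === -1) return false; // no tag means Docker resolves it as :latest
  const tag = lastSegment.slice(colon + 1);

  const isGitSha = /^[0-9a-f]{7,40}$/.test(tag);
  const isSemver = /^v?\d+\.\d+\.\d+$/.test(tag);
  return isGitSha || isSemver;
}

module.exports = { isTraceableImageRef };
```

For example, `isTraceableImageRef("ghcr.io/acme/api:1a2b3c4")` passes while `isTraceableImageRef("ghcr.io/acme/api:latest")` and an untagged `ghcr.io/acme/api` fail, so a pipeline step could refuse to deploy a manifest that names either of the latter.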
3. Run it on Kubernetes
A minimal but production-shaped Deployment declares the image, resource limits, and — most importantly — probes that tell the truth about readiness.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0   # never drop below full capacity
      maxSurge: 1         # add one new pod at a time
  selector:
    matchLabels: { app: api }
  template:
    metadata:
      labels: { app: api }
    spec:
      containers:
        - name: api
          image: ghcr.io/acme/api:1a2b3c4   # the immutable SHA
          ports: [{ containerPort: 3000 }]
          resources:
            requests: { cpu: "100m", memory: "128Mi" }
            limits: { cpu: "500m", memory: "256Mi" }
          readinessProbe:
            httpGet: { path: /healthz, port: 3000 }
            periodSeconds: 5
          livenessProbe:
            httpGet: { path: /livez, port: 3000 }
            periodSeconds: 10
```
maxUnavailable: 0 with maxSurge: 1 means Kubernetes adds a new, ready pod before it removes an old one, which is the mechanic behind a genuinely zero-downtime rollout. The readinessProbe gates traffic: a pod receives requests only after /healthz returns 200, so a slow-starting process never sees traffic it can't handle.
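On the application side, the two probe paths need handlers that answer differently. The following is a rough sketch using only Node's built-in http module; createApp and the state.ready flag are hypothetical names, and a real service would flip the flag once its own dependencies (database connections, caches) are up.

```javascript
const http = require("node:http");

// Builds a server whose /healthz (readiness) and /livez (liveness) answers differ.
// `state.ready` is a hypothetical flag the app sets to true once it can serve.
function createApp(state) {
  return http.createServer((req, res) => {
    if (req.url === "/livez") {
      // Liveness: the process is running and its event loop can still answer.
      res.writeHead(200).end("alive");
    } else if (req.url === "/healthz") {
      // Readiness: 503 tells the platform not to route traffic here yet.
      res.writeHead(state.ready ? 200 : 503).end(state.ready ? "ready" : "starting");
    } else {
      // Stand-in for real routes, which also refuse work until the app is ready.
      res.writeHead(state.ready ? 200 : 503).end(state.ready ? "hello" : "starting");
    }
  });
}

module.exports = { createApp };
```

A caller might start it with `const state = { ready: false }; createApp(state).listen(3000);` and set `state.ready = true` after startup work completes. Keeping /livez independent of that flag avoids restart loops: a pod that is slow to become ready should be held out of rotation, not killed.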
4. Roll out and verify
Apply the manifest and watch the rollout converge. kubectl rollout status blocks until
every new pod is ready, which makes it safe to use as a CI deploy gate.
```shell
kubectl apply -f deployment.yaml
kubectl rollout status deployment/api --timeout=120s
# if it goes wrong, one command reverts to the previous ReplicaSet
kubectl rollout undo deployment/api
```
Best practices
- One immutable tag per build. Tag by Git SHA. It makes rollbacks trivial and audits honest.
- Run as non-root with a read-only filesystem. Drop capabilities you don't need; a compromised container should be able to do almost nothing.
- Set resource requests and limits. Requests drive scheduling; limits prevent one noisy pod from starving its neighbors.
- Make readiness probes honest. A pod is ready only when it can actually serve — after DB connections are open and caches are warm, not the moment the process starts.
- Handle SIGTERM. Catch the signal, stop accepting new connections, drain in-flight requests, then exit. This is what turns a rolling update from "mostly clean" to invisible.
- Keep the CI cache warm. Order Dockerfile layers so dependencies cache separately from source, and reuse the build cache between runs.
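The SIGTERM advice above can be sketched in a few lines of Node. This is an illustration under assumptions rather than a drop-in implementation: createGracefulShutdown, the state.ready flag, the onDone callback, and the 10-second default are all hypothetical, and only server.close and server.closeIdleConnections are real Node APIs.

```javascript
// Returns a shutdown function: mark the pod unready, stop accepting connections,
// wait for in-flight requests, then report an exit code through onDone.
function createGracefulShutdown(server, state, { timeoutMs = 10000, onDone = () => {} } = {}) {
  let started = false;
  let finished = false;
  const finish = (code) => {
    if (finished) return;
    finished = true;
    onDone(code);
  };

  return function shutdown() {
    if (started) return; // a second signal must not restart the sequence
    started = true;
    state.ready = false; // readiness handlers that read this flag start failing

    // Give up with a non-zero code if draining outlasts the allowed window.
    const timer = setTimeout(() => finish(1), timeoutMs);
    timer.unref();

    // close() stops accepting new connections and calls back once existing ones end.
    server.close(() => {
      clearTimeout(timer);
      finish(0);
    });
    // Idle keep-alive sockets would otherwise hold close() open (available in Node 18.2+).
    if (typeof server.closeIdleConnections === "function") server.closeIdleConnections();
  };
}

module.exports = { createGracefulShutdown };
```

It would typically be wired up as `process.on("SIGTERM", createGracefulShutdown(server, state, { onDone: (code) => process.exit(code) }))`. The timeout should stay below the pod's termination grace period (30 seconds by default in Kubernetes), otherwise the kubelet sends SIGKILL before the drain finishes.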
Common mistakes
- Deploying :latest. You lose the ability to say what's running or roll back to a known version. Always deploy a specific SHA or version.
- Probes that only check the process is alive. A readiness probe that returns 200 before the app can serve routes traffic straight into errors during every rollout.
- No graceful shutdown. Without SIGTERM handling, Kubernetes kills pods mid-request and users get 502s during an otherwise "successful" deploy.
- Fat images. Shipping the build toolchain bloats pull times and multiplies your CVE surface. Use a multi-stage build.
- maxUnavailable left at the default. The default of 25% lets a rolling update take up to a quarter of the pods (rounded down) out of service before replacements are ready, briefly reducing capacity under load.
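To see what the rolling-update settings mean for capacity, the arithmetic can be worked through in a short function. This is a sketch: rolloutBounds and resolveCount are hypothetical names, and the model covers only the documented rounding rules (maxUnavailable percentages round down, maxSurge percentages round up, both default to 25%), not every edge case the Deployment controller handles.

```javascript
// Resolves an absolute count or a percentage string such as "25%" against replicas.
function resolveCount(value, replicas, roundUp) {
  if (typeof value === "number") return value;
  // Multiply before dividing so whole-number percentages stay exact.
  const raw = (parseFloat(value) * replicas) / 100;
  return roundUp ? Math.ceil(raw) : Math.floor(raw);
}

// Capacity bounds during a rolling update: the fewest pods guaranteed available
// and the most pods that may exist at once.
function rolloutBounds(replicas, { maxUnavailable = "25%", maxSurge = "25%" } = {}) {
  const unavailable = resolveCount(maxUnavailable, replicas, false); // rounds down
  const surge = resolveCount(maxSurge, replicas, true); // rounds up
  return { minAvailable: replicas - unavailable, maxTotal: replicas + surge };
}

module.exports = { rolloutBounds };
```

With 10 replicas and the defaults, `rolloutBounds(10)` gives a floor of 8 available pods and a ceiling of 13, so capacity can dip by 20% mid-rollout. With `rolloutBounds(10, { maxUnavailable: 0, maxSurge: 1 })` the floor stays at 10 and the ceiling is 11, which is the trade this guide recommends: a slower rollout in exchange for never dropping below full capacity.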
Deployment strategy comparison
| Strategy | How it works | Downtime | Cost / complexity | Best for |
|---|---|---|---|---|
| Recreate | Kill all old pods, then start new ones | Yes (a gap) | Lowest | Dev, or apps that can't run two versions at once |
| Rolling update | Replace pods incrementally, gated by readiness | None | Low | The default for stateless services |
| Blue-green | Run full new version alongside old, switch traffic at once | None | Higher (2× capacity) | Fast, all-or-nothing cutover with instant rollback |
| Canary | Send a small % of traffic to the new version first | None | Highest (needs traffic splitting) | Risky changes you want to validate on real traffic |
For most stateless services, a rolling update is the right default — it's built into Kubernetes and needs no extra infrastructure. Reach for blue-green or canary only when the change is risky enough to justify the added capacity and tooling.
Takeaways
- Deployment is a four-stage pipeline: build a lean image → tag immutably → run under an orchestrator → roll out safely. Each stage has a dedicated deep dive below.
- The image is the contract. Build it once, run the exact same bytes everywhere, and tag it so you can always roll back.
- Probes and graceful shutdown are what make rollouts invisible. Honest readiness plus SIGTERM draining eliminate the 502 blip most teams never notice.
- maxUnavailable: 0, maxSurge: 1 is the rolling-update setting that guarantees you never drop below full capacity.
- CI speed is cache, not compute. Layer ordering and cache reuse turn minutes into seconds.
Ready to go deeper? Browse the DevOps cluster for the stage-by-stage guides — writing a modern Dockerfile, multi-stage builds, liveness vs readiness probes, zero-downtime rolling updates, and fixing slow CI caches.
FAQ
How do you deploy a Docker container to production? Build a lean multi-stage image, tag it with an immutable identifier like the Git SHA, push it to a container registry, and run it under an orchestrator such as Kubernetes with readiness/liveness probes and resource limits. Ship new versions with a rolling update so traffic only reaches pods that are ready.
Should I use docker run directly in production? For anything beyond a single small
service, no. A bare docker run gives you no self-healing, no rolling updates, and no
scheduling. Use an orchestrator (Kubernetes, ECS, Nomad) — it restarts crashed containers,
manages rollouts, and reschedules work when a node dies.
Why shouldn't I deploy the :latest tag? Because :latest is mutable — two different
builds can share it — so you can't say exactly what's running or roll back to a known-good
version. Tag by Git SHA or semantic version instead. See
modern Dockerfile practices for more.
What is the difference between a liveness and a readiness probe? A liveness probe asks "is this process broken and in need of a restart?"; a readiness probe asks "can this pod serve traffic right now?". Confusing them causes restart loops or dropped requests — the probes deep dive covers exactly when each fires.
How do I achieve zero-downtime deploys? Combine a rolling update (maxUnavailable: 0),
honest readiness probes, and graceful SIGTERM handling that drains in-flight requests before
the pod exits. The zero-downtime guide
walks through the exact settings.
Conclusion
Deploying Docker containers to production is not one big step — it's four small ones done in order: build a lean image, tag it so it's traceable, run it under an orchestrator that keeps it healthy, and roll it out without dropping requests. None of the individual pieces are hard once you see the pipeline whole. Work through the deep dives linked above for each stage, and you'll have a deployment path that survives node failures, bad releases, and 3 a.m. traffic spikes — quietly, which is exactly how production is supposed to feel.
References
- Docker: Build best practices — official guidance on multi-stage builds, layer caching, and small images.
- Kubernetes: Performing a rolling update — how the default deployment strategy replaces pods incrementally.
- Kubernetes: Configure liveness, readiness and startup probes — probe semantics and configuration.
- OCI Image Specification — the standard that makes an image a portable, immutable artifact.
- Kubernetes: Container lifecycle & termination — how SIGTERM and graceful shutdown work during a rollout.

