Kamal vs Flux CD: Simplicity or Resilience — Pick One

Your container crashed at 3 AM. With Flux CD, the cluster noticed, restarted the pod, and reconciled the desired state from Git. You found out from a Slack notification over morning coffee. With Kamal — nobody noticed. Your app has been down for four hours. The first person to notice was a customer, and they didn’t file a bug report. They just left.
This is the fundamental difference between Kamal and Flux CD, and it matters more than most “versus” articles admit. It’s not about which tool is “better.” It’s about whether your deployment tool is responsible for keeping your app running, or just for putting it there.
What Kamal Does
Kamal (formerly MRSK, created by 37signals, the company behind Basecamp) deploys Docker containers to bare servers over SSH. No Kubernetes. No cluster. No orchestration layer. You run `kamal deploy`, and it:
- Builds your Docker image
- Pushes it to a registry
- SSHs into your server(s)
- Pulls the new image and starts the container
- Switches traffic via Kamal Proxy (zero-downtime rolling deploys)
That’s it. The mental model is beautifully simple: you have a server, you have a container, Kamal puts the container on the server.

    # config/deploy.yml
    service: my-saas
    image: my-saas/app
    servers:
      web:
        - 192.168.1.10
        - 192.168.1.11
    registry:
      username: deployer
      password:
        - KAMAL_REGISTRY_PASSWORD
    proxy:
      ssl: true
      host: app.example.com

Day-to-day use comes down to a handful of commands:

    kamal deploy              # Deploy the latest version
    kamal rollback <VERSION>  # Roll back to a previously deployed version
    kamal app logs            # Check logs

The developer experience is excellent. If you’ve ever deployed with Capistrano or even plain SSH scripts, Kamal feels like a natural evolution. It’s imperative: you tell it what to do, and it does it right now.
What Kamal Doesn’t Do
Here’s the problem: Kamal is a deployment tool, not a runtime supervisor. Once kamal deploy finishes, Kamal’s job is done. It has no ongoing relationship with your running containers.
- Container crashes? Kamal doesn’t know. The only safety net is Docker’s restart policy (Kamal starts app containers with `--restart unless-stopped`), which handles a process that simply exits but not image corruption, OOM kills with bad state, or configuration drift.
- Server goes down? Kamal doesn’t know. No failover, no rescheduling to another host.
- Configuration drifts? If someone SSHs in and changes something, Kamal doesn’t detect or correct it.
- Health checks? Kamal Proxy checks if the container is healthy during deployment, but there’s no continuous health monitoring afterward.
Kamal deploys your app. What happens after that is your problem.
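
For reference, the deploy-time health check mentioned in the list above is configured under the `proxy` key of `config/deploy.yml`. The fragment below is a minimal sketch: the `/up` path and the interval and timeout values are assumptions for illustration, and the option names should be checked against the Kamal version you run.

```yaml
# config/deploy.yml (fragment)
proxy:
  ssl: true
  host: app.example.com
  healthcheck:
    path: /up      # assumed endpoint; use whatever your app exposes
    interval: 3    # seconds between checks (illustrative value)
    timeout: 3     # seconds before a check counts as failed (illustrative value)
```

This check gates the traffic switch during a deploy. It is not a substitute for monitoring the container once the deploy has finished.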
What Flux CD Does Differently
Flux CD takes a fundamentally different approach. Instead of pushing deployments imperatively, you declare the desired state in a Git repository, and Flux continuously reconciles the cluster to match.

    # clusters/production/my-saas.yaml
    apiVersion: helm.toolkit.fluxcd.io/v2
    kind: HelmRelease
    metadata:
      name: my-saas
      namespace: production
    spec:
      interval: 5m
      chart:
        spec:
          chart: ./charts/my-saas
          sourceRef:
            kind: GitRepository
            name: my-saas
      values:
        image:
          tag: v1.4.3
        replicas: 3

You push a commit changing the image tag. That’s your deploy. Flux notices the change, applies it to the cluster, and Kubernetes handles the rollout. But here’s the key difference. Flux doesn’t stop working after the deploy:
- Container crashes? Kubernetes restarts it, backing off if it keeps failing. If the new version never becomes ready, the rolling update stalls and the old pods keep serving traffic.
- Someone manually changes something in the cluster? Flux detects the drift and reverts it to match Git — the single source of truth.
- Server (node) goes down? Kubernetes reschedules pods to healthy nodes.
- You want to roll back? `git revert` the commit. Flux reconciles. The cluster matches the previous state.
The system heals itself. You are not the reconciliation loop.
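
The HelmRelease above refers to a `GitRepository` source named `my-saas`, which is not shown. A minimal sketch of that source, plus the HelmRelease fields that turn on drift correction for Helm-managed objects, might look like this. The file name, repository URL, branch, and retry count are assumptions, not values from a real setup.

```yaml
# clusters/production/my-saas-source.yaml (hypothetical file name)
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: my-saas            # matches sourceRef.name in the HelmRelease
  namespace: production    # same namespace, since sourceRef sets none
spec:
  interval: 1m             # how often Flux polls Git for new commits
  url: https://github.com/example-org/my-saas   # assumed URL
  ref:
    branch: main           # assumed branch
---
# Fragment: fields to merge into the existing HelmRelease spec
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: my-saas
  namespace: production
spec:
  driftDetection:
    mode: enabled          # correct in-cluster edits back to the release state
  upgrade:
    remediation:
      retries: 3           # retry a failed upgrade before giving up (illustrative)
```

Drift correction for Helm releases appears to be opt-in in current Flux versions, hence the explicit `mode: enabled`; check the helm-controller documentation for the version you run. A private repository would also need a `secretRef` on the `GitRepository`.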
The Real Cost of “Simplicity”
The Kamal pitch is: “You don’t need Kubernetes. It’s too complex. Just deploy to a VPS.”
And for a certain class of applications, that’s absolutely right. But the argument has a blind spot: the complexity doesn’t disappear, it just moves to you.
With Kamal, you need to:
- Set up your own monitoring to detect when containers crash
- Build your own alerting to wake you up at 3 AM
- SSH in manually and run `kamal deploy` or `kamal app boot` to restart
- Manage server health, updates, and security patches yourself
- Handle failover manually if a server dies
With Flux + managed Kubernetes (GKE, EKS, AKS):
- The cloud provider manages the control plane
- Kubernetes handles restarts, health checks, and rescheduling
- Flux handles drift detection and reconciliation
- You manage your app’s desired state in Git — that’s it
The question isn’t “is Kubernetes complex?” — yes, it is. The question is: is being the human self-healing mechanism at 3 AM simpler?
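
To make "Kubernetes handles restarts, health checks, and rescheduling" concrete, here is a minimal Deployment sketch. The registry host, port 3000, the `/up` endpoint, and the probe timings and resource requests are all assumptions for illustration; with a Helm chart, the chart's templates would render something similar.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-saas
  namespace: production
spec:
  replicas: 3                      # lost pods are recreated to keep this count
  selector:
    matchLabels:
      app: my-saas
  template:
    metadata:
      labels:
        app: my-saas
    spec:
      containers:
        - name: app
          image: registry.example.com/my-saas/app:v1.4.3   # assumed registry
          ports:
            - containerPort: 3000  # assumed app port
          readinessProbe:          # gates traffic and rollout progress
            httpGet:
              path: /up            # assumed health endpoint
              port: 3000
            periodSeconds: 5
          livenessProbe:           # repeated failures restart the container
            httpGet:
              path: /up
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 10
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
```

The readiness probe is what lets a rolling update stall on a broken release, and the liveness probe is what restarts a container that is running but no longer responding.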
Can a New SaaS Afford Downtime?
There’s a common belief that early-stage SaaS products can tolerate downtime because they’re small. The logic goes: “We only have 50 customers, it’s not like we’re running a bank.”
This logic is exactly backwards.
A mature SaaS with 10,000 customers can survive a 2-hour outage. They have brand trust, switching costs, annual contracts, and a status page that customers check patiently. You have none of that.
A managed GKE Autopilot cluster costs ~$70/month. One retained customer on a $29/month plan pays for that in three months. How many did you lose during that four-hour outage you didn’t notice?
And here’s the part nobody talks about: you’ll never know. A trial user who hits a 502 doesn’t file a support ticket. They close the tab and sign up for your competitor. There’s no metric for “customers who would have converted if the app had been up.”
“But I Don’t Want to Learn Kubernetes!”
That’s fair. Kubernetes has a real learning curve. But consider what “learning Kubernetes” means in 2026:
- You don’t run your own cluster. Use GKE Autopilot, EKS with Fargate, or AKS. The cloud provider manages nodes, upgrades, and the control plane.
- You write YAML manifests — Deployments, Services, Ingress. Maybe 5-6 files for a typical SaaS app.
- You install Flux: `flux bootstrap` takes 5 minutes.
- You push to Git, and that’s your deploy workflow from now on.
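
If you go with plain manifests rather than a Helm chart, the piece that connects a Git directory to the cluster is a Flux `Kustomization`. A minimal sketch follows; the `./deploy/production` path, the timings, and the `my-saas` source name are assumptions.

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: my-saas
  namespace: flux-system
spec:
  interval: 5m                # re-apply from Git on this schedule
  path: ./deploy/production   # assumed directory holding the manifests
  prune: true                 # delete objects that were removed from Git
  wait: true                  # wait for workloads to become ready
  timeout: 3m
  targetNamespace: production
  sourceRef:
    kind: GitRepository
    name: my-saas             # assumed to exist in the flux-system namespace
```

With `prune: true`, deleting a manifest from Git also removes the object from the cluster, so the repository stays the single source of truth in both directions.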
Yes, there’s an upfront investment. But compare that to the ongoing investment of being your own orchestrator: monitoring, alerting, manual restarts, server maintenance, failover planning — all of which Kubernetes handles for you.
The learning curve is a one-time cost. The operational burden of running without orchestration is ongoing.
Where Kamal Shines
None of this means Kamal is a bad tool. It’s excellent — for the right use cases:
- Home lab and self-hosting — Running Nextcloud, Gitea, Immich, or Plex on your own hardware? Kamal is perfect. If your media server goes down overnight, nobody churns. You fix it on Saturday.
- Internal tools — Dashboards, admin panels, CI runners — things used by your own team during business hours. Downtime is inconvenient, not costly.
- Prototypes and MVPs — When you’re validating an idea and don’t have paying customers yet. Move fast, deploy to a cheap VPS, worry about resilience when you have something worth keeping up.
- Side projects — Your personal blog, a hobby app, a weekend project. Kamal makes deployment fun again.
The common thread: the blast radius of downtime is you, not your customers.
Decision Framework
| Factor | Kamal | Flux CD + Kubernetes |
|---|---|---|
| Customers depend on uptime | No | Yes |
| Blast radius of downtime | You / your team | Paying customers |
| Team size for ops | Just you, and that’s fine | Just you, but you need sleep |
| Deploy model | Imperative (kamal deploy) | Declarative (push to Git) |
| Self-healing | No | Yes |
| Drift detection | No | Yes |
| Infrastructure cost | $5-20/mo VPS | $70-300/mo managed K8s |
| Learning curve | Low (SSH + Docker) | Medium (K8s + GitOps) |
| Ongoing ops burden | High (you are the operator) | Low (the cluster is the operator) |
Conclusion
Kamal and Flux CD are both excellent tools that serve different purposes. The question is not which is technically superior — it’s who notices when things break.
If the answer is “me, whenever I happen to check” — Kamal is fine. Your home lab, your internal tools, your side projects will love it.
If the answer needs to be “the system, immediately, at 3 AM, without waking anyone up” — you need Kubernetes and GitOps. Your SaaS customers are paying for uptime. Give it to them.
Don’t be the self-healing mechanism. That’s the cluster’s job.
Cover photo by Christian Stahl on Unsplash.


