Stop Demolishing the House to Change the Furniture: Why GitOps Beats Push-Based Deployment

You want to deploy a new version of your API. A one-line change: bump the image tag from v1.4.2 to v1.4.3. With a GitOps setup, you push to Git, and 30 seconds later the cluster has reconciled. With the push-based approach I keep seeing in production (Helm releases applied by Terraform), you run a pipeline that plans the entire infrastructure, evaluates every resource, waits for approval, and only then applies the change. That’s demolishing the house to change the furniture.
And yet, push-based deployment via Terraform remains remarkably popular. Managers like it because they can see a “Deploy” button in the CI/CD UI. Engineers default to it because Terraform is already managing the cluster. But the cost is real — in pipeline duration, in blast radius, in coupling, and in operational resilience.
How Push-Based Deployment Works
The typical push-based setup looks like this:
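(Sequence diagram, partially recovered: Terraform evaluates everything it manages (VPC, DNS, databases, Helm releases, secrets...) and reports "Plan: 1 to change"; CI waits for manual approval; CI runs terraform apply; Terraform runs helm upgrade my-api against the cluster; the image is updated to v1.4.3.)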
Terraform owns everything — the GKE cluster, the VPC, the Cloud SQL instance, the DNS records, and the Helm releases. To change anything, you run the entire pipeline. Even if only an image tag changed, Terraform must:
- Initialize — download providers, modules, and the state file
- Plan — evaluate every resource in the state against the real world
- Wait — someone clicks “Approve” (because you’re not auto-applying infrastructure changes, right?)
- Apply — execute the change
For a moderately sized infrastructure, this takes 5-15 minutes. For large setups with hundreds of resources, it can take 30 minutes or more. All that to update one container tag.
How GitOps (Pull-Based) Works
With Flux CD, the cluster watches your Git repository and reconciles the actual state to the desired state:
(Sequence diagram, partially recovered: the change is detected via polling or webhook; Flux pulls the latest manifests from Git; Flux applies the diff to the cluster, updating the image tag; the image is updated to v1.4.3; reconciliation complete ✓.)
No pipeline triggered. No infrastructure re-evaluated. No manual approval for a pre-reviewed Git commit. Flux saw the change in Git, computed the diff against the cluster, and applied only what changed. The entire cycle takes 30 seconds to 2 minutes.
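As a rough illustration of what "the cluster watches your Git repository" means in practice, the two objects below sketch a minimal pull-based setup. The repository URL, object names, path, namespace, and intervals are placeholders invented for this example, and the API versions are those served by recent Flux releases; adjust them to what your cluster actually runs.

```yaml
# Hypothetical example: URL, names, path, and intervals are assumptions.
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: my-api-config
  namespace: flux-system
spec:
  interval: 1m              # how often Flux polls Git for new commits
  url: https://github.com/example-org/my-api-config   # placeholder URL
  ref:
    branch: main
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: my-api
  namespace: flux-system
spec:
  interval: 5m              # how often Flux re-checks the cluster for drift
  sourceRef:
    kind: GitRepository
    name: my-api-config
  path: ./apps/my-api       # directory in the repo that holds the manifests
  prune: true               # remove cluster objects that were deleted from Git
  targetNamespace: production
```

The two intervals are independent: the first controls how quickly a new commit is noticed, the second how often the cluster is compared against the last fetched revision even when Git has not changed.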
The Real Cost of Push-Based Deployment
1. Everything Is Coupled
When Terraform manages both infrastructure and application deployment, they share a state file, a pipeline, and a blast radius. A bug in a Helm chart template can block a critical VPC change. A database migration timeout can prevent an unrelated app deployment from proceeding. Everything is serialized through one bottleneck.
With GitOps, Terraform manages infrastructure that changes rarely (networks, clusters, databases), and Flux manages workloads that change often (applications, configuration). Each has its own lifecycle, its own cadence, its own failure domain.
2. No Self-Healing
Push-based deployment is fire-and-forget. Terraform applies the change and walks away. If someone runs kubectl delete deployment my-api or a node failure causes pods to be rescheduled with stale config, Terraform doesn’t know and doesn’t care — it only checks state when you run it again.
Flux continuously reconciles. Every few minutes, it compares the cluster state to the Git repository and corrects any drift. Someone manually scaled your deployment to 1 replica? Flux reverts it. Someone applied a hotfix directly with kubectl? Flux overwrites it with what’s in Git. The Git repository is the single source of truth, and the cluster is a reflection of it.
This is not a theoretical benefit. In production, we’ve seen:
- A developer kubectl apply-ing a debug configuration that was never reverted. Flux caught it on the next reconciliation cycle.
- A CI/CD misconfiguration that deployed an old image tag. Flux reconciled back to the Git-declared version within minutes.
- A node replacement that started pods with cached old config. Flux reapplied the correct manifests.
3. Slow Feedback Loops
With push-based deployment, the feedback loop is:
1. Merge PR → 2. CI builds image → 3. CI triggers Terraform → 4. Terraform plans (5 min) → 5. Approval (?) → 6. Terraform applies (3 min) → 7. Deployment complete
Total: 10-20 minutes, assuming no queue and no approval bottleneck.
With GitOps:
1. Merge PR → 2. CI builds image → 3. CI updates Git manifest → 4. Flux reconciles (30 sec)
Total: the CI build time plus 30 seconds. The deployment itself is nearly instantaneous.
4. The “Click to Deploy” Illusion
Managers often prefer push-based deployment because it offers a visible control point: a button in Jenkins, a manual approval in GitHub Actions, a “Run Pipeline” in GitLab. It feels like control.
But what does that button actually control? It controls when a change is applied — not what is applied. The “what” was already decided when the PR was merged. The approval step is reviewing a Terraform plan that says “1 resource to change: helm_release.my_api” — which tells you nothing about whether the new image tag is safe to deploy. The real review happened during the PR.
GitOps replaces this theatrical control with actual control: the Git history. Every change is a commit. Every commit is reviewable, revertable, and auditable. Want to roll back? git revert. Want to know who deployed what and when? git log. Want an approval gate? Use branch protection and required reviewers on the GitOps repository.
The “Deploy” button is a security blanket. The Git log is a safety net.
5. The State File: A Locked Bucket That Developers Can’t Touch
Terraform state files are a liability for application deployment. They contain the full state of every managed resource — VPC IDs, database passwords, IAM bindings — so naturally, the state bucket is locked down. Developers don’t have access. They shouldn’t. That state file contains production infrastructure secrets.
But when Helm releases live in the same state file, you’ve just made application deployment a privileged infrastructure operation. A developer who wants to deploy their service can’t run terraform plan locally to debug a failed deployment. They can’t inspect the state to understand why Terraform thinks the Helm release is in a certain condition. They’re locked out of their own deployment pipeline by a security boundary that was designed for infrastructure, not applications.
The result: developers file tickets asking the platform team to “run the pipeline” or “check why the deployment is stuck.” The platform team becomes a deployment bottleneck — not because they want to be, but because the architecture forces it.
And when things go wrong, they go wrong in ways that are hard to debug from the outside:
- Terraform “knows” a Helm release is at version X, but the cluster has version Y because someone ran helm upgrade manually. The next terraform plan shows a confusing diff, or silently plans to revert the release to the stale state.
- The state file is locked because a previous pipeline run timed out. Nobody can deploy anything until someone with bucket access runs terraform force-unlock, which itself requires infrastructure credentials.
- A Helm release is marked as “failed” in Terraform state after a timeout, but the pods are actually running fine. Terraform wants to recreate the release from scratch. That’s the house demolition again.
Flux has no state file. Its state is the Git repository. Developers have full access. There’s nothing to lock, nothing to corrupt, nothing to force-unlock at 2 AM.
6. Helm Releases Stuck in Purgatory
This deserves its own section because it’s the most common firefight in push-based setups. Helm maintains its own release state as Kubernetes Secrets in the cluster. Terraform maintains a separate view of that state in its state file. When these two views diverge — and they will — you enter Helm purgatory.
Here’s what it looks like:
Error: UPGRADE FAILED: another operation (install/upgrade/rollback) is in progress
The previous helm upgrade timed out (maybe a readiness probe took too long, maybe the cluster was under load). Helm marked the release as pending-upgrade. Terraform doesn’t know what to do with it. The pipeline fails. Every subsequent pipeline run fails with the same error.
The fix? Someone with cluster access must manually run:
# Option A: Roll back to the last successful release
helm rollback my-api -n production
# Option B: If rollback fails too, nuclear option.
# This removes every stored revision of the release, so Helm loses its history for my-api.
kubectl delete secret -n production -l name=my-api,owner=helm
# Then re-run the pipeline and pray
This is infrastructure firefighting for what should be a routine application deployment. And it requires kubectl access to the production cluster, the very thing push-based deployment was supposed to abstract away.
With Flux, a stuck HelmRelease is self-correcting. Flux retries with exponential backoff, and if the release is truly broken, you fix the manifest in Git and Flux applies the fix. No manual helm rollback, no secret deletion, no “who has production kubectl access?” panic. If you need to intervene, flux suspend helmrelease and flux resume helmrelease give you clean control without touching Helm’s internal state.
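How Flux reacts to a failed install or upgrade is declared on the HelmRelease itself. The manifest below is a minimal sketch of that idea, not a recommended configuration: the chart name, the HelmRepository called my-charts, the version, the retry counts, and the image.tag value key are all assumptions made up for illustration.

```yaml
# Hypothetical example: chart, repository, version, and values are assumptions.
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: my-api
  namespace: production
spec:
  interval: 10m             # how often the release is reconciled
  timeout: 5m               # per-operation timeout for Helm actions
  chart:
    spec:
      chart: my-api
      version: "1.x"
      sourceRef:
        kind: HelmRepository
        name: my-charts     # placeholder repository object
        namespace: flux-system
  install:
    remediation:
      retries: 3            # retry a failed install up to three times
  upgrade:
    remediation:
      retries: 3            # retry a failed upgrade up to three times
      strategy: rollback    # roll back to the last good revision between attempts
  values:
    image:
      tag: v1.4.3           # assumes the chart exposes image.tag
```

The declarative counterpart of flux suspend helmrelease is setting spec.suspend: true on the object, which pauses reconciliation until the field is removed or set back to false.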
7. Helm Is for Operators, Not Your Apps
Here’s an unpopular opinion: most teams shouldn’t be writing Helm charts for their own applications.
Helm is a package manager. It exists so that third parties can distribute complex, configurable software — an Nginx ingress controller with 200 configuration options, a Prometheus stack with dozens of interdependent components, a database operator with custom CRDs. Helm’s templating engine ({{ .Values.replicaCount }}) makes sense when you don’t know the target environment.
But you know your environment. You wrote the app. You know how many replicas it needs, what ports it exposes, what environment variables it reads. Wrapping a straightforward Deployment + Service + Ingress in a Helm chart adds a templating layer that:
- Obscures what’s actually deployed. helm template output is what hits the cluster, but nobody reads it. Developers edit values.yaml and hope the templates do the right thing.
- Creates indirection for no benefit. Your templates/deployment.yaml is a Go template that renders into… a Deployment. You could just write the Deployment directly.
- Makes diffs unreadable. A PR that changes a Helm value shows a one-line values.yaml diff. Reviewers can’t see the actual manifest change without running helm template locally.
- Adds a packaging and versioning ceremony. Chart.yaml, chart versions, chart repositories or OCI registries: all ceremony for software that has exactly one consumer, your own cluster.
Compare this to plain Kustomize, which is what Flux uses natively:
# kustomization.yaml: what you see is what gets applied
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml
  - ingress.yaml
images:
  - name: my-api
    newTag: v1.4.3

# deployment.yaml: plain Kubernetes, no templating
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-api
  template:
    metadata:
      labels:
        app: my-api
    spec:
      containers:
        - name: my-api
          image: my-api # tag set by kustomization.yaml
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: 100m
              memory: 128Mi

No templates. No values files. No {{ if .Values.ingress.enabled }}. The manifests are real Kubernetes YAML that you can kubectl apply directly for debugging. Kustomize handles environment-specific overlays (dev/staging/prod) through patches, not through a templating language.
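To make the overlay idea concrete, here is a small sketch of what a production overlay might look like. It assumes a directory layout that the article does not specify: the kustomization.yaml and deployment.yaml shown above would live in a base/ directory, and each environment would get its own folder under overlays/. The replica count of 5 is likewise an arbitrary illustration.

```yaml
# overlays/production/kustomization.yaml
# Hypothetical layout: assumes the base manifests live in ../../base.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: production
resources:
  - ../../base
images:
  - name: my-api
    newTag: v1.4.3          # the tag can differ per environment
patches:
  - target:
      kind: Deployment
      name: my-api
    patch: |-
      # JSON 6902 patch: override only the field that differs in production
      - op: replace
        path: /spec/replicas
        value: 5
```

Running kubectl kustomize overlays/production prints the fully rendered manifests, so a reviewer can see exactly what would reach the cluster without a separate templating step.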
Use Helm for: ingress controllers, cert-manager, monitoring stacks, database operators — third-party software with complex configuration matrices.
Use Kustomize for: your own applications — where you control the source, the configuration, and the target environment.
When Terraform manages helm_release resources for your apps, you’re paying the complexity tax of Helm and Terraform and their state synchronization. With Flux + Kustomize, you’re writing plain YAML and pushing to Git.
8. Canary Deployments and Progressive Delivery
This is where the push-based model falls apart completely. Try implementing a canary deployment with Terraform and Helm. You’d need to:
- Deploy the canary version alongside the stable version (a second helm_release?)
- Shift a percentage of traffic to the canary (modify the Istio VirtualService or the Linkerd traffic-splitting resource… via Terraform?)
- Monitor error rates and latency for the canary (how? Terraform doesn’t watch metrics)
- Gradually increase traffic if healthy, roll back if not (Terraform has no concept of this)
- Promote the canary to stable and remove the old version
None of this is Terraform’s job. Terraform applies a desired state and walks away. It has no feedback loop, no metric analysis, no progressive rollout. You’d end up building a custom pipeline with dozens of steps, sleep timers, and curl commands to Prometheus — a fragile Rube Goldberg machine.
Flagger — part of the Flux ecosystem — was purpose-built for this. It’s a Kubernetes operator that automates progressive delivery: canary releases, A/B testing, and blue/green deployments. It integrates with Istio, Linkerd, Contour, Nginx, and other service meshes and ingress controllers.
Here’s what a canary deployment looks like with Flagger:
apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: my-api
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-api
  service:
    port: 8080
  analysis:
    # Run canary analysis every 60 seconds
    interval: 60s
    # Max number of failed checks before rollback
    threshold: 5
    # Canary traffic weight steps: 10% → 20% → 30% → 40% → 50%, then promotion
    maxWeight: 50
    stepWeight: 10
    metrics:
      - name: request-success-rate
        thresholdRange:
          min: 99
        interval: 60s
      - name: request-duration
        thresholdRange:
          max: 500
        interval: 60s
When you push a new image tag to Git, Flux applies it and Flagger takes over the rollout. Flagger watches real traffic metrics from Prometheus. If the canary version has a higher error rate or slower response times, it automatically rolls back: no human intervention, no “who’s watching the dashboard.” If the canary is healthy through all weight steps, it promotes and the old version is retired.
This is the kind of deployment sophistication that the Kubernetes ecosystem enables natively. It’s not bolted on — it’s built into the reconciliation model. Flux detects the change in Git, Flagger manages the progressive rollout, the service mesh shifts traffic, Prometheus provides the feedback signal. Every component does what it was designed for.
Try fitting that into a terraform apply.
When Push-Based Makes Sense
To be fair, push-based deployment isn’t wrong in all contexts:
- Non-Kubernetes targets — deploying to VMs, Lambda functions, or managed services that don’t have a pull-based reconciler
- One-time provisioning — initial cluster setup, database creation, network configuration (Terraform is excellent for this)
- Environments without persistent agents — ephemeral test environments that are created and destroyed by CI
The problem isn’t using Terraform — it’s using Terraform for ongoing application deployment to Kubernetes. Terraform is an infrastructure provisioning tool. Flux is a deployment reconciler. They solve different problems.
The Right Separation
The architecture that works:
Terraform provisions the house. Networks, clusters, databases, IAM, and the Flux Operator itself. These change infrequently — maybe monthly. A 15-minute Terraform plan is acceptable when you’re modifying VPC peering or resizing a database.
Flux arranges the furniture. Application deployments, configuration updates, secret rotations, scaling changes. These change daily. A 30-second reconciliation loop is not just acceptable — it’s expected.
Terraform installs Flux. Flux takes over from there. The handoff is clean: Terraform creates the FluxInstance resource (as described in our Flux Operator article), and Flux begins reconciling the cluster from the GitOps repository.
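For readers who have not seen the handoff object, the following is a minimal sketch of a FluxInstance, assuming the Flux Operator is already running in the cluster. The repository URL, the path, and the pull secret name are placeholders invented for this example; the referenced Flux Operator article remains the authoritative description of the setup.

```yaml
# Hypothetical example: URL, path, and secret name are assumptions.
apiVersion: fluxcd.controlplane.io/v1
kind: FluxInstance
metadata:
  name: flux
  namespace: flux-system
spec:
  distribution:
    version: "2.x"                  # track the latest Flux 2 minor release
    registry: "ghcr.io/fluxcd"
  components:
    - source-controller
    - kustomize-controller
    - helm-controller
    - notification-controller
  sync:
    kind: GitRepository
    url: "https://github.com/example-org/gitops-repo.git"   # placeholder URL
    ref: "refs/heads/main"
    path: "clusters/production"     # directory Flux reconciles from
    pullSecret: "flux-system"       # Secret holding Git credentials
```

Terraform only needs to apply this one object. Everything under the sync path is then owned by Flux, which keeps application changes out of the Terraform state.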
Common Objections
“But we already manage everything with Terraform”
That’s a reason to change, not a reason to stay. Terraform’s helm_release resource was designed for installing cluster add-ons (ingress controllers, cert-manager), not for managing application lifecycle. Using it for daily deployments means you’ve turned an infrastructure tool into a deployment pipeline — and gotten the worst of both worlds.
“GitOps means we lose visibility”
The opposite. terraform plan gives you visibility when you run it. Flux gives you visibility continuously. The Flux Operator’s built-in Web UI shows real-time sync status, reconciliation history, and resource health. Combined with Slack notifications on sync failures, you get better observability than a CI/CD dashboard.
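The Slack notifications mentioned here are themselves declared as Kubernetes objects. The sketch below shows one possible wiring, with assumed names throughout: the channel, the object names, and a Secret called slack-webhook that stores the webhook URL under the key address.

```yaml
# Hypothetical example: names and channel are assumptions.
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: slack
  namespace: flux-system
spec:
  type: slack
  channel: deployments
  secretRef:
    name: slack-webhook     # Secret with an "address" key holding the webhook URL
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: sync-failures
  namespace: flux-system
spec:
  providerRef:
    name: slack
  eventSeverity: error      # only forward failures, not routine reconciliations
  eventSources:
    - kind: Kustomization
      name: "*"
    - kind: HelmRelease
      name: "*"
```

Because the alert lives in the same Git repository as the workloads, changes to who gets notified go through the same review process as deployments.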
“What about approval gates?”
Git has had approval gates for decades — they’re called pull request reviews. Require two approvals on your GitOps repo. Add branch protection. Use CODEOWNERS. This is more granular than a “click to approve Terraform plan” gate, because reviewers see the actual manifest diff, not a Terraform plan output.
“Developers need to understand Kubernetes manifests”
They already do — or they should. Whether the manifests are applied by Terraform or Flux, someone writes them. The difference is that with GitOps, developers own the full lifecycle: write the manifest, submit the PR, get it reviewed, merge, and watch it deploy. No waiting for a platform team to “run the pipeline.”
Getting Started
If you’re currently using Terraform + Helm for application deployment and want to migrate to GitOps:
- Keep Terraform for infrastructure. VPCs, clusters, databases, DNS — Terraform is the right tool.
- Install Flux via Terraform. Use the Flux Operator as described in our previous article. This is the bridge.
- Move one app at a time. Remove helm_release from the Terraform code, add the equivalent HelmRelease to your GitOps repo, and verify that Flux reconciles it.
- Delete the Helm releases from Terraform state: terraform state rm helm_release.my_api. Do this before the next terraform apply. Applying with the resource removed from the code but still in state, or running terraform destroy, would delete the actual release.
- Repeat until Terraform only manages infrastructure and Flux manages all workloads.
The migration is incremental. You don’t have to move everything at once. But every app you move is one less reason to run a 15-minute Terraform pipeline for a one-line image tag change.
Stop demolishing the house. Just move the furniture.
Cover photo by Tamas Szabo on Unsplash.


