Agentic Coding Has No Floor
Vibe coding is what agentic coding decays into when you're tired or four hours in. The structural fix has to come from the harness vendors, not from another instruction file.
Vibe coding works for the first week or two. You describe what you want, the agent writes it, tests pass, you ship. A few weeks in, progress falls off a cliff. New prompts start breaking older features in ways that pass the obvious tests, but later surface in production.
Vibe coding is the version where you fully trust the agent, don’t read or only skim the code, and ship. Agentic coding is the version where you still read every diff, but the line between the two is a convention that decays when you’re tired, when the diff is large, or when you’re four hours in and the feature is almost done. So I’m treating vibe coding here as the failure mode of agentic coding rather than a separate thing (the disciplined version comes with its own skill atrophy issue either way).
The issue is structural, since coding agents have no equivalent of the source/generated-output boundary that a compiler gives us, and so prompt, code, tests, and previous agent output are all editable and all treated as input. The fix has to come from the harness vendors, in the form of a protected region the agent can read but can’t rewrite without an explicit human unlock, because another instruction file isn’t going to cut it. Until they ship the real thing, the workarounds are all a bit unsatisfying.
It’s tempting to read this as a problem that only kicks in once you have a real team or a serious codebase, but even the vendors selling these agents are starting to see its limits. From a recent interview:
if you close your eyes, and you don’t look at the code, and you have AIs build things with shaky foundations as you add another floor, and another floor, and another floor, things start to kind of crumble.
Michael Truell, Cursor CEO
I don’t want to cast blame on the users here (“professional” SWEs doing vibe coding is another story). The dream is real: a tool that lets you build production software without the years of engineering muscle memory it usually takes. The marketing says it’s safe and the product produces plausible work. The loop stays quiet until something breaks, and the dev forums are full of stories where it did: leaked secrets, runaway agents, silent regressions.
Even if you don’t use agents, or you always read the diffs carefully, you still have to deal with the consequences. It usually arrives as a vibe-coded PR or demo from a non-technical colleague that engineering then has to finish properly. It’s hard to be the engineer who always says no, especially when these colleagues are excited to contribute and think they made something good. The question is whether we want to fix it, control it, or ban it.
Why this fails
The agent reads both the prompt and the code, treating them as equally important since either can be changed at any time. This is different from a compiler, which operates in one direction. You write Go, it produces assembly, and there’s no confusion about which side to edit. If you change the Go file, the assembly gets regenerated next time. If you edit the assembly directly, the next compile silently overwrites your change.
Now, picture a compiler that is right 95% of the time. Sometimes it regenerates code in a different file you didn’t plan to modify, treating its previous output as input for the next run. Nobody reads the assembly because the main reason for trusting the compiler is that you don’t have to. So, when things go wrong, nobody notices. The compiler continues to treat its past output as if it were the source, causing errors to accumulate unnoticed.
If compilers operated this way, we would stop using them. But that’s the situation with the agent loop today. Both the prompt and the code can be changed, and both are seen as equally valid. The agent can’t tell which lines are meant to be permanent, which are temporary, and which are leftovers from a prompt made in another session. It edits whatever seems reasonable, and your original constraints fade away.
To make this concrete, say that in week 1 you ask the agent to add a payment flow and it does the right thing, e.g., a GDPR consent check before the charge and the amount bounded against the user’s daily cap:
if not user.has_consent("payments"):
    raise PaymentDenied("missing consent")
if amount <= 0 or amount > user.daily_cap:
    raise PaymentDenied("amount out of bounds")
You revisit the same function weeks later and ask the agent for a quick cleanup pass, and the result looks like this:
if amount <= 0:
    raise PaymentDenied("amount out of bounds")
if amount > user.daily_cap and not user.is_premium:
    raise PaymentDenied("amount out of bounds")
The tests still pass and the code is clean and readable, but the GDPR check is gone, and the fraud cap has been silently dropped for premium users without anyone asking for it.
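A minimal sketch of why the suite stays green, assuming it only covers an ordinary consenting user. The `User` dataclass, the `allowed` helper, and the names `check_week1` and `check_after_cleanup` are hypothetical stand-ins for illustration; only the guard logic is taken from the two versions of the function.

```python
from dataclasses import dataclass, field


class PaymentDenied(Exception):
    """Raised when a charge must be refused."""


@dataclass
class User:
    # Hypothetical stand-in for the real user model.
    daily_cap: int
    is_premium: bool = False
    consents: set = field(default_factory=set)

    def has_consent(self, scope: str) -> bool:
        return scope in self.consents


def check_week1(user: User, amount: int) -> None:
    # Original guards: consent first, then bounds for everyone.
    if not user.has_consent("payments"):
        raise PaymentDenied("missing consent")
    if amount <= 0 or amount > user.daily_cap:
        raise PaymentDenied("amount out of bounds")


def check_after_cleanup(user: User, amount: int) -> None:
    # Guards after the cleanup pass: no consent check, cap skipped for premium.
    if amount <= 0:
        raise PaymentDenied("amount out of bounds")
    if amount > user.daily_cap and not user.is_premium:
        raise PaymentDenied("amount out of bounds")


def allowed(check, user: User, amount: int) -> bool:
    """Return True if the given check lets the charge through."""
    try:
        check(user, amount)
    except PaymentDenied:
        return False
    return True


# Cases a typical suite covers: both versions give the same answer.
regular = User(daily_cap=100, consents={"payments"})
typical_cases = [(regular, 50), (regular, 0), (regular, 500)]

# Cases nobody wrote a test for: the two versions disagree.
no_consent = User(daily_cap=100)
premium = User(daily_cap=100, is_premium=True, consents={"payments"})
drift_cases = [(no_consent, 50), (premium, 500)]

for user, amount in typical_cases:
    assert allowed(check_week1, user, amount) == allowed(check_after_cleanup, user, amount)

for user, amount in drift_cases:
    assert allowed(check_week1, user, amount) != allowed(check_after_cleanup, user, amount)
```

Both loops pass. The versions are indistinguishable on the inputs most suites exercise and diverge only for a user without consent or a premium user over the cap.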
I’ve been calling this logic drift. The code shape is roughly the same, but an earlier constraint is subtly relaxed. An invariant becomes conditional, a guard gets moved a few lines down past the thing it was supposed to guard, an authorization check gets duplicated and one of the copies is wrong. The diff just says a guard moved. The source never stated that the guard was load-bearing, so the review never catches the moment it is no longer load-bearing.
This actually happened on the Linux kernel recently. A maintainer submitted a patch generated by an AI that removed a __read_mostly annotation. This annotation is a hint to the compiler about cacheline placement, and removing it causes contention on every multi-core system that the kernel ships to. On review, the line seemed like a simple cleanup, so the patch was accepted, and Torvalds later said that he would have viewed it differently if he had known it was written by AI. The source didn’t say anything about this attribute being load-bearing, nor did the patch.
The shape of a fix
The fix needs to be in the harness, the layer between the model and your filesystem (Cursor, Claude Code, Replit, an IDE plugin). The simplest implementation is a way of tagging a comment and the code immediately following it as human-owned, so that the agent can read it, reference it, and suggest a patch, but cannot apply the patch without the human unlocking it first. That puts the source/assembly boundary back into the code.
Protected regions like this are a really old idea. Code generators have used BEGIN USER CODE / END USER CODE markers for decades because rerunning the generator overwrites whatever you had hand-edited inside the generated file. Agentic coding has the same overwrite problem, except there’s no generator and no rerun, just an agent editing ordinary source files in the background. There’s no codegen template to put the markers in, so the lock has to live one layer up, in the harness itself.
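For readers who haven’t met these markers, here is a rough sketch of the generator side, assuming a toy generator whose output is a plain string. The `regenerate` function, the `#`-comment spelling of the markers, and the sample `handler` code are all illustrative assumptions; real generators each have their own marker syntax and merge rules.

```python
import re

# Matches one protected region, markers included. The marker wording
# follows the text above; the "# " prefix is an assumption.
USER_REGION = re.compile(
    r"# BEGIN USER CODE\n(.*?)# END USER CODE\n", re.DOTALL
)


def regenerate(previous: str, fresh_template: str) -> str:
    """Re-run a generator while carrying hand edits over.

    `fresh_template` is the newly generated file with empty regions;
    the n-th region is filled with the n-th region body from `previous`.
    """
    kept = iter(USER_REGION.findall(previous))

    def refill(match: re.Match) -> str:
        # Fall back to the template's own body if no earlier edit exists.
        body = next(kept, match.group(1))
        return f"# BEGIN USER CODE\n{body}# END USER CODE\n"

    return USER_REGION.sub(refill, fresh_template)


previous = (
    "def handler(event):\n"
    "# BEGIN USER CODE\n"
    "    audit(event)\n"
    "# END USER CODE\n"
    "    return old_dispatch(event)\n"
)
fresh = (
    "def handler(event):\n"
    "# BEGIN USER CODE\n"
    "# END USER CODE\n"
    "    return new_dispatch(event)\n"
)
result = regenerate(previous, fresh)
```

Everything outside the markers comes from the new template, and the hand-written `audit(event)` line survives because the generator copies it rather than rewriting it.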
In practice I’d lock two things: code the agent generated under constraints it will later forget, and code a human deliberately wrote because the exact logic matters. They mostly cover the same ground anyway, core business logic, security boundaries, the places where logic drift hurts the most. Either way, the agent can’t edit the region until the human unlocks it.
For that to work, the lock has to be explicit, written by a human, and stated once in a place the agent will always see. Annotations fit well: they sit next to the code they protect, they don’t execute at runtime, and existing tooling already knows how to extract them.
@prompt("""
gdpr art 6 - refuse charge if user.has_consent("payments") is false
fraud SLA: dont charge if amount<=0 or > user.daily_cap
pci needs log.info("charged", user=user.id, amount=amount) after stripe call
^ compliance keeps asking abt this dont remove
""")
def charge_card(user, amount, idempotency_key): ...
The human edits the prompt in an interface that hides the underlying code, and the @prompt decorator is the lock, so the agent regenerates the body from the prompt whenever it needs to touch the function. The prompt is the source, the body is the assembly. I don’t really care about this specific syntax, what matters is that the human-written constraint sits above the generated body and the harness treats it as the one the agent isn’t allowed to overwrite.
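On the Python side, such a decorator could be almost nothing. The sketch below assumes a hypothetical `prompt` decorator that stores the text on a made-up `__prompt__` attribute and returns the function unchanged; neither name is an existing API, and the enforcement would still live in the harness.

```python
from typing import Callable, TypeVar

F = TypeVar("F", bound=Callable[..., object])


def prompt(text: str) -> Callable[[F], F]:
    """Attach a human-written constraint to a function without changing it.

    Hypothetical sketch: the decorator name and the attribute are assumptions.
    """
    def attach(func: F) -> F:
        # Store the constraint where tooling can find it; behaviour is untouched.
        func.__prompt__ = text.strip()
        return func
    return attach


@prompt("""
gdpr art 6 - refuse charge if user.has_consent("payments") is false
fraud SLA: dont charge if amount<=0 or > user.daily_cap
""")
def charge_card(user, amount, idempotency_key):
    ...


# A harness could read the constraints back without running the body.
constraints = charge_card.__prompt__.splitlines()
```

The decorator runs once when the module is imported and adds no work per call, so the only cost of the lock is the text itself.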
If you’d rather keep reading the source directly, a # lock: comment does the same job one statement at a time, in the spirit of Python’s # type: or # pragma: no cover:
def charge_card(user, amount, idempotency_key):
    # lock: gdpr art 6 - refuse charge if no payment consent
    if not user.has_consent("payments"):
        raise PaymentDenied("missing consent")
    # lock: fraud SLA - reject amounts <=0 or above user.daily_cap
    if amount <= 0 or amount > user.daily_cap:
        raise PaymentDenied("amount out of bounds")
    invoice = build_invoice(user, amount, idempotency_key)
    metrics.timing("invoice.build", invoice.elapsed_ms)
    receipt = stripe.charge(invoice.token, amount)
    # lock: pci audit trail, compliance keeps asking, dont remove
    log.info("charged", user=user.id, amount=amount)
    return receipt
The # lock: comment locks itself and the syntax node immediately below, so attaching it to an if covers the whole block and attaching it to a single call covers just that line. The comment contains the motivation and is locked along with the code. From the harness’s point of view it’s the same idea, a region the agent can’t overwrite. The difference is whether you want the protected source of truth to be a prompt or the code itself.
Note that these solutions do not rely on the model to cooperate. The harness already sits between the agent and the filesystem. Before applying any patch, it analyses the file, determines where the locks are placed, and refuses all attempts to edit the spans containing the locks, unless of course they are explicitly unlocked by the user (not sure how this UI should behave).
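A minimal sketch of that check for Python files, using only the standard `tokenize` and `ast` modules. The function names `locked_spans` and `edit_allowed` are hypothetical, and the sketch assumes the harness knows which line range a patch touches; a real implementation would also have to handle patches that shift line numbers, and the unlock flow.

```python
import ast
import io
import tokenize


def locked_spans(source: str) -> list[tuple[int, int]]:
    """Return (first_line, last_line) for each `# lock:` comment plus the
    statement that follows it. Line numbers are 1-indexed and inclusive."""
    # Find lock comments with the tokenizer, so a "# lock:" inside a
    # string literal is not mistaken for one.
    lock_lines = [
        tok.start[0]
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type == tokenize.COMMENT and tok.string.startswith("# lock:")
    ]
    # Every statement in the file, ordered by the line it starts on.
    statements = sorted(
        (node for node in ast.walk(ast.parse(source)) if isinstance(node, ast.stmt)),
        key=lambda node: node.lineno,
    )
    spans = []
    for line in lock_lines:
        # The first statement starting below the comment is the locked node.
        target = next((s for s in statements if s.lineno > line), None)
        spans.append((line, target.end_lineno if target else line))
    return spans


def edit_allowed(source: str, first_line: int, last_line: int) -> bool:
    """Refuse an edit to lines first_line..last_line if it touches a locked span."""
    return all(
        last_line < start or first_line > end
        for start, end in locked_spans(source)
    )


SOURCE = '''\
def charge_card(user, amount, idempotency_key):
    # lock: gdpr art 6 - refuse charge if no payment consent
    if not user.has_consent("payments"):
        raise PaymentDenied("missing consent")
    invoice = build_invoice(user, amount, idempotency_key)
    receipt = stripe.charge(invoice.token, amount)
    # lock: pci audit trail, compliance keeps asking, dont remove
    log.info("charged", user=user.id, amount=amount)
    return receipt
'''

spans = locked_spans(SOURCE)
```

With this sample, `spans` covers the consent block (lines 2 to 4) and the audit log line with its comment (lines 7 to 8). An edit confined to lines 5 and 6 is allowed, while anything overlapping a span is refused regardless of what the model intended.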
What’s been tried
The first answer everyone reaches for is discipline (agentic coding is a trap): use the agent less, keep diffs small, review everything. This all works well right up until the tool itself drains any remaining self-discipline you might have. You pull the lever and a perfectly functional piece of code drops out of the app. Also, even if you have strong discipline yourself, you cannot enforce it on others (if you could, my mom would have ensured I actually did my homework).
Traditional engineering processes work well for humans, but don’t scale to the scope of agents. Requirements live outside the code and are not generally read by agents. Tests, types, and linters all give the agent rails to follow, but none of them says: don’t change this line, ever. Code review can catch some of the drift, but it’s a scale problem. Reviewing takes far longer than it takes an agent to spit out a new feature. Secondary studies on AI in software engineering are mapping out the same gap from the academic side.
The harness vendors themselves have caught up some too, but most of what they’ve shipped is still not hard constraints. Persistent memory survives sessions, skills bundle known procedures, code search has gone from grep to semantic indexing, and AGENTS.md files politely beg the agent not to touch certain functions. Cursor has project rules, Claude Code has hooks that can intercept tool calls, GitHub Copilot has custom instructions, and OpenCode has modes that can’t write to production files at all. I actually use a lot of it.

Spec-driven development is another interesting direction. The most common approach I’ve seen goes like this: you write a short description into a ticket, let the agent flesh it out, verify it by hand against your actual constraints, then circulate it to the team. Once the spec is right, the implementation can lean on unchecked agent edits without the usual cost, since the constraints are pinned in the document instead of left floating around in your head. Agile already taught us the problem with this approach, though: requirements written before code are usually wrong. The agent will then fill in whatever the spec missed with its own guesses, and you’ve locked yourself to a flawed plan.
Until then, micro repos
My prior on repo structure was Conway’s Law: systems end up shaped like the teams that build them anyway, so you might as well draw the repo boundaries to match the org from the start. Platform team gets a repo, payments team gets a repo, small company runs one monorepo. Going finer than that has always felt to me like friction without a real payoff. There is some empirical evidence for this too, in the mirroring hypothesis and in ownership studies on Vista and Windows 7. Split a repo and split the team with it.
Agentic coding has shifted my thinking on this somewhat, and I think it merits another look. Another repo acts as a strong barrier, and most harnesses will warn clearly when the agent wants to reach past it. This is obviously much coarser than a real locked region would be: there’s no way to lock just a region in one file, or even just one file. Still, it works, which is more than AGENTS.md or anything else above can really claim. I haven’t structured any repos with this in mind yet, but it’s worth thinking about. Micro repos also require architectural taste. If you draw the boundary in the wrong place you end up with a mess. So you need someone on the team who can spot the actual seams and keep doing it as the system grows.
So that’s roughly where I land. The harness vendors aren’t going to ship a real lock anytime soon, and until they do, the only boundary that reliably holds is one the agent can’t see or touch, which today mostly means smaller repos. Current solutions help, but only as advisory hints, not as the lock itself.