Code Velocity Labs Ltd AI-Native Software Manufacturing Doc. CVL-01 / Rev. 04 / United Kingdom

What are the problems with vibe coding in production?

Vibe coding is genuinely fast and the speed is real. The problem is the gap between working and safe, and how invisible that gap is until someone finds it for you. The Lovable CVE and the Bolt key leaks show what that costs.


Direct answer

The problems with vibe coding in production are structural: no persistent security posture between prompt sessions, missing access controls, and secrets shipped in client-side code. The gap between 'it works' and 'it is safe' stays invisible until someone finds it, as the Lovable CVE and the Bolt key leaks showed.

Vibe coding is genuinely fast. The speed is real and it’s not going away. The problem isn’t the tool. It’s the gap between “it works” and “it’s safe”, and how invisible that gap is until someone finds it for you.

In March 2025, a security researcher reported a critical access control flaw to Lovable, the vibe coding platform now valued at $6.6 billion with eight million users. The team confirmed receipt and did not act. When the researcher’s 45-day disclosure window expired at the end of May, the flaw went public as CVE-2025-48757: missing Row Level Security that left user records across more than 170 production applications readable, and in many cases writable, by anyone who asked.

A Q2 2026 security scan of apps built on Bolt found that 15% had shipped hardcoded API keys in client-side JavaScript. Stripe keys. Supabase service keys. In some cases, OpenAI API keys with live billing accounts attached. Not obscure edge cases: apps connected to real payment systems and real user data.
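A minimal sketch of how this class of leak can be detected in a built JavaScript bundle. It is not the method used in the scan cited above. The function name `find_client_side_secrets`, the pattern set, and the length thresholds are assumptions made for illustration, and a real scanner would carry a far larger rule set. The Supabase check assumes the service key is a JWT whose payload carries a `role` claim of `service_role`.

```python
import base64
import json
import re

# Illustrative patterns only. The prefixes follow the providers' published key
# formats; the minimum lengths are assumptions chosen for this sketch.
PATTERNS = {
    "stripe_secret_key": re.compile(r"\bsk_live_[0-9A-Za-z]{16,}"),
    "openai_api_key": re.compile(r"\bsk-[0-9A-Za-z_-]{20,}"),
}
# Three base64url segments; the first two start with "eyJ" (an encoded '{"').
JWT = re.compile(r"eyJ[0-9A-Za-z_-]+\.eyJ[0-9A-Za-z_-]+\.[0-9A-Za-z_-]+")


def _jwt_role(token: str):
    """Decode the (unverified) JWT payload and return its 'role' claim, if any."""
    payload = token.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore stripped padding
    try:
        claims = json.loads(base64.urlsafe_b64decode(padded))
    except ValueError:  # covers bad base64, bad UTF-8, and bad JSON
        return None
    return claims.get("role") if isinstance(claims, dict) else None


def find_client_side_secrets(bundle_text: str) -> list[tuple[str, str]]:
    """Return (kind, truncated match) pairs for anything that looks like a secret."""
    findings = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(bundle_text):
            findings.append((kind, match.group(0)[:12] + "..."))
    for match in JWT.finditer(bundle_text):
        # A JWT in the client is not itself a leak; a service_role one is.
        if _jwt_role(match.group(0)) == "service_role":
            findings.append(("supabase_service_key", match.group(0)[:12] + "..."))
    return findings
```

Run against every file in the client build output, an empty result is a necessary condition for shipping, not a sufficient one: a scan like this only catches key formats it already knows about.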

Neither of these is a story about AI being dangerous. They’re stories about what happens when a demo ships as a product.

Why does the gap exist?

Vibe coding tools are optimised for one thing: producing a working result from a conversational prompt. That’s what makes them fast. It’s also the structural reason they have a security problem.

Security isn’t something you bolt on at the end. It’s an architectural position you take at the beginning: how authentication boundaries work, how database access is scoped, how secrets are managed, how inputs are validated. An experienced engineer applies these instinctively, not by consciously thinking through each one but through habits built over years. The AI doesn’t have habits. It has no implicit security posture unless one has been explicitly defined in the context it’s working from.

A vibe-coded app assembled through a sequence of conversational sessions has no persistent context about what the system should never do. Each session starts fresh. Protections added in one conversation don’t carry forward to the next. The model generated code that worked. Nobody asked it to be secure, and it had no way to know that it should have been.

This is not an AI problem. It’s a deployment problem that AI made faster.

What does a factory do differently?

“Add a review step” is not the answer. A review step you add after the fact is exactly the kind of thing the Lovable team presumably thought they had.

“A Demo Isn’t a Factory” draws the distinction that matters here. A workshop produces impressive output that depends entirely on who’s holding the keyboard that day. A factory produces predictable output because the system enforces it, not just the person doing the building.

The practical version of this in a production codebase:

The security posture lives in the context rails, not in the prompt. Before any manufacturing run begins, the authentication strategy is defined, the database access model is defined, secret management is defined. The agent builds inside that envelope. It doesn’t decide it at runtime when it’s halfway through generating a database connection.
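One way to picture a posture that lives outside the prompt is as a fixed data structure that is rendered into every agent session. The sketch below is an illustration under assumptions, not a description of any particular tool: `SecurityPosture`, `POSTURE`, `build_prompt`, and every field value are hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: the posture cannot be changed mid-run
class SecurityPosture:
    """Decisions fixed before a build run. All names here are hypothetical."""
    auth_strategy: str
    rls_required: bool
    secret_source: str
    forbidden_in_client: tuple[str, ...]

    def as_context(self) -> str:
        """Render the posture as plain text for the agent's context."""
        lines = [
            f"Authentication: {self.auth_strategy}.",
            f"Secrets are read from {self.secret_source} only.",
        ]
        if self.rls_required:
            lines.append("Every table must enable Row Level Security.")
        for item in self.forbidden_in_client:
            lines.append(f"Never place {item} in client-side code.")
        return "\n".join(lines)


# Example values, assumed for illustration.
POSTURE = SecurityPosture(
    auth_strategy="server-verified session tokens",
    rls_required=True,
    secret_source="server-side environment variables",
    forbidden_in_client=("service keys", "payment secret keys"),
)


def build_prompt(task: str, posture: SecurityPosture = POSTURE) -> str:
    """Prefix the same posture to every task so no session starts without it."""
    return posture.as_context() + "\n\nTask: " + task
```

Calling `build_prompt("add a signup form")` and `build_prompt("add a billing page")` yields two prompts with an identical security preamble, which is the property a sequence of fresh conversational sessions lacks.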

The validation gate asks specific questions. Not “does it run.” Does Row Level Security exist and is it correctly scoped? Are secrets in environment variables, not in client-side code? Does the auth boundary hold under inputs the agent wasn’t explicitly prompted to consider? These checks run before the output advances. They’re not optional.
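A gate of this kind can be sketched as a set of named checks that must all come back clean before the output advances. The example below is a simplified illustration: `tables_missing_rls` and `run_gate` are hypothetical names, the check is a text scan of PostgreSQL migration SQL, and it only confirms that Row Level Security is enabled and that at least one policy exists per table. Whether a policy is correctly scoped needs tests against a real database, which this sketch does not attempt. It also assumes table names are written the same way (schema-qualified or not) throughout the migration.

```python
import re
from typing import Callable

# Simplified regexes over PostgreSQL DDL; real migrations may need a SQL parser.
CREATE_TABLE = re.compile(r'create\s+table\s+(?:if\s+not\s+exists\s+)?([\w."]+)', re.I)
ENABLE_RLS = re.compile(
    r'alter\s+table\s+(?:if\s+exists\s+)?([\w."]+)\s+enable\s+row\s+level\s+security',
    re.I,
)
CREATE_POLICY = re.compile(r'create\s+policy\s+(?:"[^"]+"|\w+)\s+on\s+([\w."]+)', re.I)


def _name(raw: str) -> str:
    """Normalise a table identifier for comparison."""
    return raw.replace('"', "").lower()


def tables_missing_rls(sql: str) -> list[str]:
    """List created tables that lack either ENABLE ROW LEVEL SECURITY or a policy."""
    created = {_name(t) for t in CREATE_TABLE.findall(sql)}
    enabled = {_name(t) for t in ENABLE_RLS.findall(sql)}
    with_policy = {_name(t) for t in CREATE_POLICY.findall(sql)}
    return sorted(created - (enabled & with_policy))


def run_gate(checks: dict[str, Callable[[], list[str]]]) -> dict[str, list[str]]:
    """Run every check and return only those that reported problems."""
    failures = {}
    for name, check in checks.items():
        problems = check()
        if problems:
            failures[name] = problems
    return failures


# Hypothetical migration: 'profiles' is protected, 'invoices' is not.
MIGRATION = """
create table profiles (id uuid primary key, owner uuid);
alter table profiles enable row level security;
create policy "owner reads own row" on profiles for select using (auth.uid() = owner);
create table invoices (id uuid primary key, owner uuid);
"""

failures = run_gate({"rls": lambda: tables_missing_rls(MIGRATION)})
can_advance = not failures  # False here: 'invoices' has no RLS
```

The design point is that `can_advance` is computed by the system from the artefacts themselves, so a build cannot proceed simply because the code runs.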

The agent doesn’t assess its own output. Claude Code is not a conversational prompt tool; it is an agent with terminal access, file system access, and the ability to run tests and read the results. Even so, it is still a workshop without structured gates around it. The factory adds a review layer that operates at the system level, not just the module level: structural coherence across the whole build, not just “the function I just wrote looks right.”

The Lovable and Bolt failures aren’t failures of the AI. The model did what it was asked. The factory wasn’t there.

What is the useful question?

If your app was built with a conversational AI tool and hasn’t had an independent review, the question isn’t whether vulnerabilities exist. Going by the 2025 and 2026 evidence, the more useful question is which ones, and whether you find them before someone else does.

If you want an independent eye on a vibe-coded or AI-assisted codebase before it becomes a problem, that’s exactly what our code review and audit service is built for. We’ve set out what that review checks, area by area, in our guide to AI code review.
