The 5 myths of the agentic coding apocalypse



Tharon Green/ZDNET/Getty Images



ZDNET’s key takeaways

  • You face real maintenance and sustainability issues when ceding coding control to AI.
  • Having AI agents write your code is a lot like having human contractors write it.  
  • These best practices will help you get back from AIs what you asked for.

There are two prevailing narratives about vibe coding. The first is that you can write a single sentence, and the AI will give you back a million-dollar app. The second is that since the AI is writing all the code, humans have no idea what’s inside it. It must, therefore, eventually fail and cause a large-scale apocalypse.

Both of these narratives are caricatures of reality. In previous articles, I’ve talked about my work on a variety of vibe-coded projects. We’ve looked at how they’re both amazing and a lot of work. In this article, I’m going to dive deep into the maintenance and sustainability questions that come from ceding coding control to a machine.

Also: The case against an imminent software developer apocalypse

When I was a young product manager, I was sent down to Los Angeles to support our sales VP. He decided to take me to one of his favorite restaurants. This restaurant specialized in fusion cuisine, which meant the chef would mix a lot of different influences into his food. It had a reputation for its chef’s special, which was whatever the chef decided to create for you that evening.

I remember wondering just what I’d gotten myself into. I knew that I’d get food, but I had no idea what I would be expected to ingest. As it turned out, the food we ate that night was…weird. It was edible. It was not someplace I’d go again voluntarily.

Agentic coding is a lot like going to that restaurant. You know that the reputation of the coding AI you’re using is good, but you really have no idea what’s going to be delivered to you. You have little insight into the actual code coming from the AI. You’re basically going to have to eat it, regardless of what you’ve been served.

Also: I used Gmail’s AI tool to do hours of work for me in 10 minutes

When you have agents writing your code, it’s like having a bunch of contractors or subordinates writing your code. Until you test and evaluate it, you have no idea what you’ll get.

Everything is predicated on your prompt. Garbage-in, garbage-out has a much deeper meaning than the old hackneyed phrase would imply. If you don’t prompt clearly enough, and you don’t maintain the conversation with enough clarity and oversight, the code you’ll get back from the AI will be hard to stomach.

1. The myth of lost control

Engineering managers have faced the challenge of managing contractors under their supervision since the days of the pyramids. Assigning work and evaluating the work product is what engineering managers do. Maintaining quality and control in that process is at the core of software engineering.

Also: Implementing AI into software engineering? Here’s everything you need to know

On the other hand, while much of the vibe-coding doom and gloom is hyperbole, there's truth in it, too. Without quality standards and practices, you could end up with problematic code. The sections below walk through the myths surrounding agentic coding and the best practices that will help you get back from AIs what you asked for.

Many AI coding advocates recommend providing the AI with deep, rich requirements documents. However, my experience is that the AIs can misinterpret one single element of that deep document and go completely off the rails in ways you can’t trace or find.

I prefer to give the AI one simple task. Once that has been successfully completed, I give it another. That way, there's less of an opportunity for either the AI or me to lose track of the overall plan.

As a sole developer, I used to write code line by line. I sweated each and every line. I knew everything about my code. But when I was an engineering manager, I had to rely on my teams and the individual developers on my teams.

Also: I built two apps with just my voice and a mouse – are IDEs already obsolete?

Sure, we had coders (roughly the equivalent of agents). But I still needed to build testing and integration discipline into the process, to be sure that what any one of our coders or contractors submitted worked with everything else.

If you’re going to use agentic coding, you’ll need to do the same. Checkpoints at every stage. Carefully track the integration. Assume you’re taking delivery from outside contractors, and therefore need to check their work before incorporating it into your main project.

2. The myth of real-world readiness

I have a friend whom I dread sharing my software projects with. No matter how carefully I’ve designed and tested my code, the minute I give it to him to run, it breaks.

That’s because he uses the code without my curse of knowledge. I know what my code should do. I know how the program should work. I build the code to do that. My friend, however, does not have that internal map in his head. He just uses it. In the course of using it, he always tries something I never thought anyone would do. The code breaks.

My buddy is a textbook illustration of why automated testing systems are fragile. Sure, automated tests can help you determine whether a recent fix broke something else. But because you're pre-planning the tests, you're bound to miss something that an outsider, free of the curse of knowledge about your project's complete spec, would inevitably find.

In some ways, AIs can serve the role of the untrained friend. They can be instructed to try various tests to see whether the code survives the encounter. But when AIs are asked to build their own set of tests, they are limited by whatever perspective they bring, or are prompted with, going into the project.

Also: Anthropic’s new Claude Security tool scans your codebase for flaws – and helps you decide what to fix first

Most unit tests examine what are called "happy paths," the paths developers know and expect the code to take. But those same unit tests often overlook edge cases. When an AI builds unit tests, those tests often inherit the same blind spots as the human-written ones.
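To make that concrete, here's a minimal Python sketch. The `parse_price` function is hypothetical, standing in for the kind of routine an AI might generate; its one happy-path test passes cleanly while an outsider's inputs break it almost immediately.

```python
# Hypothetical routine of the kind an AI might generate: parse "$1,234.56".
def parse_price(text):
    return float(text.replace("$", "").replace(",", ""))

# The happy path -- the one test a developer (or an AI) writes first.
assert parse_price("$1,234.56") == 1234.56

# Inputs an outsider would try within minutes of using the program.
edge_cases = ["", "   ", "free", "$1,2,3,4", "$-0.01", None]
failures = []
for raw in edge_cases:
    try:
        value = parse_price(raw)
        if value < 0:  # a negative price slipped through silently
            failures.append(raw)
    except (ValueError, AttributeError):
        failures.append(raw)  # crashed instead of failing gracefully

print(f"{len(failures)} of {len(edge_cases)} edge cases break parse_price")
```

The happy-path assertion tells you the function works; the edge-case loop tells you how it fails, which is the part pre-planned tests tend to miss.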

Test environments are also not real-world environments. There are more than 20,000 websites running my security software. You would not believe some of the problems users have reported. Problems range from legitimate bugs to spending days on support messages only to find out the software doesn’t work because the user never installed it.

Many coding managers base their assumption of correctness on reports from diagnostic and testing systems. Getting good coverage and performance metrics from tests that suffer from the curse of knowledge can easily mask real-world issues.

Here’s the real cost. When the failures are discovered during integration and deployment rather than in development, debugging complexity and expense can increase considerably.

To overcome this problem, test like an outsider. Incorporate adversarial test practices. When prompting, require edge-case, failure-mode, and misuse scenarios as part of every test plan. Assign people and/or AIs to intentionally misuse and abuse the code without guidance, simulating real users with no internal context. Build in instrumentation for unexpected behavior. Code everything for failure and error correction.
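One cheap way to simulate a user with no internal context is random fuzzing. The sketch below is not a production harness, just the idea: `parse_port` is a hypothetical happy-path-only function of the kind an AI might produce, and the fuzzer throws unplanned inputs at it the way an outsider would.

```python
import random
import string

def fuzz(target, runs=1000, seed=42):
    """Feed random, format-ignorant inputs to a function and
    collect every input that makes it raise an exception."""
    rng = random.Random(seed)  # seeded so crashes are reproducible
    crashes = []
    for _ in range(runs):
        s = "".join(rng.choice(string.printable)
                    for _ in range(rng.randint(0, 20)))
        try:
            target(s)
        except Exception as exc:
            crashes.append((s, type(exc).__name__))
    return crashes

# Hypothetical AI-generated function with a happy-path-only design:
# it assumes every input looks like "host:port".
def parse_port(text):
    return int(text.split(":")[1])

failures = fuzz(parse_port)
print(f"{len(failures)} of 1000 random inputs crashed parse_port")
```

Even this crude harness surfaces the failure modes (missing separators, non-numeric ports) that a spec-driven test plan quietly assumes away.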

If you’re using agentic coding, remember that your project is never done. It’s just in a state of good enough to test. Expect breakage. Build corrective processes into your management structure and into the code itself.

3. The myth of inherited code

Throughout my years in the software industry, both as an employee and a business owner, one of my core competencies was acquiring rights to software intellectual property.

With the exception of the two Apple apps I’m vibe coding right now, every product I’ve brought to market was originally coded by someone else. There was benefit to this. Most products came with an existing customer base and a pre-built deep understanding of the core application.

But acquired products also come with challenges. There are usually reasons why the software IP is available for acquisition. There can be technical debt inside the software, where something doesn’t work. Market changes can make the software less valuable. There can be (and this was a big driver for my acquisitions) a great deal of weariness on the part of the original developers. They no longer want the responsibility for maintenance and support.

Also: I asked 5 data leaders about how they use AI to automate – and end integration nightmares

I usually inherited black boxes. The code was crafted by other people and teams. In order to improve it, maintain it, and just keep it from exploding in my face, I had to somehow absorb code with all sorts of secrets hidden inside. It’s kind of like buying a house without a home inspection, only to find faulty wiring and broken pipes inside the walls.

This is what every AI coding experience will be like. By definition, the code is not written by human developers. The AIs construct an entire black box. You just hope it will run.

Don’t give up on the process. I’ve made a career shipping code I didn’t fully understand on acquisition. Early on, I had to rely on my programming teams to figure it out. Later, as I dove into a solo developer career, I systematically learned segments of the code, working my way through function after function, often driven by a desire to add a feature or fix a customer-reported problem.

4. The myth of maintenance debt

Since the AI isn’t human, design decisions and code structure aren’t built with humans in mind. If there are problems, debugging consists of a combination of reverse engineering the AI’s work and cajoling the AI to fix something it might not have fully understood to begin with.

Code built by an AI often lacks consistent intent, structure, and architectural coherence. That makes for a shaky foundation, so anything built on top of it afterward becomes a patchwork of disparate elements cobbled together. Likewise, naming conventions and patterns can vary widely across AI-generated components. The result: changes and updates cascade into unexpected bugs across loosely related areas.

After a few days of working with Claude on a new iPhone app, I decided to take a look at the file structure. It was completely incoherent. The AI had decided to place files wherever it wanted. It named them whatever it seemed to want to name them. As for structure, it didn’t group anything. It was all a giant pile of files in one main directory.

Also: 10 things I wish I knew before trusting Claude Code to build my iPhone app

It doesn’t have to be that way. I instructed the AI to clean up after itself, and it did. It took a few tries to create a file structure pattern that made sense. It took a few other tries to immortalize that practice into startup instructions. Thinking like a manager helped me wrangle my digital subordinate.
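For illustration, standing instructions of that kind can be as short as a few bullet points in the file your coding agent reads at startup. The directory and file names here are purely illustrative, not from my actual project; adapt them to your own layout.

```markdown
# Project conventions (read before every task)

- Put view code in Sources/Views/, models in Sources/Models/,
  and networking in Sources/Services/.
- Name each file after the single type it contains (SettingsView.swift).
- Never create files in the project root.
- After any task that adds files, list what you created and where.
```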

Likewise, there are times when careful code reviews will get you quite far. My advice for the agentic coding era we’re entering is to use multiple agentic AIs based on different large language models. Not multiple agents, but multiple AIs. Have one model code-review the other model’s work. Let one model be the maker. Let the other be the evaluator.

This approach won’t solve every problem. I’ve used Claude Code and OpenAI’s Codex to check each other’s work. I’ve been quite pleased about how, with careful coordination on my part, they keep each other fairly honest.
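The coordination itself is simple enough to sketch. In this Python outline, `stub_maker` and `stub_evaluator` are placeholders so the loop runs standalone; in practice, each would be a call to a different model's API, which is an assumption of this sketch, not a prescribed integration.

```python
def review_loop(make, evaluate, task, max_rounds=5):
    """Maker/evaluator cycle: one model writes, a different model reviews.
    Stops when the evaluator returns no feedback, or after max_rounds."""
    code, feedback = None, None
    for round_num in range(1, max_rounds + 1):
        code = make(task, feedback)   # maker produces or revises the code
        feedback = evaluate(code)     # evaluator critiques it
        if not feedback:              # empty critique == approved
            return code, round_num
    return code, max_rounds

# Stubs so the sketch is runnable; real calls would go to two different LLMs.
def stub_maker(task, feedback):
    return "v2" if feedback else "v1"

def stub_evaluator(code):
    return "" if code == "v2" else "missing input validation"

code, rounds = review_loop(stub_maker, stub_evaluator, "parse a config file")
print(code, rounds)  # the stubs converge on round 2
```

The important design choice is that the evaluator never edits the code itself; it only critiques, so the two models can't quietly merge into one blind spot.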

5. The myth of vulnerability-free output

If you think about it, as soon as you combine testing fragility with maintenance debt, you can’t help but get security blind spots. Poorly written code with inherent failure points is a recipe for a perfect storm of security issues.

In some ways, it’s even worse with AI. AI coding models have been trained from information available on the public internet. That includes a tremendous amount of faulty code and bad advice. Programmers have been posting on the internet longer than any other professional group, so the scope and range of that knowledge base might be bigger than just about any other topic area.

Also: 7 AI coding techniques I use to ship real, reliable products – fast

Since most open source code has been posted to GitHub, the underlying software that most of the world runs on has also been available to the models for training. The gotcha? That code doesn't always work, regularly has bugs and vulnerabilities, sometimes has only been tangentially tested by a lone developer, and is often accompanied by comments from coders who make mistakes. Lots and lots of mistakes.

Models, therefore, may well reproduce insecure coding patterns learned from public data. Input validation and sanitization gaps allow subtle exploit vectors. I was shocked to discover that the AI that I was using to work on my security product did absolutely zero input verification, completely ignoring best practices. Once I adjusted my human assumptions and told the AI to properly check inputs, the validation routines were better than I had written on my own. But I had to diligently instruct the AI to make things secure.
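As a sketch of what "properly check inputs" can mean in practice, here's a strict allowlist validator in Python. The field name, character set, and length limits are illustrative assumptions, not from my actual product; the pattern to copy is rejecting everything that doesn't match, rather than trying to blocklist the bad stuff.

```python
import re

# Allowlist: only letters, digits, underscore, dot, hyphen; 3 to 32 chars.
USERNAME_RE = re.compile(r"^[A-Za-z0-9_.-]{3,32}$")

def validate_username(raw):
    """Reject anything outside a strict allowlist before it gets
    near a database query or a filesystem path."""
    if not isinstance(raw, str):
        raise ValueError("username must be a string")
    candidate = raw.strip()
    if not USERNAME_RE.fullmatch(candidate):
        raise ValueError("username must be 3-32 chars: letters, digits, _ . -")
    return candidate

print(validate_username("  alice_01  "))  # -> alice_01
```

An injection attempt such as `"a; DROP TABLE users"` fails the allowlist and raises, instead of passing through to whatever consumes the value downstream.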

AIs are also likely to incorporate libraries that seem to fit the problem without checking the supply chain for downstream vulnerabilities or issues. Since AIs generate code far faster than we humans can double-check it, there's a good chance errors will be introduced that we just can't catch at meatspace speed.

Also: The new rules for AI-assisted code in the Linux kernel: What every dev needs to know

Add to that the rapidly expanding piles of code made possible by code generation, which takes hours instead of months, and you have a giant time bomb of unverified and potentially unsafe code.

Keep in mind that human-generated code can also be a security nightmare. AIs can sometimes help fix that problem.

My hosting provider recently informed me that the open source anti-registration spam blocker I was using had a severe vulnerability. The author wasn’t available to make fixes. I had my AI examine the code, identify the vulnerabilities, and create a fresh code module that did not contain those vulnerabilities.

Using a separate AI to validate the first AI’s code uncovered a few more things that needed fixing, which I then had the first AI fix. We repeated the cycle until no further errors were found. It’s been months since I installed that new code on my server. In that time, the hosting provider hasn’t red-flagged the new implementation.

Think like a general contractor, not a craftsperson

Perhaps because I spent most of my software career working on code acquired through IP acquisitions or produced by contractors and employees, I’m not freaked out by the fact that AI-written code is a big black box. It’s just that you have to use different skills.

Many cautionary articles on vibe coding contend that although the code writing period of the software lifecycle is wildly compressed, the debugging and maintenance periods have expanded to account for the mess AIs deliver in their code, often after it’s been shipped to users.

Also: How to actually use AI in a small business: 10 lessons from the trenches

There is truth to this worry, but it’s really no different than working with acquired code or code written by contractors. Engineering managers have been dealing with these issues for decades. Good software engineering, planning, and management practices are designed to overcome the problem of contractor opaqueness. It just requires discipline, training, and experience.

This all goes back to the premise that AI isn’t a magic bullet. You’re never going to give a one-line prompt and create a million-dollar product. You have to work it. You can shorten time-to-market. You can use AI to help support maintenance. You can use AI to find and fix security vulnerabilities. You can have fun with AI.

Just remember that the AI is a tool, and you are the professional. You need to manage, delegate intentionally, and test voraciously. If you do, you’re likely to find that you can avoid a vibe coding apocalypse and create solid outcomes.

What’s your take on agentic AI in software development? Let us know in the comments below.


You can follow my day-to-day project updates on social media. Be sure to subscribe to my weekly update newsletter, and follow me on Twitter/X at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, on Bluesky at @DavidGewirtz.com, and on YouTube at YouTube.com/DavidGewirtzTV.




