Developing Software with AI (in 2026)

AI is transforming how we write software, but maintaining code quality requires structured workflows. This article explores practical strategies for AI-assisted development in 2026, including context engineering with dedicated files, iterative plan-based implementation, chunked reviews, and tools like MCP servers and AGENTS.md files that enhance the development experience while keeping us developers engaged and in control.

There is a huge range of opinions about using agentic AI for software development. While there are still some traditionalists who insist on using their own hands and, more importantly, brains to write software, most developers have started to integrate AI into their development process. Even non-developers are now capable of writing software using a vibe-coding approach. The quality of the resulting code strongly depends on the task, programming language, environment, model and also prompting skills. Looking at the rapid developments of the last year, or even the last few months, I think we can all agree that our job is changing and that we as developers need to stay on our toes to avoid being left behind.

Reviewing Code

We developers at esveo also have different opinions on AI use, but there is one thing that unites us (and I think that the wider developer community would agree): We love quality code, so AI-generated code should always be reviewed; at least twice.

Creating and reviewing pull requests has been the norm for a while now, and a developer creating a pull request should always have read the code they want to merge before handing it off for review. This is nothing new either, as copy & paste existed long before AI. But with development workflows changing, there is room for slip-ups. If you have the AI create huge amounts of changes within one session, there is a real danger that some changes slip through without proper review. IDEs already help by showing the exact diff caused by the AI, but it is all too easy to just accept the changes, moving the file out of your direct radar.

Work as a developer has already shifted from writing code yourself to orchestrating the AI and reviewing generated code. Code review is not always fun, so we need workflows that keep us engaged and that ensure that the generated code is as close to the expected result as possible; nobody likes to review bad code. (To my colleague Jonathan, who had to review the results of my personal AI slip-up: I'm really sorry you had to work through that, but thanks for having my back! 🙂)

Iterating on one-shot AI results

If we use the AI to build a new feature or refactor existing code in a single go, many files will be changed, and reviewing a big change is difficult. You also have to decide whether you want to iterate on the changes before or after committing everything to source control. When iterating on uncommitted changes with the AI, things can go wrong, and you might want to revert to the first result the AI produced. On the other hand, if you commit the changes, the AI might lose the context of the full change and treat the newly created but committed code as given, getting stuck within the bad or mediocre framework it built up itself in the first place.

In the beginning was the Prompt

Before iterating on any results, let’s go back to the start.

Prompt engineering, or nowadays rather context engineering, is somewhere between art and science. Providing the proper context and instructions to an AI will drastically improve the results, but the type of context that is needed to achieve this heavily depends on the task, the model and probably the weather. Creativity is important, but there are also strategies that have proven effective.

Usually when implementing a new feature for an existing project, you will have a basic idea where that feature will live, how it will connect to existing code, and maybe you’ll even have certain edge-cases in mind with possible solutions. The more experienced you are as a developer, the more useful input you can provide to help the model achieve your goal. We are not yet obsolete as humans and should leverage our experience.

Instead of just writing the prompt to the AI, create a new file where you save your prompt. This helps you structure your thoughts and also gives you a useful reference later on. That file can be as simple as a MyFeature.prompt.md file in your project. Using markdown is beneficial because the AI understands it, and it provides a good framework for structuring the prompt in a clean way.

I have had good experiences with the following rough guidelines:

  • Explain the business feature and how the feature should be used in the end
  • Be specific about workflows, provide user stories that explain the intended workflows
  • Keep technical details in a separate section, if you want to provide them
  • Reference files in the workspace. You are not required to use properly formatted markdown file links; the AIs are pretty good at resolving file, function or class names automatically.
  • Explicitly restrict the scope where applicable. Sometimes the AI is a bit eager to just finish or build everything that is adjacent to your feature

Feel free to use AI to iterate on the business specification itself. Sometimes the AI spots inconsistencies or poses interesting questions early on.
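Such a review prompt does not need to be elaborate; something along these lines (purely illustrative) usually works:

Review the specification in MyFeature.prompt.md. Point out inconsistencies, unclear requirements or missing edge cases, and ask any open questions. Do not implement anything yet.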

In structure, a prompt file could look something like this:

# Business Specification

Allow users to archive TODO-List items. As a user I want to be able to remove (but not permanently delete) items from my TODO list.

Interactions:
- In the main list view, there should be a link to another view that shows all archived items
- An item can be archived or restored through the context menu on the item.

# Further Effects

- The user should no longer receive any notifications (e.g. for deadlines) on archived items.
- Archived items cannot be edited through the UI. To edit an archived item, the user needs to restore it first.

# Technical Details

- Add a boolean field `isArchived` to the `TodoListItem` entity.
- Instead of adding a new endpoint, add a parameter to the existing `items/get` endpoint
- Create new end-to-end tests for the new functionality and UIs

# Scope Restriction

- Do not create new permissions for the operation to archive/restore items. For now use the same permission that is used for deletion for both operations.

You can already see that, while this is a rather simple example, the feature will span multiple layers of the application, including the database, backend logic, APIs and the frontend UI.

Working with a plan

Once you are happy with the prompt, you could just have the AI implement everything, but then again, there would be a huge block of changes waiting for careful review.

One way to simplify review and iteration is not to work with the resulting changes, but to instead work out and iterate on a step-wise implementation plan. You don’t need to write this yourself; you can let the AI draft the plan, using the prompt file as input.

Again, instead of drafting the plan within the chat session, I prefer working with a plan file, e.g. MyFeature.plan.md, that lives directly beside the prompt file. As the plan grows bigger and as you iterate on it, the chat sometimes grows beyond its context window. And although IDEs are getting better at hiding it when the context window runs out, sometimes the AI loses track of details from way back in the conversation. With a plan file, nothing is lost.

Drafting a plan allows us to iterate on the implementation steps before a single line of code is written. Current models tend to automatically include code snippets that serve as drafts for the planned implementation. These snippets make it really easy for us to see whether the AI is going in the right direction. Instead of refactoring a big set of changes, we iterate on a single file until we are happy with the resulting draft.
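To make this more concrete, a generated MyFeature.plan.md for the archiving feature above might start out roughly like this (a shortened, illustrative sketch with code snippets omitted; names such as `includeArchived` are made up here, and the exact structure will vary by model and project):

# Implementation Plan: Archive TODO-List Items

## Step 1: Data model
- [ ] Add the boolean field `isArchived` (default: false) to the `TodoListItem` entity
- [ ] Create the corresponding database migration

## Step 2: Backend logic and API
- [ ] Add an `includeArchived` parameter to the existing `items/get` endpoint
- [ ] Implement archive/restore operations, reusing the existing deletion permission
- [ ] Exclude archived items from notification handling

## Step 3: Frontend UI
- [ ] Add archive/restore entries to the item context menu
- [ ] Add a view for archived items, linked from the main list view

## Step 4: End-to-end tests
- [ ] Cover archiving, restoring and the archived items view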

As a last step before heading into implementation, I like to start a fresh AI session and let the AI review the plan and how well it matches the initial prompt. After several rounds of iteration, the plan may have started to diverge from the initial prompt. This is another reason why working with prompt and plan files is beneficial: the AI can quickly parse both files and locate discrepancies.
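That review prompt can again be kept simple; something like the following (illustrative) is usually enough:

Compare MyFeature.plan.md against the requirements in MyFeature.prompt.md. List any discrepancies, missing requirements or steps that go beyond the requested scope. Do not change the plan or any code yet.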

Chunking implementation and review

You may be tempted to now just instruct the AI to “Do it!”. Again, the result would be a huge set of changes, and while you are now more familiar with the planned structures after having drafted the plan interactively, it may still be a lot of code to review at once.

My first prompt usually looks like this when I start the actual implementation:

Start implementation by following the plan defined in MyFeature.plan.md. Start with Step 1 and keep the plan file updated as you go. Do not continue with the other steps until I explicitly instruct you to do so.

Instructing the AI to keep the plan file updated allows both you and the AI to easily keep track of where you are in the implementation. When you are working on bigger features, this makes it a lot easier to pick the work up again the next day.
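If the plan uses checkboxes (as in the sketch above), an updated entry after finishing the first step might simply look like this:

## Step 1: Data model (done)
- [x] Add the boolean field `isArchived` (default: false) to the `TodoListItem` entity
- [x] Create the corresponding database migration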

I explicitly let the AI implement each step individually and instruct it to wait before continuing with the next steps. I see multiple benefits to using a step-wise approach in implementation:

Limiting the number of changes to review

When the AI stops implementation after each step and waits for review before it continues, this gives us as developers a well-defined scope in which we can review just the changes related to this single implementation step.

Spotting errors early

If we spot anything that we are not happy with during the review of a single step, we can instruct the AI to fix the issues before it builds upon its mistakes and causes much bigger headaches later on.

Using Source Control

Splitting the implementation into reasonable chunks allows us to commit the reviewed changes for each step. Once we are done with everything and happy with the result, we can still squash the individual commits. But this way we always have savepoints that we can return to if we want to take a break or if anything goes horribly wrong.

Keeping idle time low

If I let the model just do everything at once, it will take some time for the AI to finish. The AIs cannot (yet) work fully independently, and at any point in time, the AI could request additional input, clarification or confirmation. You can tweak this behavior, but ultimately it is not deterministic. Depending on the size of the task, I could go grab a coffee or lunch, or maybe even work on other projects in parallel. But if the AI asked for input from my side while I was away, I have to provide it when I come back and then wait again. And in contrast to AI, we as humans are not that good at switching context frequently.

I have tried out this approach, but have for now decided that I prefer to stay engaged in the implementation process. While the AI is busy with the first step, I can already start reviewing the first incoming changes, speeding up the whole process. This limits the number of files to review at once even more and again allows me to spot problems early on. Interrupting the AI mid-step and then picking the implementation back up has not yet caused issues for me, so feel free to step in and give early feedback.

Finishing implementation

Once you are happy with the implementation of the first step, continue with the next one. Do that until all steps are done and there are no more open todos in the plan file. Having gone through the individual steps with individual reviews, there is no need for a big final review from your side. Any other reviewer will still have to do a big review, but if you did everything right, this will not be different from any review of human-written code.

Depending on your personal preference, you can decide whether to keep the prompt and plan files in the project. They can serve as a form of documentation that the AI can reference later on to better understand the ideas behind certain features and their inner workings.

Further aspects

The workflows described here were developed and used in late 2025. Given how rapidly models and tooling are evolving, everything you read here may already be outdated in a few months. My guess is that the recommendations will become obsolete before 2027, but who knows.

So if you are reading this at some point in the probably not-so-distant future, see it as a snapshot of the past: how we used to work with AI way back in 2025.

There are also many more aspects to AI-assisted software development that I would like to at least quickly address in this section:

MCP Servers

MCP servers are a great way to extend AI capabilities. When working with the previously described prompt, plan and implementation workflow, these MCP servers have proven especially useful:

  • https://github.com/upstash/context7 gives your AI access to documentation for a lot of publicly available packages. This way, the AI can autonomously check the official documentation and understand how to use the library in the intended way. When letting the AI write code that uses third party packages, context7 drastically improves the quality of the first results.
  • https://github.com/ChromeDevTools/chrome-devtools-mcp connects your AI to your Chrome DevTools. The AI can now click through pages, fill forms and understand layout. This basically closes the feedback cycle, allowing the AI to troubleshoot and debug your web-based application by iterating on the code and then autonomously checking the result in the browser.
  • MCP server for your database: By now there are MCP servers for basically every major database provider. This allows the AI to read and manipulate schema and data in your database. Migrations can be generated accurately, because the AI can understand the structure of your database. Most of the available MCP servers allow you to configure the kind of access (read/write, data/schema) that your AI gets, so if you worry about leaking data to the outside world or accidentally wiping your database, just restrict the access.

There are countless MCP servers available that can help improve your workflows. If your AI lacks certain capabilities, chances are high that there is already an MCP server that gives your AI the required tools.

Prompt Refinement and AGENTS.md

Refining the prompt gives you a huge opportunity to improve or tweak AI behavior to your specific needs. When reading about drafting prompts, we often see initial instructions in the style of:

You are a specialist in … and are assisting with …
Your answers are expertly crafted…

…and so on and so on. When using built-in AI in coding environments like VS Code or Cursor, usually the model already receives prebuilt instructions giving the context of being used within a coding environment.

One easy step for enriching the prompt with more project-specific context is a properly configured AGENTS.md file. There are plenty of guides and articles about how to set up this file, and your IDE might even assist you in setting it up. With this file you can provide project-specific context to any AI interacting with the code base. In short, this file could include the following (a small example sketch follows the list):

  • Documentation about the architecture, libraries and frameworks used in the project
  • Information about the business problems that the software tries to solve
  • Coding Guidelines
  • Documentation of development workflows (how to run tests, how to start a development server, …)
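A minimal, purely illustrative AGENTS.md for a project like the TODO example above might look something like this (the stack, commands and paths are assumptions, not recommendations):

# Project Overview

A TODO-list web application: a TypeScript monorepo with a React frontend, a Node.js backend and a PostgreSQL database.

# Coding Guidelines

- Prefer small, focused modules and pure functions where possible
- All user-facing strings go through the translation layer

# Development Workflows

- Run unit tests: npm test
- Start the development server: npm run dev
- End-to-end tests live under e2e/ and run against the development server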

The information that you provide here is always in scope when prompting the AI. So if you encounter recurring topics that the AI has to research again and again, just put a note in the AGENTS.md, or even better, instruct the AI to add a section for that topic to the file itself.

For large monorepos you can define a separate AGENTS.md file per package. The AI automatically uses the nearest file in the directory tree.
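In an illustrative monorepo layout like the following, an agent working on files inside packages/frontend would primarily pick up that package's AGENTS.md:

repo/
├── AGENTS.md (shared architecture and conventions)
└── packages/
    ├── frontend/
    │   ├── AGENTS.md (UI guidelines, how to start the dev server)
    │   └── ...
    └── backend/
        ├── AGENTS.md (API conventions, how to run tests)
        └── ...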

Checkout https://agents.md/ for further details.

IDE and model selection

One thing I haven’t talked about yet is the impact of model and IDE selection. I am mainly using VS Code and GitHub Copilot. There are many other viable IDEs with more or less native AI integration. Cursor is a good example of an IDE where AI is much more deeply integrated, which can simplify certain workflows.

Concerning models, for the last few months I have enjoyed working with the Anthropic models, especially Sonnet 4.5 and Opus 4.5. While a bit more expensive, the capabilities and speed of the Opus model are really worth the coin. Both models are really good at finding their way around big code bases, and the described approach to developing new features works well with either of them.

GPT-5.1 and Gemini 3 produce similar results, but have slightly different styles. My current take on model selection is that the choice of the “best” model heavily depends on your individual project, stack and style. Knowing your tool and how to use it still beats having the perfect tool but being unfamiliar with its quirks.

Background Agents

The most recent developments point towards more autonomous orchestration, where a central model orchestrates sub-agents that work on independent tasks in the background, even in parallel in isolated environments. Support for Background Agents was just introduced to VS Code in December 2025. Right now, one-shot solutions for bigger tasks are still not at a point where I am happy with the result, and thus, for now, I prefer staying close to the action instead of reviewing a lot of changes at once in the end. I am quite sure that this will change with the next generations of models, and I am curious to see where this will take us.

Final thoughts

It is hard to judge how our job as developers and the role of software companies will evolve over the next few years. Our best chance is to stay flexible and open to change, so we are not left behind somewhere along the way.

Through our blog, we at esveo want to share practical insights and real-world experiences with you, as we navigate this change together. Stay curious and happy developing! :-)