Test Driven Development with AI Agents
Learn best practices for combining TDD with AI coding agents like Windsurf and Claude 3.5 Sonnet to automate and streamline your software development pipeline.

What is TDD (Test Driven Development)?
At its core, TDD revolves around a cyclical process known as Red-Green-Refactor. The cycle commences with the “Red” phase, where a developer writes an automated test case that defines a desired feature or behavior. This test is intentionally written to fail initially because the corresponding code does not yet exist. This initial failure is critical as it ensures that the test is indeed testing the intended functionality and can catch errors once the code is implemented.
The subsequent “Green” phase involves writing the minimum amount of code necessary to make the previously failing test pass. This principle encourages developers to focus on the immediate requirement defined by the test, maintain a clean codebase, and prevent over-engineering.
Finally, the “Refactor” phase focuses on improving the structure, readability, and maintainability of both the test code and the production code, all while ensuring that all existing tests continue to pass. Refactoring ensures that the codebase remains healthy and adaptable to future changes without introducing regressions, with the existing test suite acting as a safety net during this phase.
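The cycle can be sketched in a few lines of plain Java. This is only an illustration: `RatingStats` and `RatingStatsTest` are hypothetical names, and hand-written checks stand in for a test framework such as JUnit so the example runs on its own. The test class is written first (Red), the production class is added to make it pass (Green), and the comments mark where a Refactor step would fit.

```java
import java.util.List;

// Green: hypothetical production class, written only after the tests below
// existed and failed (Red).
class RatingStats {
    // Returns the arithmetic mean of the ratings, or 0.0 for an empty list.
    static double average(List<Integer> ratings) {
        if (ratings.isEmpty()) {
            return 0.0;
        }
        // Refactor: this loop could later be rewritten (for example as a stream)
        // as long as the tests below keep passing.
        int sum = 0;
        for (int rating : ratings) {
            sum += rating;
        }
        return (double) sum / ratings.size();
    }
}

// Red: hypothetical tests, written before RatingStats.average was implemented.
class RatingStatsTest {
    static void averageOfThreeRatings() {
        double result = RatingStats.average(List.of(3, 4, 5));
        if (Math.abs(result - 4.0) > 1e-9) {
            throw new AssertionError("expected 4.0 but got " + result);
        }
    }

    static void averageOfNoRatingsIsZero() {
        double result = RatingStats.average(List.of());
        if (result != 0.0) {
            throw new AssertionError("expected 0.0 but got " + result);
        }
    }

    public static void main(String[] args) {
        averageOfThreeRatings();
        averageOfNoRatingsIsZero();
        System.out.println("all tests passed");
    }
}
```

Running `RatingStatsTest` before `average` is implemented fails, which is the signal the Red phase relies on.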
How to Fully Automate your TDD pipeline?
Many factors affect how well AI Agents perform when coding, from the underlying LLM to how you structure your code and your development pipeline. We have found TDD to be effective with Windsurf running Claude 3.5 Sonnet. The following is a sample task implemented with TDD.
What do you need?
Before we start coding, we need the following:
Enough Tests
Make sure you already have tests written in the TDD style, and that they cover as much of the problem’s scope as you consider logical and helpful for the AI Agent. You don’t need to change or customize anything for the sake of the AI Agent, although it helps if your tests and their names follow a standard convention. As its first step, the AI Agent reads these tests to start implementing.
Here is an example of a test I have that tests whether a document has been correctly inserted into MongoDB:
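As a rough sketch of the shape such a test can take (this is not the original test): the class and method names below are hypothetical, and an in-memory map stands in for a real MongoDB collection so the example runs without a database or a test framework.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical stand-in for a MongoDB-backed store. A real test would talk to
// a test database instead of this in-memory map.
class InMemoryDocumentStore {
    private final Map<String, Map<String, Object>> documents = new HashMap<>();

    // Stores a copy of the document and returns the generated id.
    String insert(Map<String, Object> document) {
        String id = UUID.randomUUID().toString();
        documents.put(id, new HashMap<>(document));
        return id;
    }

    // Returns the stored document, or null if the id is unknown.
    Map<String, Object> findById(String id) {
        return documents.get(id);
    }
}

class DocumentInsertTest {
    // The method name states the behavior under test, which gives the AI Agent
    // a clear target to implement against.
    static void insert_storesDocument_soItCanBeReadBackById() {
        InMemoryDocumentStore store = new InMemoryDocumentStore();
        Map<String, Object> document = new HashMap<>();
        document.put("name", "Alice");
        document.put("rating", 5);

        String id = store.insert(document);

        Map<String, Object> stored = store.findById(id);
        if (stored == null || !stored.equals(document)) {
            throw new AssertionError("inserted document was not found unchanged");
        }
    }

    public static void main(String[] args) {
        insert_storesDocument_soItCanBeReadBackById();
        System.out.println("insert test passed");
    }
}
```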

Interfaces
An interface in Java (or the comparable construct in other languages) is a contract that declares which methods implementing classes must provide, without saying how they work. To guide the AI Agent even further, it helps to create an interface for the repository that inserts documents:
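A minimal sketch of such an interface, assuming documents are represented as plain maps. `DocumentRepository`, its two methods, and `InMemoryDocumentRepository` are hypothetical names chosen for illustration, not the ones used in the original project; the in-memory class only shows that tests can target the interface before a MongoDB-backed implementation exists.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical contract the AI Agent is asked to implement.
interface DocumentRepository {
    /** Inserts the document and returns the id under which it was stored. */
    String insert(Map<String, Object> document);

    /** Looks up a previously inserted document by its id. */
    Optional<Map<String, Object>> findById(String id);
}

// Placeholder implementation backed by a map, used here only so the sketch runs.
class InMemoryDocumentRepository implements DocumentRepository {
    private final Map<String, Map<String, Object>> documents = new HashMap<>();
    private int nextId = 1;

    @Override
    public String insert(Map<String, Object> document) {
        String id = "doc-" + nextId++;
        // Store a copy so later changes by the caller do not affect the stored data.
        documents.put(id, new HashMap<>(document));
        return id;
    }

    @Override
    public Optional<Map<String, Object>> findById(String id) {
        return Optional.ofNullable(documents.get(id));
    }
}
```

Because tests are written against `DocumentRepository`, a database-backed class can replace the in-memory one later without changing the tests.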


Specific description of the task
Lastly, we need a specific task description. Usually, you can use JIRA or GitHub issues where you define the task. This is ours:

Start Vibe Coding
Vibe coding means you describe what you need in plain English (or even voice commands), and AI generates the code for you in real time. We go into detail on Vibe Coding in this blog. Here is the prompt I used in Windsurf to complete the task for me:

Implement the following query as a combination of one or more named queries and Java code.
Find all drivers who have completed at least X trips with a rating above 5 in a given date range and have never received a rating below 3 stars.
Note:
You do not need to find a single query to solve this task (you can use a combination of Java code and named queries), but you have to keep ORM-performance in mind, i.e., make sure that your solution is also reasonably fast if you have many entities. During the discussion session you should be able to explain what types of problems can arise with badly written queries.
@Ass1_2_2Test.java#L35-60
these are the corresponding tests. Its implemented in@DriverDAO.java#L34-63
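To make the requirement in the prompt concrete, the filtering rule can be sketched in plain Java over an in-memory list. This is neither the agent's output nor the named-query solution the prompt asks for; in a real DAO the counting and filtering would be pushed into the database to keep ORM performance reasonable. `Trip`, `DriverFilter`, and all parameter names are hypothetical, the date range is assumed to be inclusive, and "never rated below 3" is assumed to apply to all trips, not only those in the range.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical, simplified trip row: driver id, completion date, and rating.
final class Trip {
    final long driverId;
    final LocalDate completedOn;
    final int rating;

    Trip(long driverId, LocalDate completedOn, int rating) {
        this.driverId = driverId;
        this.completedOn = completedOn;
        this.rating = rating;
    }
}

class DriverFilter {
    // Returns the ids of drivers with at least minTrips trips rated above
    // ratingThreshold within [from, to], excluding any driver who has ever
    // received a rating below 3.
    static Set<Long> qualifiedDrivers(List<Trip> trips, int minTrips, int ratingThreshold,
                                      LocalDate from, LocalDate to) {
        Map<Long, Integer> goodTripsInRange = new HashMap<>();
        Set<Long> everRatedBelowThree = new HashSet<>();

        // Single pass: count qualifying trips and remember disqualified drivers.
        for (Trip trip : trips) {
            if (trip.rating < 3) {
                everRatedBelowThree.add(trip.driverId);
            }
            boolean inRange = !trip.completedOn.isBefore(from) && !trip.completedOn.isAfter(to);
            if (inRange && trip.rating > ratingThreshold) {
                goodTripsInRange.merge(trip.driverId, 1, Integer::sum);
            }
        }

        Set<Long> result = new TreeSet<>();
        for (Map.Entry<Long, Integer> entry : goodTripsInRange.entrySet()) {
            if (entry.getValue() >= minTrips && !everRatedBelowThree.contains(entry.getKey())) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

A reference like this can also serve as an oracle when writing tests for the query-based implementation.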
Mentioning the files is the most important aspect when vibe coding. Now sit back and watch the AI Agent do its magic. It implemented the class, ran the tests, and iterated until the tests passed:

Frequently asked questions
- What is Test Driven Development (TDD)?
Test Driven Development (TDD) is a software development approach where automated tests are written before the actual code. The process follows a Red-Green-Refactor cycle: writing a failing test (Red), implementing code to pass the test (Green), and then refactoring code while keeping all tests passing.
- How can AI Agents automate the TDD pipeline?
AI Agents like Windsurf, especially when paired with models such as Claude 3.5 Sonnet, can automate code generation, run tests, and perform iterative improvements, making the TDD process faster and more efficient.
- What are the prerequisites for automating TDD with AI Agents?
To automate TDD with AI Agents, you need a comprehensive set of tests, clearly defined interfaces, and specific task descriptions. Standardized test naming and clear documentation help guide the AI Agent for optimal results.
- What is Vibe Coding?
Vibe Coding is an AI-powered approach where developers describe requirements in plain English (or voice), and the AI generates code in real time, iterating until all tests pass and the solution meets the requirements.