Quoting Simon Willison - Introducing Showboat and Rodney, so agents can demo what they’ve built

Simon Willison, in the article “Introducing Showboat and Rodney, so agents can demo what they’ve built” (published February 10, 2026):

The frontier models all understand that “red/green TDD” means they should write the test first, run it and watch it fail and then write the code to make it pass - it’s a convenient shortcut.

I find this greatly increases the quality of the code and the likelihood that the agent will produce the right thing with the smallest amount of prompts to guide it.
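
For readers who have not seen the red/green cycle the quote refers to, here is a minimal sketch in Python. It is an illustration only and is not taken from the article: `slugify`, `SlugifyTest`, and `run_tests` are hypothetical names chosen for the example. The test is written first and run against an unimplemented stub so that it fails (red). The implementation is then added and the same test is rerun (green).

```python
import io
import unittest


def slugify(title):
    """Red-phase stub (hypothetical function): deliberately unimplemented."""
    raise NotImplementedError


class SlugifyTest(unittest.TestCase):
    """Step 1: the test is written before the real implementation exists."""

    def test_lowercases_and_hyphenates(self):
        # slugify is looked up at call time, so a later redefinition is picked up.
        self.assertEqual(slugify("Hello World"), "hello-world")


def run_tests():
    """Run SlugifyTest quietly and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTest)
    result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
    return result.wasSuccessful()


# Step 2 (red): run the test and watch it fail against the stub.
red = run_tests()
print(f"red phase passed: {red}")


# Step 3: write just enough code to satisfy the test.
def slugify(title):
    """Lowercase the title and join its words with hyphens."""
    return "-".join(title.lower().split())


# Step 4 (green): the same test now passes.
green = run_tests()
print(f"green phase passed: {green}")
```

Seeing the failure first confirms that the test actually exercises the new behavior. A test that passes before any code is written proves nothing.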