Debug-gym: Can AI agents lighten developers' debugging load?
Developers spend a lot of time debugging code. Learn how debug-gym can equip AI agents to help, enabling them to set breakpoints, navigate the codebase, and print runtime variable values on demand, so they better understand the code and its execution flow.
The Growing Role of AI in Coding
With the rise of AI coding tools like GitHub Copilot, developers are increasingly relying on AI to generate code. GitHub CEO Thomas Dohmke predicted that "sooner than later, 80% of the code is going to be written by Copilot." The trend is already visible in startups: for a quarter of the companies in Y Combinator's latest batch, 95% of the code was written by large language models (LLMs).
However, most developers spend the majority of their time debugging code, not writing it. This raises an important question: Can AI tools also assist in debugging?
Introducing Debug-gym
Microsoft Research has developed debug-gym, an environment that equips AI agents with interactive debugging tools. Unlike traditional AI coding tools, debug-gym allows agents to:
- Set breakpoints
- Navigate codebases
- Print runtime variable values
- Create test functions
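To make "printing runtime variable values on demand" concrete, here is a minimal sketch that drives Python's standard pdb debugger with text commands and captures its text output, roughly the kind of exchange an agent could have with a debugger. It is an illustration only and not debug-gym's actual interface: the names buggy_mean and debug_call are hypothetical, and the sketch inspects variables at the first line of the function rather than setting a breakpoint.

```python
import io
import pdb


def buggy_mean(values):
    """Hypothetical buggy function: divides by the wrong count."""
    total = 0
    for v in values:
        total += v
    return total / (len(values) - 1)  # bug: should divide by len(values)


def debug_call(func, args, commands):
    """Run func(*args) under pdb, feeding it text commands.

    Returns the function's result and the debugger's text transcript.
    This helper is an assumption for illustration, not a debug-gym API.
    """
    stdin = io.StringIO("\n".join(commands) + "\n")
    stdout = io.StringIO()
    # Passing stdin/stdout makes pdb read commands from, and write to, text buffers.
    debugger = pdb.Pdb(stdin=stdin, stdout=stdout)
    result = debugger.runcall(func, *args)
    return result, stdout.getvalue()


if __name__ == "__main__":
    result, transcript = debug_call(
        buggy_mean,
        ([2, 4, 6],),
        ["p values", "p len(values) - 1", "continue"],
    )
    print(transcript)
    print("returned:", result)
```

The transcript shows the divisor evaluating to 2 for a three-element list, which is the kind of runtime evidence that points to the bug before any fix is proposed.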
Key Features of Debug-gym
- Repository-level information: Agents can navigate and edit files across the entire repository.
- Robust and safe: Code runs in sandboxed Docker containers to prevent harmful actions.
- Extensible: Easily add new tools to the environment.
- Text-based: Compatible with modern LLM-based agents.
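The "extensible" and "text-based" properties can be pictured as a registry of tools that each take a text argument and return a text observation. The sketch below is a hypothetical design written for this article, not debug-gym's real API; ToolRegistry, the view tool, and the in-memory FILES mapping are all assumptions.

```python
from typing import Callable, Dict

ToolFn = Callable[[str], str]

# Hypothetical in-memory "repository" so the example is self-contained.
FILES: Dict[str, str] = {"app.py": "x = 1\nprint(x)"}


class ToolRegistry:
    """Maps tool names to functions that take text and return text."""

    def __init__(self) -> None:
        self._tools: Dict[str, ToolFn] = {}

    def register(self, name: str) -> Callable[[ToolFn], ToolFn]:
        """Decorator that adds a new tool under the given name."""
        def decorator(fn: ToolFn) -> ToolFn:
            self._tools[name] = fn
            return fn
        return decorator

    def dispatch(self, command: str) -> str:
        """Parse '<tool> <argument>' and return the tool's text output."""
        name, _, argument = command.strip().partition(" ")
        tool = self._tools.get(name)
        if tool is None:
            return f"error: unknown tool '{name}'"
        try:
            return tool(argument)
        except Exception as exc:  # report failures as text instead of crashing
            return f"error: {type(exc).__name__}: {exc}"


registry = ToolRegistry()


@registry.register("view")
def view(path: str) -> str:
    """Return the file's contents with line numbers."""
    lines = FILES[path].splitlines()
    return "\n".join(f"{i}: {line}" for i, line in enumerate(lines, start=1))


if __name__ == "__main__":
    print(registry.dispatch("view app.py"))
    print(registry.dispatch("view missing.py"))
    print(registry.dispatch("rewrite app.py"))
```

Because every tool reads and writes plain text, adding one is a matter of registering another function, and any LLM that can emit a command string can use it.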
Early Results and Future Work
Initial experiments show promising results. While current AI tools struggle with complex debugging tasks, debug-gym-enabled agents show significant improvement. For example, on the SWE-bench Lite benchmark, agents with debugging tools performed better than those without.
Next Steps
Microsoft plans to:
- Fine-tune LLMs for interactive debugging.
- Develop specialized data for training debugging agents.
- Open-source debug-gym to encourage community collaboration.
Conclusion
Debug-gym represents a significant step forward in AI-assisted debugging. By enabling AI agents to interactively seek information and propose fixes, it has the potential to drastically reduce developers' debugging workload. For more details, check out the technical report and GitHub repository.
About the Author

David Chen
AI Startup Analyst
Senior analyst focusing on AI startup ecosystem with 11 years of venture capital and startup analysis experience. Former member of Sequoia Capital AI investment team, now independent analyst writing AI startup and investment analysis articles for Forbes, Harvard Business Review and other publications.