Training Software Engineering Agents and Verifiers with SWE-Gym
Papers with Code
By Javier Vásquez
Posted on: January 01, 2025
**Paper Analysis**
The paper introduces SWE-Gym, a novel environment designed to train real-world software engineering (SWE) agents and verifiers. The authors aim to develop AI-powered tools that can assist humans in software development tasks, such as bug detection, code refactoring, or feature implementation.
**Research Questions:**
The paper addresses the following questions:
1. Can SWE-Gym effectively train language model-based SWE agents?
2. How do these trained agents perform on real-world software engineering benchmarks (SWE-Bench)?
3. Can verifiers trained on agent trajectories sampled from SWE-Gym be used to scale inference-time compute?
**Methodology:**
The authors created SWE-Gym, a dataset containing 2,438 Python task instances, each consisting of:
1. Codebase with an executable runtime environment
2. Unit tests
3. Task specified in natural language
They used this dataset to train language model-based SWE agents and evaluate their performance on the SWE-Bench Verified and Lite test sets.
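To make the structure of a task instance concrete, here is a minimal sketch in Python. The class name, field names, and the `resolve_rate` helper are hypothetical and chosen for illustration; this summary does not describe SWE-Gym's actual schema. The sketch assumes an instance counts as resolved when its unit tests pass after the agent's change, and that the resolve rate is the percentage of resolved instances.

```python
from dataclasses import dataclass, field


@dataclass
class TaskInstance:
    """Hypothetical container for one task instance (not SWE-Gym's real schema)."""

    instance_id: str          # illustrative identifier
    repo_path: str            # codebase with an executable runtime environment
    problem_statement: str    # task specified in natural language
    test_commands: list[str] = field(default_factory=list)  # unit tests to run


def resolve_rate(outcomes: dict[str, bool]) -> float:
    """Return the percentage of instances whose tests passed after the agent's edit.

    `outcomes` maps an instance id to True when the instance was resolved.
    """
    if not outcomes:
        return 0.0
    return 100.0 * sum(outcomes.values()) / len(outcomes)


if __name__ == "__main__":
    task = TaskInstance(
        instance_id="example-1",
        repo_path="/tmp/example-repo",
        problem_statement="Fix the off-by-one error in the pagination helper.",
        test_commands=["pytest tests/test_pagination.py"],
    )
    print(task.instance_id, resolve_rate({"example-1": True, "example-2": False}))
```

Keeping the pass/fail outcomes separate from the task description mirrors the split between the dataset (what to solve) and the evaluation (whether it was solved).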
**Results:**
Fine-tuning on SWE-Gym improved the agents' resolve rate by up to 19% absolute on the SWE-Bench Verified and Lite test sets. Combining the fine-tuned agents with verifiers trained on agent trajectories reached 32.0% on SWE-Bench Verified and 26.0% on SWE-Bench Lite, which the authors report as a new state of the art for open-weight SWE agents.
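One common way to use a verifier at inference time is best-of-n selection: sample several candidate trajectories for the same task, score each with the verifier, and keep the highest-scoring one. The sketch below illustrates that idea only. The function name, the string representation of a trajectory, and the toy scoring function are assumptions for illustration, not the authors' implementation.

```python
from typing import Callable, Sequence


def best_of_n(
    trajectories: Sequence[str],
    verifier_score: Callable[[str], float],
) -> str:
    """Return the candidate trajectory that the verifier scores highest.

    `trajectories` holds n sampled attempts at the same task, and
    `verifier_score` stands in for a trained verifier that estimates
    how likely an attempt is to resolve the task.
    """
    if not trajectories:
        raise ValueError("need at least one trajectory to choose from")
    return max(trajectories, key=verifier_score)


if __name__ == "__main__":
    # Toy verifier (an assumption): prefers attempts that mention running tests.
    def toy_score(trajectory: str) -> float:
        return 1.0 if "ran tests" in trajectory else 0.0

    candidates = [
        "edited file, stopped",
        "edited file, ran tests, all passed",
        "no changes made",
    ]
    print(best_of_n(candidates, toy_score))
```

Under this scheme, sampling more trajectories gives the verifier more candidates to choose from, which is how extra inference-time compute can translate into a higher resolve rate.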
**Significance:**
This paper's contributions are twofold:
1. **SWE-Gym**: The authors provide a comprehensive environment for training SWE agents, which can be used to develop AI-powered tools for software development.
2. **Improved performance**: By fine-tuning SWE agents and adding verifiers, the authors raise resolve rates on the SWE-Bench Verified and Lite test sets by up to 19% absolute.
**Potential Use Cases:**
1. **Automated bug detection**: Trained SWE agents can assist in identifying bugs or anomalies in software code, reducing manual testing efforts.
2. **Code refactoring**: AI-powered tools can help developers refactor code to improve maintainability, readability, and efficiency.
3. **Feature implementation**: SWE agents can aid in implementing new features by suggesting suitable design patterns, APIs, or libraries.
**Link:**
You can access the paper on Papers with Code:
https://paperswithcode.com/paper/training-software-engineering-agents-and
This paper presents a significant advancement in the field of AI-powered software engineering. By providing SWE-Gym and demonstrating improved performance on real-world benchmarks, the authors pave the way for further research and development of AI-powered tools that can assist humans in software development tasks.