
Training Software Engineering Agents and Verifiers with SWE-Gym


By Javier Vásquez

Posted on: January 01, 2025


**Paper Analysis**

The paper introduces SWE-Gym, an environment for training real-world software engineering (SWE) agents and verifiers. The authors' broader aim is to develop AI-powered tools that assist humans in software development tasks such as bug detection, code refactoring, and feature implementation.

**Research Questions:**

The paper addresses the following questions:

1. Can SWE-Gym effectively train language model-based SWE agents?

2. How do these trained agents perform on real-world software engineering benchmarks (SWE-Bench)?

3. Can verifiers trained on agent trajectories sampled from SWE-Gym be used to scale inference-time compute?

**Methodology:**

The authors created SWE-Gym, an environment containing 2,438 Python task instances, each consisting of:

1. A codebase with an executable runtime environment

2. Unit tests

3. A task specified in natural language

They used this dataset to train language model-based SWE agents and evaluate their performance on the SWE-Bench Verified and Lite test sets.
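To make the methodology concrete, here is a minimal sketch of the kind of record a task instance could map to. The class, field names, and `is_resolved` helper are hypothetical illustrations for this write-up, not the dataset's actual schema or the authors' code; the only part grounded in the paper is that each instance bundles a codebase with a runtime, unit tests, and a natural-language task description.

```python
# Hypothetical sketch of a SWE-Gym task instance; names are illustrative,
# not the dataset's actual schema.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TaskInstance:
    repo: str                  # repository the task is drawn from
    base_commit: str           # commit at which the executable runtime is built
    problem_statement: str     # the task specified in natural language
    test_commands: List[str] = field(default_factory=list)  # unit tests that define success


def is_resolved(instance: TaskInstance, test_results: Dict[str, bool]) -> bool:
    """A task counts as resolved when every associated unit test passes
    after the agent's proposed patch is applied."""
    return all(test_results.get(cmd, False) for cmd in instance.test_commands)
```

Under this framing, an agent resolves an instance when its proposed patch makes the associated unit tests pass, which is the quantity that the resolve rate reported below measures.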

**Results:**

The trained SWE agents achieve absolute gains of up to 19% in resolve rate on both SWE-Bench test sets. Combining the fine-tuned agents with verifiers trained on sampled agent trajectories yields new state-of-the-art results of 32.0% on SWE-Bench Verified and 26.0% on SWE-Bench Lite.
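The verifier-based inference-time scaling can be pictured as best-of-n selection: the fine-tuned agent samples several candidate trajectories for a task, and the trained verifier scores them so that the most promising one is submitted. The sketch below illustrates that idea under assumed interfaces; `agent.sample_trajectory` and `verifier.score` are hypothetical stand-ins, not the paper's API.

```python
# Hypothetical sketch of verifier-guided best-of-n selection at inference time.
# `agent.sample_trajectory` and `verifier.score` are illustrative stand-ins.
from typing import Any, List


def best_of_n(agent: Any, verifier: Any, task: Any, n: int = 8) -> Any:
    """Sample n candidate trajectories and keep the one the verifier scores highest."""
    trajectories: List[Any] = [agent.sample_trajectory(task) for _ in range(n)]
    scores = [verifier.score(task, trajectory) for trajectory in trajectories]
    best_index = max(range(n), key=lambda i: scores[i])
    return trajectories[best_index]
```

In this setup, raising `n` trades additional inference-time compute for a better chance that at least one sampled trajectory resolves the task, with the verifier deciding which trajectory to keep.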

**Significance:**

This paper's contributions are twofold:

1. **SWE-Gym**: The authors provide a comprehensive environment for training SWE agents, which can be used to develop AI-powered tools for software development.

2. **Improved performance**: By fine-tuning SWE agents and using verifiers, the authors demonstrate significant improvements in resolve rate on real-world software engineering benchmarks.

**Potential Use Cases:**

1. **Automated bug detection**: Trained SWE agents can assist in identifying bugs or anomalies in software code, reducing manual testing efforts.

2. **Code refactoring**: AI-powered tools can help developers refactor code to improve maintainability, readability, and efficiency.

3. **Feature implementation**: SWE agents can aid in implementing new features by suggesting suitable design patterns, APIs, or libraries.

**Link:**

You can access the paper on Papers with Code:

https://paperswithcode.com/paper/training-software-engineering-agents-and

This paper presents a significant advancement in the field of AI-powered software engineering. By providing SWE-Gym and demonstrating improved performance on real-world benchmarks, the authors pave the way for further research and development of AI-powered tools that can assist humans in software development tasks.