Comments (2)
What's the difference between the following fields?
"generated": 249,
"with_logs": 249,
"applied": 245,
"resolved": 48
from auto-code-rover.
cost_X_Y: X is the budget for running SWE-agent in our experiment, and Y is the trial number of the repetition.
In this case, we used a budget of 2 USD, and repeated the experiment 3 times.
Inside each cost_X_Y directory, the *.traj files are the conversation logs for each task instance in SWE-bench.
all_pred.jsonl includes all the generated patches.
For AutoCodeRover, the acr-run-1, acr-run-2, and acr-run-3 results correspond to the ACR column under "In our environment" in Table 3.
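As a rough illustration of the cost_X_Y naming described above, the following sketch splits a directory name into its budget and trial number. The helper `parse_run_dir`, the `RunDir` type, and the example name `cost_2_1` are hypothetical; the actual directory names in the results may differ.

```python
import re
from typing import NamedTuple


class RunDir(NamedTuple):
    """Hypothetical container for the two parts of a cost_X_Y name."""
    budget_usd: float  # X: budget for the run, in USD
    trial: int         # Y: which repetition of the experiment


# Assumed pattern: "cost_" + budget + "_" + trial number.
_RUN_DIR_PATTERN = re.compile(r"^cost_(?P<budget>\d+(?:\.\d+)?)_(?P<trial>\d+)$")


def parse_run_dir(name: str) -> RunDir:
    """Split a name like 'cost_2_1' into (budget, trial)."""
    match = _RUN_DIR_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a cost_X_Y directory name: {name!r}")
    return RunDir(float(match["budget"]), int(match["trial"]))
```

Under these assumptions, `parse_run_dir("cost_2_1")` would describe the first trial run with a 2 USD budget.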
- generated: an agent-generated patch exists for this issue.
- with_logs: a log file was produced when executing the passing/failing test cases of this issue.
- applied: the patch was applied successfully to the original program.
- resolved: the patch made the passing/failing test cases of this issue pass.
The details on how the stats are generated can be found here: https://github.com/yuntongzhang/SWE-bench/blob/main/metrics/report.py#L264C5-L264C21
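To make the relationship between the four fields concrete, here is a minimal sketch that tallies them from per-instance flags. It is not the logic in report.py; the `InstanceResult` record and the `summarize` helper are hypothetical and only mirror the definitions above.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class InstanceResult:
    """Hypothetical per-issue record; one flag per field described above."""
    instance_id: str
    generated: bool   # an agent-generated patch exists
    with_logs: bool   # a test-execution log file was produced
    applied: bool     # the patch applied cleanly to the original program
    resolved: bool    # the passing/failing test cases all pass


def summarize(results: Iterable[InstanceResult]) -> Dict[str, int]:
    """Count how many instances satisfy each flag."""
    fields = ("generated", "with_logs", "applied", "resolved")
    items = list(results)
    return {name: sum(getattr(r, name) for r in items) for name in fields}
```

With counts like those in the question, the gap between with_logs (249) and applied (245) would be the patches that failed to apply, and resolved (48) would be the subset of applied patches that passed the tests.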