BuildArena
A benchmark in which LLM agents design, build, and test rockets, cars, and bridges in a physics simulator, working from text goals.
Connects to: Besiege · Python · Other · 94★
Use it with an AI agent
Loadbay is available as an MCP server, so an agent can search the catalog and find this harness. To register it with the Claude Code CLI:
claude mcp add --transport http loadbay https://loadbay.xyz/api/mcp
- Source: https://github.com/AI4Science-WestlakeU/BuildArena
- This harness as JSON: /api/harnesses/buildarena
- Agent setup: /setup.md
- Browse all 370+ harnesses on Loadbay
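
The JSON link above can also be read by a script instead of an agent. The following is a minimal sketch, assuming the relative path `/api/harnesses/buildarena` resolves against the same host as the MCP endpoint (`https://loadbay.xyz`) and that the endpoint returns a JSON document. The helper names `harness_json_url` and `fetch_harness` are hypothetical, and the response schema is not documented on this page, so the sketch does not assume any particular fields.

```python
import json
from typing import Any
from urllib.parse import urljoin
from urllib.request import urlopen

# Assumption: relative links on this page resolve against the MCP endpoint's host.
BASE_URL = "https://loadbay.xyz"


def harness_json_url(slug: str, base_url: str = BASE_URL) -> str:
    """Build the absolute URL of a harness's JSON record (hypothetical helper)."""
    return urljoin(base_url, f"/api/harnesses/{slug}")


def fetch_harness(slug: str, timeout: float = 10.0) -> Any:
    """Download and decode the JSON record. Requires network access."""
    with urlopen(harness_json_url(slug), timeout=timeout) as response:
        # json.load accepts a binary file-like object and decodes it.
        return json.load(response)


# Building the URL needs no network access.
print(harness_json_url("buildarena"))
```

Calling `fetch_harness("buildarena")` would perform the actual request. Inspect the returned structure before relying on any key, since its layout is not described here.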