Browser & computer harnesses for AI agents
29 open-source browser and computer harnesses an AI agent can use: MCP servers, SDKs, and adapters. Browse them on Loadbay. An agent can search these over Loadbay's MCP server:
claude mcp add --transport http loadbay https://loadbay.xyz/api/mcp
→ Best Browser & computer harnesses (top picks, ranked)
- browser-use — Makes any website usable by an agent. It drives a real browser to click, type, and finish tasks online, and it is the most-starred browser harness by a wide margin.
- open-interpreter — Lets LLMs run code and control the local computer through a natural-language interface for OS-level automation.
- Chrome DevTools MCP — Official Chrome DevTools MCP server letting an agent control and inspect a live Chrome browser for automation and debugging.
- UI-TARS Desktop — ByteDance's multimodal agent stack (Agent TARS + UI-TARS Desktop) that controls computer and browser operators via natural language.
- playwright-mcp — Microsoft's Playwright MCP server, which lets any agent drive a real browser (navigate, click, fill, assert).
- OmniParser — Screen-parsing tool that converts UI screenshots into structured elements to enable pure vision-based GUI agents.
- stagehand — SDK for browser agents that adds act, extract, and observe primitives on top of Playwright for AI-driven web automation.
- skyvern — Automates browser-based workflows using LLMs and computer vision to operate websites without site-specific scripts.
- cua — Infrastructure for computer-use agents: sandboxes, SDKs, and benchmarks so an agent can drive a whole desktop without escaping it.
- Anthropic computer-use demo — Anthropic's official computer-use reference implementation: a containerized Linux desktop where Claude controls the GUI via screenshots and tool calls.
- web-ui — Browser-based UI for running web-automation agents with support for custom models and persistent browser sessions.
- maxun — No-code platform that turns websites into structured APIs through browser-based scraping and AI data extraction.
- midscene — Vision-driven UI automation that drives web and mobile interfaces from natural language for AI agents.
- nanobrowser — An open-source Chrome extension that runs multi-agent web-automation workflows right in your browser.
- Agent-S — An open framework that lets an agent use a computer the way a person does: read the screen, move the mouse, type.
- BrowserOS — Open-source agentic web browser (Chromium fork) that runs AI agents natively inside the browser.
- Bytebot — A self-hosted AI desktop agent that automates computer tasks inside its own containerized desktop.
- self-operating-computer — A framework that lets a multimodal model operate a computer by looking at the screen and moving the mouse.
- UFO — UI-focused agent that operates Windows applications via natural language using GUI and API actions.
- steel-browser — Open-source browser API and sandbox that lets AI agents automate the web without managing browser infrastructure.
- LaVague — Large Action Model framework that turns natural-language objectives into executable web automation for AI agents.
- Playwright MCP (ExecuteAutomation) — Popular community Playwright MCP server enabling agents to automate browsers and APIs, with screenshots and codegen.
- Open-Interface — Uses LLMs to control any computer by simulating keyboard and mouse input, completing user-specified tasks across apps.
- notte — Framework to build web agents and deploy serverless web automation functions on managed browser infrastructure.
- OpenAdapt — Generative RPA that records desktop screen and input activity and replays it with multimodal models to automate GUI tasks.
- WebArena — Self-hostable realistic web environment (e-commerce, forums, GitLab, CMS) for building and benchmarking autonomous web agents.
- Agent-E — Agent-driven browser automation built on the AutoGen framework for autonomous web task execution via natural language.
- WebVoyager — End-to-end multimodal web agent that completes instructions on live websites using set-of-mark prompting.
- OpenCUA — Open foundation models and framework for computer-use agents, including dataset, benchmark, and end-to-end CUA models.