computer-use
A GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.
Let AI be your browser operator.
The most reliable AI agent framework that supports MCP.
Open-Source Chrome extension for AI-powered web automation. Run multi-agent workflows using your own LLM API key. Alternative to OpenAI Operator.
c/ua is the Docker Container for Computer-Use AI Agents.
Agent S: an open agentic framework that uses computers like a human
Ui.Vision Open-Source RPA Software with Computer Vision, OCR, Anthropic Computer Use/LLM. Selenium IDE import/export.
Open Source Generative Process Automation (i.e. Generative RPA). AI-First Process Automation with Large ([Language (LLMs) / Action (LAMs) / Multimodal (LMMs)] / Visual Language (VLMs)) Models
[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.
A curated list of resources about AI agents for Computer Use, including research papers, projects, frameworks, and tools.
AI computer use powered by open source LLMs and E2B Desktop Sandbox
An open-source, end-to-end, VLM-based GUI Agent
A fork of Anthropic Computer Use that you can run on Mac computers to give Claude and other AI models autonomous access to your computer.
Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.
Bytebot is the container for desktop agents.
Desktop app powered by Claude’s computer use capability to control your computer
A framework to enable autonomous Android and computer use using any LLM (local or remote)
This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use".