sarvamai/computer_use_agents
forked from morph-labs/openai-cua-sample-app
Captured source
source ↗sarvamai/computer_use_agents
Description: Computer Use Agents using OpenAI models and various VM instrastructure
Language: Python
License: MIT
Stars: 4
Forks: 2
Open issues: 0
Created: 2025-05-15T06:10:57Z
Pushed: 2025-05-20T09:02:50Z
Default branch: main
Fork: yes
Parent repository: morph-labs/openai-cua-sample-app
Archived: no
README:
Computer Using Agent (CUA) with Morph Cloud Integration
Computer Using Agents with Infinibranch
This fork enhances the OpenAI Computer Using Agent sample app with Morph Cloud integration, enabling powerful new capabilities for agents through environment snapshotting and Infinibranch technology.
Key Contributions
- Autonomous Agent Implementation: New autonomous agent capabilities for self-directed task completion
- MorphComputer: Custom computer implementation that interfaces with Morph Cloud
- Branching Agent: Agent that can create multiple branches from environment snapshots
- Branching Example: Practical demonstration of Infinibranch technology in action
Key Benefits
- Instant Environment Access: Jump directly into pre-configured environments without waiting for setup
- Infinibranch Technology: Create multiple branches from a single snapshot for parallel exploration
- Persistent Environments: Save agent state and resume work without losing progress
- Infrastructure Simplification: Eliminates the need to manage Docker containers locally
- Remote Debian Desktop: Access a fully-featured Linux environment in the cloud
Getting Started with Morph Cloud
To use Morph Cloud with the CUA sample app:
# Set up Python environment with conda conda create -n manas python=3.11 conda activate manas pip install -r requirements.txt # Install Morph Cloud and set API key pip install morphcloud export MORPH_API_KEY=your_api_key_here # Run with Morph Cloud python cli.py --input "Open tokyo wikipedia page" --storage-folder ./trajectory --computer morph
Exploring Infinibranch Capabilities
Try our branching example to see how you can interactively create multiple agent exploration paths from a single environment snapshot:
python examples/branching_agent_example.py
This demonstrates how an agent can pursue different strategies in parallel by creating branches from a snapshot, then compare results across branches.
Computer Using Agent Sample App
Get started building a Computer Using Agent (CUA) with the OpenAI API.
> [!CAUTION] > Computer use is in preview. Because the model is still in preview and may be susceptible to exploits and inadvertent mistakes, we discourage trusting it in authenticated environments or for high-stakes tasks.
Set Up & Run
Set up python env and install dependencies.
python3 -m venv env source env/bin/activate pip install -r requirements.txt
Run CLI to let CUA use a local browser window, using playwright. (Stop with CTRL+C)
python cli.py --computer local-playwright
Other included sample [computer environments](#computer-environments):
- Docker (containerized desktop)
- Browserbase (remote browser, requires account)
- Scrapybara (remote browser or computer, requires account)
- Morph (remote desktop, requires account)
- ...or implement your own
Computer!
Overview
The computer use tool and model are available via the Responses API. At a high level, CUA will look at a screenshot of the computer interface and recommend actions. Specifically, it sends computer_call(s) with actions like click(x,y) or type(text) that you have to execute on your environment, and then expects screenshots of the outcomes.
You can learn more about this tool in the Computer use guide.
Abstractions
This repository defines two lightweight abstractions to make interacting with CUA agents more ergonomic. Everything works without them, but they provide a convenient separation of concerns.
| Abstraction | File | Description | | ----------- | ----------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | Computer | computers/computer.py | Defines a Computer interface for various environments (local desktop, remote browser, etc.). An implementation of Computer is responsible for executing any computer_action sent by CUA (clicks, etc). | | Agent | agent/agent.py | Simple, familiar agent loop – implements run_full_turn(), which just keeps calling the model until all computer actions and function calls are handled. |
CLI Usage
The CLI (cli.py) is the easiest way to get started with CUA. It accepts the following arguments:
--computer: The computer environment to use. See the [Computer Environments](#computer-environments) section below for options. By default, the CLI will use thelocal-playwrightenvironment.--input: The initial input to the agent (optional: the CLI will prompt you for input if not provided)--debug: Enable debug mode.--show: Show images (screenshots) during the execution.--start-url: Start the browsing session with a specific URL (only for browser environments). By default, the CLI will start the browsing session withhttps://bing.com.
Run examples (optional)
The examples folder contains more examples of how to use CUA.
python -m examples.weather_example
For reference, the file simple_cua_loop.py implements the basics of the CUA loop.
You can run it with:
python simple_cua_loop.py
Computer Environments
CUA can work with any Computer environment that can handle the CUA actions:
| Action | Example | | ---------------------------------- | ------------------------------- | | click(x, y, button="left") | `click(24,…
Excerpt shown — open the source for the full document.
Notability
notability 2.0/10Low-traction fork of an existing repo