Fork · Sarvam AI · published May 15, 2025

sarvamai/computer_use_agents

forked from morph-labs/openai-cua-sample-app

Description: Computer Use Agents using OpenAI models and various VM infrastructure

Language: Python

License: MIT

Stars: 4

Forks: 2

Open issues: 0

Created: 2025-05-15T06:10:57Z

Pushed: 2025-05-20T09:02:50Z

Default branch: main

Fork: yes

Parent repository: morph-labs/openai-cua-sample-app

Archived: no

README:

Computer Using Agent (CUA) with Morph Cloud Integration

Pokemon Infinibranch

Computer Using Agents with Infinibranch

This fork enhances the OpenAI Computer Using Agent sample app with Morph Cloud integration, enabling powerful new capabilities for agents through environment snapshotting and Infinibranch technology.

Apply for early access here

Key Contributions

  • Autonomous Agent Implementation: New autonomous agent capabilities for self-directed task completion
  • MorphComputer: Custom computer implementation that interfaces with Morph Cloud
  • Branching Agent: Agent that can create multiple branches from environment snapshots
  • Branching Example: Practical demonstration of Infinibranch technology in action

Key Benefits

  • Instant Environment Access: Jump directly into pre-configured environments without waiting for setup
  • Infinibranch Technology: Create multiple branches from a single snapshot for parallel exploration
  • Persistent Environments: Save agent state and resume work without losing progress
  • Infrastructure Simplification: Eliminates the need to manage Docker containers locally
  • Remote Debian Desktop: Access a fully-featured Linux environment in the cloud

Getting Started with Morph Cloud

To use Morph Cloud with the CUA sample app:

# Set up Python environment with conda
conda create -n manas python=3.11
conda activate manas
pip install -r requirements.txt

# Install Morph Cloud and set API key

pip install morphcloud
export MORPH_API_KEY=your_api_key_here

# Run with Morph Cloud
python cli.py --input "Open tokyo wikipedia page" --storage-folder ./trajectory --computer morph

Exploring Infinibranch Capabilities

Try our branching example to see how you can interactively create multiple agent exploration paths from a single environment snapshot:

python examples/branching_agent_example.py

This demonstrates how an agent can pursue different strategies in parallel by creating branches from a snapshot and then comparing the results across branches.
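The snapshot-and-branch idea can be illustrated with a toy, in-memory model. This is only a sketch of the semantics, not the Morph Cloud API: `ToyEnvironment`, `snapshot`, `act`, and `branch` are hypothetical names invented for this example, and a deep copy stands in for a real environment snapshot.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class ToyEnvironment:
    """Hypothetical stand-in for a remote desktop's state."""
    history: list[str] = field(default_factory=list)

    def snapshot(self) -> "ToyEnvironment":
        # A snapshot here is just an independent deep copy of the state.
        return copy.deepcopy(self)

    def act(self, step: str) -> None:
        self.history.append(step)


def branch(snap: ToyEnvironment, strategies: dict[str, list[str]]) -> dict[str, ToyEnvironment]:
    """Run each strategy on its own copy of the same snapshot."""
    results = {}
    for name, steps in strategies.items():
        env = snap.snapshot()  # every branch starts from identical state
        for step in steps:
            env.act(step)
        results[name] = env
    return results


base = ToyEnvironment()
base.act("open browser")
snap = base.snapshot()

branches = branch(snap, {
    "search": ["type query", "press enter"],
    "direct": ["enter URL"],
})
for name, env in branches.items():
    print(name, env.history)
```

Each branch shares the steps taken before the snapshot but diverges afterwards, and the snapshot itself is left untouched, which is what makes side-by-side comparison possible.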

Computer Using Agent Sample App

Get started building a Computer Using Agent (CUA) with the OpenAI API.

> [!CAUTION]
> Computer use is in preview. Because the model is still in preview and may be susceptible to exploits and inadvertent mistakes, we discourage trusting it in authenticated environments or for high-stakes tasks.

Set Up & Run

Set up a Python environment and install the dependencies.

python3 -m venv env
source env/bin/activate
pip install -r requirements.txt

Run the CLI to let CUA use a local browser window via Playwright (stop with CTRL+C).

python cli.py --computer local-playwright

Other included sample [computer environments](#computer-environments):

  • Docker (containerized desktop)
  • Browserbase (remote browser, requires account)
  • Scrapybara (remote browser or computer, requires account)
  • Morph (remote desktop, requires account)
  • ...or implement your own Computer!

Overview

The computer use tool and model are available via the Responses API. At a high level, CUA will look at a screenshot of the computer interface and recommend actions. Specifically, it sends computer_call(s) with actions like click(x,y) or type(text) that you have to execute on your environment, and then expects screenshots of the outcomes.
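The "execute the action, then return a screenshot" step can be sketched as follows. This is a minimal illustration under stated assumptions: the action is modelled as a plain dict with a `type` key plus keyword arguments, and `RecordingComputer` and `execute_action` are hypothetical names, not part of this repository or the OpenAI SDK.

```python
class RecordingComputer:
    """Hypothetical fake environment that logs calls instead of acting."""

    def __init__(self) -> None:
        self.calls = []

    def click(self, x: int, y: int, button: str = "left") -> None:
        self.calls.append(("click", x, y, button))

    def type(self, text: str) -> None:
        self.calls.append(("type", text))

    def screenshot(self) -> bytes:
        # A real environment would return image data here.
        return b"fake-png-bytes"


def execute_action(computer, action: dict) -> bytes:
    """Run one action on the computer and return a screenshot of the outcome."""
    # Assumed shape: {"type": "<method name>", ...keyword arguments}
    args = {key: value for key, value in action.items() if key != "type"}
    method = getattr(computer, action["type"], None)
    if not callable(method):
        raise ValueError(f"unsupported action: {action['type']}")
    method(**args)
    return computer.screenshot()


computer = RecordingComputer()
shot = execute_action(computer, {"type": "click", "x": 24, "y": 150, "button": "left"})
execute_action(computer, {"type": "type", "text": "hello"})
print(computer.calls)
```

Dispatching on the action's `type` keeps the executor independent of any particular environment: anything exposing matching methods can be driven the same way.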

You can learn more about this tool in the Computer use guide.

Abstractions

This repository defines two lightweight abstractions to make interacting with CUA agents more ergonomic. Everything works without them, but they provide a convenient separation of concerns.

| Abstraction | File | Description |
| ----------- | ----------------------- | ----------- |
| Computer | computers/computer.py | Defines a Computer interface for various environments (local desktop, remote browser, etc.). An implementation of Computer is responsible for executing any computer_action sent by CUA (clicks, etc). |
| Agent | agent/agent.py | Simple, familiar agent loop – implements run_full_turn(), which just keeps calling the model until all computer actions and function calls are handled. |
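To make the separation of concerns concrete, here is a rough sketch of how the two abstractions might fit together. It is not the code in `computers/computer.py` or `agent/agent.py`: the `Computer` protocol below lists only three assumed methods, the item dictionaries are a simplified, hypothetical shape, the model is replaced by a scripted function, and function calls are not handled.

```python
from typing import Callable, Protocol


class Computer(Protocol):
    """Hypothetical minimal interface for an environment."""

    def click(self, x: int, y: int, button: str = "left") -> None: ...
    def type(self, text: str) -> None: ...
    def screenshot(self) -> bytes: ...


class LoggingComputer:
    """Fake environment that records actions instead of performing them."""

    def __init__(self) -> None:
        self.log = []

    def click(self, x: int, y: int, button: str = "left") -> None:
        self.log.append(("click", x, y, button))

    def type(self, text: str) -> None:
        self.log.append(("type", text))

    def screenshot(self) -> bytes:
        return b"fake-png-bytes"


def run_full_turn(model: Callable[[list], list], computer: Computer, items: list) -> list:
    """Keep calling the model until it stops requesting computer actions."""
    items = list(items)
    while True:
        new_items = model(items)
        items += new_items
        calls = [item for item in new_items if item["type"] == "computer_call"]
        if not calls:
            return items
        for call in calls:
            action = dict(call["action"])
            getattr(computer, action.pop("type"))(**action)
            # Report the outcome back so the model sees the new screen.
            items.append({"type": "computer_call_output", "screenshot": computer.screenshot()})


def scripted_model(items: list) -> list:
    # Stand-in for the real model: click once, then reply with a message.
    if not any(item["type"] == "computer_call_output" for item in items):
        return [{"type": "computer_call", "action": {"type": "click", "x": 10, "y": 20}}]
    return [{"type": "message", "content": "done"}]


computer = LoggingComputer()
transcript = run_full_turn(scripted_model, computer, [{"type": "message", "content": "click the button"}])
print([item["type"] for item in transcript])
```

Because the loop only depends on the protocol, swapping a local browser for a remote desktop would not require changes to the agent side.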

CLI Usage

The CLI (cli.py) is the easiest way to get started with CUA. It accepts the following arguments:

  • --computer: The computer environment to use. See the [Computer Environments](#computer-environments) section below for options. By default, the CLI will use the local-playwright environment.
  • --input: The initial input to the agent (optional: the CLI will prompt you for input if not provided)
  • --debug: Enable debug mode.
  • --show: Show images (screenshots) during the execution.
  • --start-url: Start the browsing session with a specific URL (only for browser environments). By default, the CLI will start the browsing session with https://bing.com.
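The flags above could be declared with `argparse` roughly like this. It is a sketch based only on the descriptions in this list, not a copy of `cli.py`: treating `--debug` and `--show` as boolean switches is an assumption, and `build_parser` is a hypothetical helper name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Sketch of the documented CUA CLI flags")
    parser.add_argument("--computer", default="local-playwright",
                        help="Computer environment to use")
    parser.add_argument("--input", default=None,
                        help="Initial input to the agent (prompted for if omitted)")
    # Assumed to be on/off switches; the README does not say how they are parsed.
    parser.add_argument("--debug", action="store_true", help="Enable debug mode")
    parser.add_argument("--show", action="store_true", help="Show screenshots during execution")
    parser.add_argument("--start-url", default="https://bing.com",
                        help="Starting URL (browser environments only)")
    return parser


args = build_parser().parse_args(["--computer", "morph", "--input", "Open tokyo wikipedia page"])
print(args.computer, args.input, args.start_url)
```

Note that `argparse` exposes `--start-url` as `args.start_url`, replacing the hyphen with an underscore.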

Run examples (optional)

The examples folder contains more examples of how to use CUA.

python -m examples.weather_example

For reference, the file simple_cua_loop.py implements the basics of the CUA loop.

You can run it with:

python simple_cua_loop.py

Computer Environments

CUA can work with any Computer environment that can handle the CUA actions:

| Action | Example |
| ---------------------------------- | ------------------------------- |
| click(x, y, button="left") | `click(24,…
