Repo: Qwen (Alibaba Cloud), published Jun 1, 2026

QwenLM/open-computer-use

Swift

Description: MCP-based Computer Use service for Qwen Code and any AI agent — controls macOS, Linux, and Windows via accessibility APIs.

Language: Swift

License: MIT

Stars: 60

Forks: 7

Open issues: 1

Created: 2026-06-01T08:10:48Z

Pushed: 2026-06-10T16:38:07Z

Default branch: main

Fork: no

Archived: no

README:

open-computer-use

---

MCP-based Computer Use service for Qwen Code and any MCP client — controls macOS, Linux, and Windows via accessibility APIs.

Published to npm as `@qwen-code/open-computer-use`.

Demo

https://github.com/user-attachments/assets/cd0d1644-99e5-47fc-b998-c1eb3c1aabff

Quick Start

npm i -g @qwen-code/open-computer-use

On macOS, run it once and grant the `Accessibility` and `Screen Recording` permissions. Windows and Linux do not need this step.

open-computer-use

Add it to your MCP client config:

{
  "mcpServers": {
    "open-computer-use": {
      "command": "open-computer-use",
      "args": ["mcp"]
    }
  }
}
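If the client already has a config with other servers in it, the entry above can be merged in rather than pasted over the file. The following is a minimal sketch of that merge; the helper name `add_server` and the sample "other" server are hypothetical, and where the config file lives depends on the MCP client.

```python
import copy
import json

# Server entry copied from the config snippet above.
SERVER_NAME = "open-computer-use"
SERVER_ENTRY = {"command": "open-computer-use", "args": ["mcp"]}


def add_server(config: dict) -> dict:
    """Return a copy of `config` with the open-computer-use entry added.

    Other top-level keys and other entries under "mcpServers" are kept as-is.
    """
    merged = copy.deepcopy(config)
    servers = merged.setdefault("mcpServers", {})
    servers[SERVER_NAME] = copy.deepcopy(SERVER_ENTRY)
    return merged


if __name__ == "__main__":
    # Hypothetical existing config with one unrelated server.
    existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
    print(json.dumps(add_server(existing), indent=2))
```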

CLI Usage

# Call a single Computer Use tool and print the MCP-style JSON result
open-computer-use call list_apps
open-computer-use call get_app_state --args '{"app":"TextEdit"}'

# Run a sequence in one process so element_index state can be reused
open-computer-use call --calls '[{"tool":"get_app_state","args":{"app":"TextEdit"}},{"tool":"press_key","args":{"app":"TextEdit","key":"Return"}}]'
open-computer-use call --calls-file examples/textedit-overlay-seq.json --sleep 0.5

# Check permissions; onboarding only opens when something is missing
open-computer-use doctor

# Show help
open-computer-use -h
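Hand-writing the JSON for `--calls` gets error-prone once a sequence has more than a couple of steps. Below is a small sketch that builds the same argument list from Python data. The helper name `build_call_argv` is hypothetical; the only CLI pieces it relies on are the `call` subcommand and the `--calls` flag shown above.

```python
import json
import shlex


def build_call_argv(steps):
    """Build argv for `open-computer-use call --calls <json>`.

    `steps` is a list of (tool_name, args_dict) pairs, run in order
    in one process so element_index state can be reused.
    """
    calls = [{"tool": tool, "args": args} for tool, args in steps]
    payload = json.dumps(calls, separators=(",", ":"))
    return ["open-computer-use", "call", "--calls", payload]


if __name__ == "__main__":
    argv = build_call_argv([
        ("get_app_state", {"app": "TextEdit"}),
        ("press_key", {"app": "TextEdit", "key": "Return"}),
    ])
    # shlex.join quotes the JSON so the line can be pasted into a POSIX shell.
    print(shlex.join(argv))
```

Passing the list to `subprocess.run(argv, check=True)` would run the sequence without any shell quoting concerns, assuming `open-computer-use` is on `PATH`.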

Configuration

Image capture (macOS)

The `get_app_state` screenshot and the post-action screenshots attached to every action tool's result can be tuned through environment variables, which are read at capture time. All variables are optional; unset, non-numeric, or out-of-range values fall back to the built-in defaults.
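The fallback rule can be pictured with a short sketch. This is not the project's Swift code: it is a Python illustration that takes its defaults and valid ranges from the table in this section. The names `read_setting` and `load_capture_settings` are hypothetical, and treating `inf` and `nan` as out of range is an assumption.

```python
import math
import os

# Defaults as listed in the table in this section.
DEFAULTS = {
    "OPEN_COMPUTER_USE_IMAGE_CAPTURE_TIMEOUT": 5.0,
    "OPEN_COMPUTER_USE_IMAGE_MAX_BYTES": 900000,
    "OPEN_COMPUTER_USE_IMAGE_MAX_DIMENSION": 1280.0,
    "OPEN_COMPUTER_USE_IMAGE_MIN_SCALE": 0.25,
}


def read_setting(name, parse, is_valid, env=None):
    """Return the parsed value of `name`, or its default if unset or invalid."""
    env = os.environ if env is None else env
    raw = env.get(name)
    if raw is None:
        return DEFAULTS[name]          # unset
    try:
        value = parse(raw.strip())
    except ValueError:
        return DEFAULTS[name]          # non-numeric
    return value if is_valid(value) else DEFAULTS[name]  # out of range


def load_capture_settings(env=None):
    """Read all four variables at once (hypothetical helper)."""
    def positive(v):
        # Assumption: inf and nan count as out of range.
        return math.isfinite(v) and v > 0

    prefix = "OPEN_COMPUTER_USE_IMAGE_"
    return {
        "timeout_s": read_setting(prefix + "CAPTURE_TIMEOUT", float, positive, env),
        "max_bytes": read_setting(prefix + "MAX_BYTES", int, lambda v: v > 0, env),
        "max_dimension": read_setting(prefix + "MAX_DIMENSION", float, positive, env),
        "min_scale": read_setting(prefix + "MIN_SCALE", float, lambda v: 0 < v <= 1, env),
    }
```

Calling `load_capture_settings()` with no argument reads the process environment at call time, which mirrors the "read at capture time" behavior described above.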

| Variable | Default | Meaning |
|---|---|---|
| `OPEN_COMPUTER_USE_IMAGE_CAPTURE_TIMEOUT` | 5 | Seconds to wait for `SCScreenshotManager.captureImage` before giving up. The MCP result still includes the accessibility tree on timeout; only the image block is dropped. Positive float. |
| `OPEN_COMPUTER_USE_IMAGE_MAX_BYTES` | 900000 | Byte budget for the encoded PNG. The downsampler iterates `scale *= 0.85` until the encoded data fits this budget OR `OPEN_COMPUTER_USE_IMAGE_MIN_SCALE` is reached. Positive integer. |
| `OPEN_COMPUTER_USE_IMAGE_MAX_DIMENSION` | 1280 | Long-edge pixel cap for the returned PNG. Initial scale is `min(1, OPEN_COMPUTER_USE_IMAGE_MAX_DIMENSION / largestNativeDimension)`, then clamped up to `OPEN_COMPUTER_USE_IMAGE_MIN_SCALE`. Positive float. |
| `OPEN_COMPUTER_USE_IMAGE_MIN_SCALE` | 0.25 | Floor on the downsample ratio. Neither `MAX_DIMENSION` nor `MAX_BYTES` will shrink below `MIN_SCALE` × native; a `MAX_DIMENSION` that would require less is clamped to this floor (it does not fall back to the full-size original). Lower it for more aggressive sizes. Float in (0, 1]. |
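How the three size constraints interact is easier to see in code. The sketch below follows the rules stated in the table: the initial scale comes from the long-edge cap, it is clamped up to the floor, then shrunk by 0.85 per step until the byte budget is met or the floor is reached. It is an illustration in Python, not the project's implementation. The `encoded_size` callback is a hypothetical stand-in for "encode the PNG at this scale and count the bytes", and clamping the last step exactly to the floor is an assumption.

```python
def initial_scale(native_w, native_h, max_dimension=1280.0, min_scale=0.25):
    """min(1, max_dimension / largest native dimension), clamped up to min_scale."""
    scale = min(1.0, max_dimension / max(native_w, native_h))
    return max(scale, min_scale)


def choose_scale(native_w, native_h, encoded_size, max_bytes=900000,
                 max_dimension=1280.0, min_scale=0.25, step=0.85):
    """Pick a downsample ratio for a native_w x native_h capture.

    encoded_size(scale) -> int is a hypothetical callback returning the
    encoded PNG size in bytes at the given scale.
    """
    scale = initial_scale(native_w, native_h, max_dimension, min_scale)
    # Shrink until the image fits the byte budget or the floor is reached.
    while encoded_size(scale) > max_bytes and scale > min_scale:
        # Assumption: the final step is clamped to the floor, not skipped.
        scale = max(scale * step, min_scale)
    return scale


if __name__ == "__main__":
    # Toy size model: 1.5 bytes per output pixel on a 3024x1964 capture.
    def toy_size(scale):
        return int(3024 * scale * 1964 * scale * 1.5)

    print(choose_scale(3024, 1964, toy_size))
```

With the toy model, the long-edge cap alone leaves the image over budget, so the loop takes further 0.85 steps. With a budget that can never be met, the function returns the floor instead of looping forever.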

Coordinate accuracy is preserved across any downsampling — coordinate tools (click, drag, scroll) read the actual pixel dimensions back from the returned PNG and rescale model-supplied coordinates against the live window bounds.
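The rescaling step amounts to a linear map from image pixels to window coordinates. A minimal sketch of that idea follows. The `Rect` type and `image_to_screen` function are hypothetical names, and the sketch assumes a top-left origin and a PNG that covers exactly the window bounds; the real pipeline is described in docs/IMAGE_CAPTURE.md.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Live window bounds in screen coordinates (top-left origin assumed)."""
    x: float
    y: float
    width: float
    height: float


def image_to_screen(px, py, image_w, image_h, window):
    """Map a pixel in the returned PNG to a screen point inside `window`.

    image_w and image_h are the actual pixel dimensions of the PNG the
    model saw, so the mapping stays correct at any downsample ratio.
    """
    sx = window.width / image_w
    sy = window.height / image_h
    return (window.x + px * sx, window.y + py * sy)


if __name__ == "__main__":
    window = Rect(x=100, y=50, width=1000, height=800)
    # The model clicked the center of a 500x400 screenshot.
    print(image_to_screen(250, 200, 500, 400, window))
```

Because the scale factors are derived from the returned image itself rather than from the configured limits, changing `MAX_DIMENSION` or `MIN_SCALE` does not require any change on the model side.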

These variables only affect macOS today. The Windows and Linux runtimes return native-size PNGs without downsampling.

See [docs/IMAGE_CAPTURE.md](docs/IMAGE_CAPTURE.md) for the full capture → downsample → encode pipeline, the constraint interaction (maxDimension / maxBytes / minScale), coordinate-mapping details, and worked examples.

Acknowledgments

This project is a QwenLM fork of `iFurySt/open-codex-computer-use`. We thank the original author for the foundational work on macOS accessibility-driven computer-use patterns.

Differences from upstream

  • Cross-platform: Added Windows (Go + PowerShell UI Automation) and Linux (Go + Python AT-SPI) runtimes
  • npm distribution: Published as `@qwen-code/open-computer-use` for easy installation
  • MCP server: Full MCP stdio transport with 9 Computer Use tools
  • CLI tools: Added doctor, call, snapshot, list-apps commands for diagnostics and scripting
  • Image capture tuning: Environment variables for screenshot size/quality control
  • Qwen Code skill: Installable skill for Qwen Code agent integration
  • Cursor Motion: Retained in experiments/ but not built or released in CI