Stay free if you only need the full open-source RPA core (AGPL) and browser automation for Chrome, Edge, and Firefox. Upgrade if you need all PRO features plus commercial-use rights and commercial-grade OCR engines. Most solo builders can start free.
Why it matters: Browser-extension-based UI feels dated compared to standalone RPA studios like UiPath or Power Automate
Available from: PRO (Personal)
Why it matters: Desktop automation and command-line execution require paid XModules, so advanced use is not fully free
Available from: PRO (Personal)
Why it matters: Documentation is functional but fragmented across manual, forum, and blog; steeper learning curve for non-developers
Available from: PRO (Personal)
Why it matters: No managed cloud orchestration or scheduling; users must build their own runner infrastructure
Available from: PRO (Personal)
Why it matters: Smaller ecosystem of pre-built connectors compared to major commercial RPA vendors
Available from: PRO (Personal)
Yes, the core Ui.Vision RPA browser extension is free and open-source under an AGPL license, and it's used by 150,000+ people. The free tier covers browser automation, visual record/replay, OCR, and CSV-driven testing. Paid PRO and Enterprise licenses unlock advanced features such as faster OCR engines, real desktop automation via XModules, command-line execution without watermarks, and commercial support. You can install it instantly from the Chrome Web Store, Edge Add-ons, or Firefox Add-on Gallery.
Ui.Vision is effectively a superset of Selenium IDE: it supports Selenium-style commands and can import/export Selenium IDE scripts directly. Beyond Selenium, it adds computer vision, OCR, desktop automation, and AI Computer Use, which Selenium IDE lacks. Teams often migrate from Selenium IDE to Ui.Vision to keep their existing test suites while gaining the ability to automate native desktop apps and handle image-based UI elements that DOM selectors can't reach.
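To make "Selenium-style commands" concrete, here is a minimal Python sketch that assembles a small macro as a list of command/target/value steps and serializes it to JSON. The key names (`Name`, `Commands`, `Command`, `Target`, `Value`), the helper `to_macro`, the macro name, and the example URL and locators are all assumptions for illustration, not a confirmed schema; compare the output against a macro exported from your own installation before relying on it.

```python
import json


def to_macro(name, steps):
    """Build a macro dict from (command, target, value) tuples.

    The key names below are an assumed layout for illustration only.
    """
    return {
        "Name": name,
        "Commands": [
            {"Command": command, "Target": target, "Value": value}
            for command, target, value in steps
        ],
    }


# Hypothetical login flow written as Selenium-style steps.
steps = [
    ("open", "https://example.com/login", ""),
    ("type", "id=username", "demo"),
    ("click", "id=submit", ""),
]

macro = to_macro("LoginSketch", steps)
macro_json = json.dumps(macro, indent=2)
print(macro_json)
```

Keeping each step as a plain command/target/value triple is what makes scripts easy to move between Selenium-style tools: the same three fields can be generated, diffed, or version-controlled as ordinary text.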
Ui.Vision ships with built-in support for Anthropic's Claude Computer Use feature, which allows Claude AI to control a computer via screenshots and mouse/keyboard actions. Inside Ui.Vision, you can trigger Claude-driven agents to complete multi-step workflows using natural language instructions instead of explicit commands. This is particularly useful for tasks where the UI changes frequently or scripting every step would be fragile. The integration runs locally alongside Ui.Vision's classic deterministic automation, letting you mix AI and rule-based steps in one macro.
Ui.Vision can automate both. For browser workflows, the extension works natively in Chrome, Edge, and Firefox. For desktop automation on Windows, macOS, and Linux, you install the free XModules companion that grants access to real OS-level mouse/keyboard input, file system access, and screen OCR outside the browser sandbox. This lets you script hybrid workflows, for example, logging into a web app, downloading a file, then processing it in a desktop program.
Yes. Ui.Vision is explicitly designed so that your data never leaves your machine. All scripts, screenshots, OCR processing, and execution happen locally in the browser or via the local XModules. There is no cloud backend for macro storage or execution, which is why the tool is popular in regulated industries like finance, healthcare, and government. For AI Computer Use, calls to Claude are made directly from your machine to Anthropic's API using your own API key.
Start with the free plan and upgrade when you need more.
Last verified March 2026