Master Ui.Vision RPA with our step-by-step tutorial, detailed feature walkthrough, and expert tips.
Explore the key features that make Ui.Vision RPA powerful for automation workflows.
The core Ui.Vision RPA browser extension is free and open-source under an AGPL license, and it's used by 150,000+ people. The free tier covers browser automation, visual record/replay, OCR, and CSV-driven testing. Paid PRO and Enterprise licenses unlock advanced features such as faster OCR engines, real desktop automation via XModules, command-line execution without watermarks, and commercial support. You can install it directly from the Chrome Web Store, Edge Add-ons, or the Firefox Add-ons site.
Ui.Vision is effectively a superset of Selenium IDE: it supports Selenium-style commands and can import/export Selenium IDE scripts directly. Beyond Selenium, it adds computer vision, OCR, desktop automation, and AI Computer Use, which Selenium IDE lacks. Teams often migrate from Selenium IDE to Ui.Vision to keep their existing test suites while gaining the ability to automate native desktop apps and handle image-based UI elements that DOM selectors can't reach.
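To make the "Selenium-style commands" point concrete, here is a minimal Python sketch that assembles a three-step macro as JSON. It assumes the macro file is a JSON object with "Name", "CreationDate", and "Commands" fields, where each command is a Command/Target/Value triple; that shape, the date format, the example URL, and the locators are all illustrative assumptions, so compare the output against a macro exported from your own installation before importing it.

```python
import json
from datetime import date


def make_macro(name, steps):
    """Build a macro dict from (command, target, value) tuples.

    The field names below are an assumption about the export format,
    not a documented contract; verify them against a real export.
    """
    return {
        "Name": name,
        "CreationDate": date.today().isoformat(),
        "Commands": [
            {"Command": command, "Target": target, "Value": value}
            for command, target, value in steps
        ],
    }


# Hypothetical login flow; the URL and locators are placeholders.
steps = [
    ("open", "https://example.com/login", ""),
    ("type", "id=username", "demo-user"),
    ("click", "id=submit", ""),
]

macro = make_macro("LoginSmokeTest", steps)
macro_json = json.dumps(macro, indent=2)
print(macro_json)
```

Generating macros from a plain list of tuples keeps the steps easy to review and diff, which can help when porting an existing Selenium IDE suite one test at a time.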
Ui.Vision ships with built-in support for Anthropic's Claude Computer Use feature, which allows Claude AI to control a computer via screenshots and mouse/keyboard actions. Inside Ui.Vision, you can trigger Claude-driven agents to complete multi-step workflows using natural language instructions instead of explicit commands. This is particularly useful for tasks where the UI changes frequently or scripting every step would be fragile. The integration runs locally alongside Ui.Vision's classic deterministic automation, letting you mix AI and rule-based steps in one macro.
Ui.Vision can automate both browser and desktop workflows. For browser workflows, the extension works natively in Chrome, Edge, and Firefox. For desktop automation on Windows, macOS, and Linux, you install the free XModules companion, which grants access to real OS-level mouse and keyboard input, the file system, and screen OCR outside the browser sandbox. This lets you script hybrid workflows, for example logging into a web app, downloading a file, then processing it in a desktop program.
Ui.Vision is explicitly designed to run locally. All scripts, screenshots, OCR processing, and execution happen in the browser or via the local XModules. There is no cloud backend for macro storage or execution, which is why the tool is popular in regulated industries like finance, healthcare, and government. The one exception is AI Computer Use: calls to Claude go directly from your machine to Anthropic's API using your own API key.
Now that you know how to use Ui.Vision RPA, it's time to put this knowledge into practice.
Tutorial updated March 2026