Open-source RPA software for web automation, desktop automation, and AI computer use across Windows, macOS and Linux. Includes OCR capabilities and browser automation features.
Ui.Vision RPA is an open-source automation platform that combines browser automation, desktop automation, OCR, and Anthropic Claude Computer Use integration into a single browser extension, with a free core product and paid PRO/Enterprise tiers for commercial features. It's built for QA engineers, IT admins, citizen developers, and enterprises who need cross-platform RPA without sending data to the cloud.
Developed by a9t9 software GmbH since 2016, Ui.Vision runs as a free extension for Chrome, Edge, and Firefox and currently serves 150,000+ users across Windows, macOS, and Linux. The tool uses Selenium-style commands combined with computer vision and OCR to automate both browser-based workflows and native desktop applications, letting users record and replay actions visually, scrape screen content, handle file uploads/downloads, and run data-driven tests using CSV imports. Its XModules add-ons unlock desktop automation, command-line execution, and real-keystroke simulation beyond the browser sandbox. In late 2024, the team shipped built-in Anthropic Claude Computer Use integration and expanded it through 2025, allowing AI-driven computer control directly from the Ui.Vision interface.
Compared to the other automation tools in our directory, Ui.Vision occupies a distinct niche: it is one of the few truly open-source (AGPL) RPA platforms with enterprise-grade security where all data stays local on the user's machine. Unlike cloud-heavy competitors such as UiPath or Automation Anywhere, Ui.Vision ships as a lightweight browser extension with no server dependency, making it attractive for regulated industries and privacy-sensitive workflows. Based on our analysis of 870+ AI tools, it is among the most affordable enterprise-viable RPA options, with a capable free tier and a one-time or subscription-based PRO license rather than the per-bot licensing model common in this category.
Ui.Vision records user actions as screenshots and selectors rather than just DOM paths, so macros keep working even when underlying HTML changes. This screenshot-driven approach is especially useful for complex websites with dynamic selectors and for legacy desktop apps that don't expose accessibility trees.
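To make the idea of locating a target by its appearance rather than by a DOM path concrete, here is a minimal sketch of template matching on a tiny grayscale grid. It is an illustration of the general technique only, not Ui.Vision's actual image-search algorithm; the `locate` function and the sample data are hypothetical.

```python
def locate(screen, template):
    """Return (row, col, score) of the best-matching window, or None.

    The score is the sum of absolute pixel differences, so 0 is an exact match.
    Illustrative only: real tools use faster, more tolerant matching.
    """
    sh, sw = len(screen), len(screen[0])
    th, tw = len(template), len(template[0])
    best = None
    # Slide the template over every position where it fits on the screen.
    for r in range(sh - th + 1):
        for c in range(sw - tw + 1):
            score = sum(
                abs(screen[r + i][c + j] - template[i][j])
                for i in range(th)
                for j in range(tw)
            )
            if best is None or score < best[2]:
                best = (r, c, score)
    return best


# Hypothetical 4x5 "screenshot" containing a 2x2 "button" image.
screen = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 9, 0],
    [0, 0, 0, 0, 0],
]
button = [[9, 8], [7, 9]]
match = locate(screen, button)
```

Because the lookup depends only on pixels, renaming an element's ID or restructuring the page's HTML does not affect the result, which is the property the paragraph above describes.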
The tool bundles OCR engines that let macros read text directly from the screen, images, or PDFs, enabling automation of content that has no clickable DOM or API. Ui.Vision also exposes free PDF OCR and searchable-PDF utilities, which are reused by thousands of teams for document digitization pipelines.
Ui.Vision natively integrates Anthropic's Claude Computer Use so that AI agents can drive the screen, mouse, and keyboard from inside a Ui.Vision macro. You can mix deterministic Selenium-style commands with AI-driven steps, which is ideal for workflows where the UI is unpredictable or the task is described in natural language.
The XModules companion extends the browser extension with real OS-level input, clipboard access, and file system control on Windows, macOS, and Linux. This turns Ui.Vision into a full hybrid RPA tool capable of orchestrating web apps and native desktop applications in the same script.
Ui.Vision imports and exports Selenium IDE (.side) files, so existing Selenium suites run with minimal changes. A command-line API lets you invoke macros from CI/CD pipelines, schedulers, or other scripts, enabling headless test runs and integration into enterprise build systems.
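As a rough sketch of how a CI job might invoke a macro, the snippet below builds a launch URL and a browser command line. It assumes the command-line API is driven by opening a local autorun HTML page with query parameters; the parameter names (`macro`, `direct`, `closeBrowser`, `closeRPA`, `savelog`) are recalled from the Ui.Vision command-line documentation and should be verified against the current docs. The file path, macro name, and browser binary are hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlencode


def build_launch_url(autorun_html, macro, log_name="run.log"):
    """Build a file:// URL that asks the extension to run one macro.

    autorun_html must be an absolute path to the exported autorun page.
    Parameter names are assumptions; check them against the official docs.
    """
    params = {
        "macro": macro,        # name of the macro to run
        "direct": 1,           # start without a confirmation prompt
        "closeBrowser": 1,     # close the browser when the macro ends
        "closeRPA": 1,         # close the extension window as well
        "savelog": log_name,   # log file the CI job can inspect afterwards
    }
    return PurePosixPath(autorun_html).as_uri() + "?" + urlencode(params)


def build_command(browser, url):
    """Return the argument list a CI step could pass to subprocess.run."""
    return [browser, url]


url = build_launch_url("/home/ci/ui.vision.html", "LoginSmokeTest")
command = build_command("google-chrome", url)
```

A pipeline step would then pass `command` to `subprocess.run` with a timeout and read the saved log to decide whether the build passes.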
$0
$49 one-time
$199/year per user
We believe in transparent reviews. Here's what Ui.Vision RPA doesn't handle well:
As of early 2026, Ui.Vision continues to ship updates to its Anthropic Claude Computer Use integration, which was first introduced in late 2024 and expanded through 2025. The AI Computer Use mode is now a stable, production-ready feature that lets users mix natural-language-driven AI steps with deterministic Selenium-style commands in the same macro. In 2026, the team has also updated XModules compatibility for the latest Chrome and Edge Manifest V3 extension architecture, maintained cross-platform support across Windows, macOS, and Linux, and refreshed its OCR engine bindings. The dedicated Computer Use demo and updated AI integration documentation remain actively maintained on the site.
AI Automation
Enterprise automation platform that drives AI transformation with agentic automation, combining UiPath agents, third-party agents, and API workflows.
automation
Enterprise-grade Robotic Process Automation (RPA) platform that uses AI agents to automate complex business processes across organizations. The #1 provider of Agentic Process Automation (APA) with industry-leading Process Reasoning Engine.
Automation
Microsoft's workflow automation platform that integrates AI Builder capabilities for intelligent automation including form processing, text analysis, and prediction models.