AI computer agent for macOS that controls your browser, writes code, handles documents, and operates Google Apps through voice commands with direct DOM control.
Fazm is a free, open-source AI computer agent in the desktop automation category, designed exclusively for macOS, that enables users to control their entire desktop environment through voice commands. Unlike conversational AI assistants that simply provide text-based answers, Fazm directly manipulates the user's computer by moving the mouse, typing on the keyboard, clicking buttons, and navigating between applications. Its core differentiator is direct browser DOM control, which means it reads and interacts with the actual structure of web pages rather than relying on screenshot-based visual recognition, resulting in faster and more reliable web automation.
The tool operates as an always-on-top floating toolbar, remaining accessible while users work across any application. Fazm can handle a wide range of tasks including drafting and sending emails, filling out forms, managing spreadsheets and documents, writing and editing code, navigating websites, and operating Google Apps like Gmail, Google Sheets, and Google Calendar. For native macOS applications, it leverages accessibility APIs to interact with UI elements such as buttons, menus, and text fields across apps like Chrome, Safari, VS Code, Slack, Figma, and Terminal.
A notable feature is Fazm's memory layer, which builds a personal knowledge graph over time by extracting information from files, browsing history, conversations, and daily activity. This allows the agent to learn user contacts, preferences, formatting habits, and frequently used workflows, progressively reducing the amount of instruction needed for routine tasks. Importantly, all knowledge graph data is stored locally on the user's Mac and is never transmitted to cloud services.
Fazm includes safety mechanisms such as real-time visibility of all actions on screen, a keyboard shortcut to halt any operation instantly, and confirmation prompts before executing destructive actions like deleting files or sending messages. The project is fully open source with its code available on GitHub for auditing. The tool processes screen content locally to preserve privacy. Users can create reusable workflow automations to streamline repetitive multi-step processes. Fazm is offered as a free download, with its initial public release dating to December 2025. The GitHub repository has accumulated over 4,800 stars since launch, indicating strong early community interest. The project reports compatibility with macOS 13 Ventura and later, covering approximately 80% of the active Mac installed base according to Apple's platform adoption statistics. The DOM control approach reportedly achieves action execution in under 500 milliseconds per step for common browser interactions, compared to the 1 to 3 second latency typical of screenshot-based agents. Fazm currently integrates with over 15 native macOS applications through accessibility APIs and supports Chrome and Safari for browser-based automation.
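The confirmation prompts described above can be pictured as a gate in front of every action. The following is a minimal sketch in Python, not Fazm's actual code: the action names, the `DESTRUCTIVE` set, and the `run_action` helper are all hypothetical and only illustrate the idea of asking before deleting or sending.

```python
# Hypothetical set of action names treated as destructive (an assumption).
DESTRUCTIVE = {"delete_file", "send_message"}

def run_action(action, target, confirm, perform):
    """Run an action, asking for confirmation first if it is destructive.

    confirm: callable taking a prompt string and returning True or False.
    perform: callable that actually carries out the action.
    """
    if action in DESTRUCTIVE and not confirm(f"{action} {target}?"):
        return "cancelled"
    perform(action, target)
    return "done"

performed = []
record = lambda action, target: performed.append((action, target))

# A declined destructive action is never performed.
declined = run_action("delete_file", "notes.txt", lambda prompt: False, record)
# A non-destructive action runs without any prompt being consulted.
opened = run_action("open_file", "notes.txt", lambda prompt: False, record)
```

In this sketch the confirmation callback stands in for whatever prompt the tool shows on screen; the same gate could also be tied to a halt shortcut.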
Rather than capturing screenshots and using computer vision to identify clickable elements, Fazm reads the actual Document Object Model of web pages. This allows it to locate form fields, buttons, links, and content by their structural properties, resulting in faster execution and higher reliability. This approach avoids common failure modes of vision-based agents such as misidentifying elements due to overlapping UI components, dynamic content loading, or non-standard page layouts.
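To make the contrast with screenshot-based lookup concrete, here is a small sketch of locating elements by their structural properties. It uses only Python's standard `html.parser` on a static HTML string; the `FieldFinder` class, the `find_by_attr` helper, and the sample page are illustrative assumptions and say nothing about how Fazm itself talks to the browser.

```python
from html.parser import HTMLParser

class FieldFinder(HTMLParser):
    """Collects interactive elements together with their attributes."""
    INTERACTIVE = {"input", "button", "a", "select", "textarea"}

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        if tag in self.INTERACTIVE:
            # Store the tag name alongside the element's attributes.
            self.elements.append({**dict(attrs), "tag": tag})

def find_by_attr(html, **wanted):
    """Return elements whose tag/attributes match every given key and value."""
    finder = FieldFinder()
    finder.feed(html)
    return [e for e in finder.elements
            if all(e.get(k) == v for k, v in wanted.items())]

# Hypothetical page used only for this example.
PAGE = """
<form id="signup">
  <input type="email" name="email" placeholder="you@example.com">
  <input type="password" name="password">
  <button type="submit" id="submit-btn">Sign up</button>
</form>
"""

email_field = find_by_attr(PAGE, name="email")
submit_button = find_by_attr(PAGE, tag="button", type="submit")
```

Because the lookup keys on names, types, and ids rather than pixels, it does not depend on where the element is drawn or what overlaps it visually.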
Fazm builds an evolving knowledge graph by extracting structured information from user files, browsing activity, conversations, and daily workflows. Over time it learns contacts, preferences, tone, scheduling habits, and frequently repeated tasks. This enables progressively more autonomous operation where the agent can anticipate needs and pre-fill information. All data remains on the local machine and is never uploaded to external servers, addressing privacy concerns common with cloud-based AI assistants.
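One simple way to picture a locally stored knowledge graph is a table of subject, predicate, object facts in an on-disk database. The sketch below uses Python's built-in `sqlite3` module; the schema, the `remember` and `recall` helpers, and the sample facts are assumptions made for illustration, since the listing does not describe Fazm's actual storage format.

```python
import sqlite3

def open_graph(path=":memory:"):
    """Open (or create) a local fact store; pass a file path to persist it."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS facts ("
        "subject TEXT, predicate TEXT, object TEXT, "
        "UNIQUE(subject, predicate, object))"
    )
    return db

def remember(db, subject, predicate, obj):
    """Store one fact; an exact duplicate is silently ignored."""
    db.execute("INSERT OR IGNORE INTO facts VALUES (?, ?, ?)",
               (subject, predicate, obj))
    db.commit()

def recall(db, subject, predicate):
    """Return all known objects for a subject and predicate, oldest first."""
    rows = db.execute(
        "SELECT object FROM facts WHERE subject = ? AND predicate = ? "
        "ORDER BY rowid",
        (subject, predicate),
    )
    return [row[0] for row in rows]

graph = open_graph()
remember(graph, "Alice", "email", "alice@example.com")
remember(graph, "Alice", "email", "alice@example.com")  # duplicate, ignored
remember(graph, "user", "prefers_meeting_time", "mornings")
```

A lookup such as `recall(graph, "Alice", "email")` is the kind of query that would let an agent pre-fill a recipient without being told the address again, and nothing in it requires a network connection.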
The entire user experience is built around natural language voice commands rather than typed instructions or point-and-click configuration. Users speak their intent in conversational language and Fazm translates this into a sequence of computer actions. The always-on-top floating toolbar serves as the persistent voice interface, staying accessible across all applications without requiring window switching or a separate app to be in focus.
Users can define multi-step workflows that Fazm can replay on demand. Once a complex sequence of actions is performed (such as extracting data from a PDF, entering it into a spreadsheet, and emailing the result), it can be saved and triggered with a single voice command in the future. This bridges the gap between one-off voice commands and fully programmatic automation scripts, making it accessible to non-technical users.
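The save-and-replay idea can be sketched as a named list of recorded steps that is later fed back through an executor. Everything in this Python sketch is hypothetical: the `Step` and `WorkflowLibrary` types, the action names, and the workflow name are stand-ins, not Fazm's data model.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Step:
    action: str   # what to do, e.g. "send_email" (illustrative name)
    target: str   # what to do it to

@dataclass
class WorkflowLibrary:
    workflows: dict[str, list[Step]] = field(default_factory=dict)

    def save(self, name: str, steps: list[Step]) -> None:
        """Store a copy of the recorded steps under a spoken-style name."""
        self.workflows[name] = list(steps)

    def replay(self, name: str, execute: Callable[[Step], None]) -> int:
        """Run each saved step in order and return how many were run."""
        steps = self.workflows[name]
        for step in steps:
            execute(step)
        return len(steps)

library = WorkflowLibrary()
library.save("invoice report", [
    Step("extract_pdf_data", "invoice.pdf"),
    Step("enter_into_spreadsheet", "Invoices"),
    Step("send_email", "result"),
])

# Stand-in executor that records what would have been done.
log = []
count = library.replay("invoice report",
                       lambda s: log.append(f"{s.action}:{s.target}"))
```

Keeping the executor separate from the stored steps means the same recording can be replayed against a real automation backend or, as here, against a logger for a dry run.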
Price: $0 (free download)