Honest pros, cons, and verdict on this automation tool
First computer-use foundation model trained on internet-scale video (11M hours), versus the largest open computer-use dataset of under 20 hours of 30 FPS video
Starting Price: See Pricing
Free Tier: No
Category: Automation
Skill Level: Any
Foundation model for computer use, trained on an 11-million-hour video dataset, that can perform complex computer actions like CAD modeling, website navigation, and real-world tasks at 30 FPS.
FDM-1 is a foundation model for computer use, listed here under Automation, that performs complex multi-step tasks like CAD modeling, website exploration, and real-world driving at 30 FPS. Pricing is available only through enterprise engagements with Standard Intelligence. It is built for research labs, engineering teams, and enterprises that need long-horizon agentic computer-use capabilities beyond what traditional VLM-based agents offer.
Released on February 23, 2026 by Standard Intelligence (Standard Intelligence PBC), FDM-1 departs from the prior recipe of fine-tuning vision-language models on contractor-annotated screenshots. Instead, FDM-1 was trained on a portion of an 11-million-hour screen recording video dataset, labeled using a custom inverse dynamics model. The architecture combines a video encoder that compresses nearly 2 hours of 30 FPS video into just 1 million tokens, an inverse dynamics model for action labeling, and a forward dynamics model that predicts future video frames conditioned on actions. This long-context training enables FDM-1 to act on minutes of context rather than the few seconds typical of conventional computer-use agents, and its performance is reported to improve consistently with scale.
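As a rough sanity check on the encoder figures quoted above, the short sketch below works out how many tokens per frame that compression implies. It assumes exactly 2 hours of video, whereas the source says "nearly 2 hours", so the real per-frame budget would be slightly higher. The function and variable names are illustrative only and do not come from Standard Intelligence.

```python
# Back-of-the-envelope token budget implied by the figures quoted in this review.
# Assumption: exactly 2 hours of video (the review says "nearly 2 hours").
FPS = 30
HOURS = 2
TOKENS = 1_000_000


def tokens_per_frame(hours: float, fps: int, tokens: int) -> float:
    """Average number of tokens available per video frame."""
    frames = hours * 3600 * fps
    return tokens / frames


frames = HOURS * 3600 * FPS
ratio = tokens_per_frame(HOURS, FPS, TOKENS)
print(f"{frames:,} frames, about {ratio:.1f} tokens per frame")
```

Under these assumptions the encoder averages only a handful of tokens per frame, which is consistent with the review's claim that the model can hold minutes of context rather than a few seconds.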
FDM-1 delivers on its promises as an automation tool. While it has some limitations, the benefits outweigh the drawbacks for most users in its target market.
Yes, FDM-1 is good for automation work. Its main strength is that it is the first computer-use foundation model trained on internet-scale video (11M hours), versus the largest open computer-use dataset of under 20 hours of 30 FPS video. However, keep in mind that there is no public API, pricing page, or self-serve access; the model is gated to enterprise and research partners.
FDM-1 has no published pricing. Pricing is available only through enterprise engagements with Standard Intelligence, so visit their website for current details.
FDM-1 is best suited to two kinds of work. The first is long-horizon CAD modeling in tools like Blender, where an agent must perform tens of continuous mouse movements and operations such as extrude, select, and transform without losing context. The second is automated website exploration and fuzzing for QA and security research, where the agent must navigate complex multi-step flows beyond the few-second context of screenshot-based agents. It is particularly useful for automation professionals who need a model trained on video data at the 11-million-hour scale.
There are several automation tools available. Compare features, pricing, and user reviews to find the best option for your needs.
Last verified March 2026