FDM-1 Review 2026

Honest pros, cons, and verdict on this automation tool

✅ First computer-use foundation model trained on internet-scale video (11M hours), versus the largest open computer-use dataset of under 20 hours of 30 FPS video

What is FDM-1?

FDM-1 is a foundation model for computer use, trained on a portion of an 11-million-hour video dataset, that can perform complex computer actions such as CAD modeling, website navigation, and real-world tasks at 30 FPS.

FDM-1 is a foundation model for computer-use automation that performs complex multi-step tasks such as CAD modeling, website exploration, and real-world driving at 30 FPS. Pricing is available only through enterprise engagements with Standard Intelligence. It is built for research labs, engineering teams, and enterprises that need long-horizon agentic computer-use capabilities beyond what traditional VLM-based agents offer.

Released on February 23, 2026 by Standard Intelligence (Standard Intelligence PBC), FDM-1 departs from the prior recipe of fine-tuning vision-language models (VLMs) on contractor-annotated screenshots. Instead, FDM-1 was trained on a portion of an 11-million-hour screen-recording video dataset, labeled using a custom inverse dynamics model. The architecture combines three components: a video encoder that compresses nearly 2 hours of 30 FPS video into 1 million tokens, an inverse dynamics model for action labeling, and a forward dynamics model that predicts future video frames conditioned on actions. This long-context training lets FDM-1 act on minutes of context rather than the few seconds typical of conventional computer-use agents, and according to Standard Intelligence its performance improves consistently with scale.
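To put the encoder figures in perspective, the quoted numbers can be turned into a rough per-frame token budget. The sketch below is back-of-the-envelope arithmetic only: it rounds "nearly 2 hours" up to exactly 2 hours (an assumption), and the function names are illustrative, not part of any FDM-1 tooling.

```python
# Rough token budget implied by the figures quoted in this review.
# Assumption: "nearly 2 hours" is treated as exactly 2.0 hours, so the
# results are approximate lower bounds on tokens per frame.
FPS = 30
CONTEXT_TOKENS = 1_000_000
VIDEO_HOURS = 2.0


def frame_count(hours: float, fps: int) -> int:
    """Number of video frames in `hours` of footage at `fps`."""
    return int(hours * 3600 * fps)


def tokens_per_frame(context_tokens: int, hours: float, fps: int) -> float:
    """Average tokens available per frame if the whole context holds the video."""
    return context_tokens / frame_count(hours, fps)


def tokens_per_second(context_tokens: int, hours: float) -> float:
    """Average tokens consumed per second of video."""
    return context_tokens / (hours * 3600)


if __name__ == "__main__":
    print("frames in context:", frame_count(VIDEO_HOURS, FPS))
    print("tokens per frame: ", round(tokens_per_frame(CONTEXT_TOKENS, VIDEO_HOURS, FPS), 2))
    print("tokens per second:", round(tokens_per_second(CONTEXT_TOKENS, VIDEO_HOURS), 1))
```

Under these assumptions the budget works out to fewer than five tokens per frame on average, which is why the encoder is described as highly compressive.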

Key Features

✓11-million-hour video training dataset

✓30 FPS native video inference

✓Video encoder compressing ~2 hours into 1M tokens

✓Inverse dynamics model for unsupervised action labeling

✓Forward dynamics model for action-conditioned video prediction

✓OS checkpoint / forking VM for test-time compute

Pricing Breakdown

Enterprise

Custom (contact sales)


✓Full access to FDM-1 foundation model for computer use
✓30 FPS native video inference for long-horizon tasks
✓CAD modeling, website automation, and multi-step workflow capabilities
✓OS checkpoint and forking VM infrastructure for test-time compute
✓Custom deployment and integration support

Pros & Cons

✅Pros

•First computer-use foundation model trained on internet-scale video (11M hours), versus the largest open computer-use dataset of under 20 hours of 30 FPS video
•Native 30 FPS video processing enables continuous control like smooth mouse movement and CAD operations rather than discrete screenshot-by-screenshot reasoning
•Highly efficient video encoder compresses nearly 2 hours of footage into just 1M tokens, unlocking minute-scale context windows
•Unsupervised training via the inverse dynamics model removes the bottleneck of expensive contractor-labeled screenshots
•Test-time compute via OS checkpoints / forking VMs lets the model retry from validated intermediate states on long-horizon tasks
•Demonstrably general — the same model performs CAD modeling, website fuzzing, and real-world driving without task-specific RL environments
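The checkpoint-and-fork idea mentioned in the pros can be sketched as a simple retry loop: fork the last validated state, attempt the next step on the fork, and promote the fork to the new checkpoint only if the step succeeds. This is a minimal illustration under stated assumptions. `ToyVM`, `attempt_step`, and the random failure model are hypothetical stand-ins, and nothing here reflects how Standard Intelligence actually implements VM forking.

```python
import copy
import random
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ToyVM:
    """Hypothetical stand-in for a VM; its state is the list of applied steps."""
    completed: List[str] = field(default_factory=list)

    def fork(self) -> "ToyVM":
        """Return an independent copy, leaving this checkpoint untouched."""
        return copy.deepcopy(self)


def attempt_step(vm: ToyVM, step: str, rng: random.Random, success_rate: float) -> bool:
    """Apply `step` to `vm`; a failed attempt leaves the VM in a corrupted state."""
    if rng.random() < success_rate:
        vm.completed.append(step)
        return True
    vm.completed.append(f"{step}:corrupted")
    return False


def run_with_checkpoints(
    steps: List[str],
    rng: random.Random,
    success_rate: float = 0.6,
    max_retries: int = 20,
) -> Tuple[ToyVM, int]:
    """Run all steps, retrying each one from the last validated checkpoint."""
    checkpoint = ToyVM()
    attempts = 0
    for step in steps:
        for _ in range(max_retries):
            attempts += 1
            candidate = checkpoint.fork()
            if attempt_step(candidate, step, rng, success_rate):
                checkpoint = candidate  # validated: promote to new checkpoint
                break
            # failed: discard the corrupted fork and retry from the checkpoint
        else:
            raise RuntimeError(f"step {step!r} failed after {max_retries} retries")
    return checkpoint, attempts


if __name__ == "__main__":
    vm, attempts = run_with_checkpoints(["select", "extrude", "transform"], random.Random(0))
    print(vm.completed, "in", attempts, "attempts")
```

Because failed forks are discarded, a mistake late in a long task costs only a retry of that step rather than a restart of the whole task, which is the benefit the pros list attributes to test-time compute.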

❌Cons

•No public API, pricing page, or self-serve access — gated to enterprise and research partners
•Capabilities are demonstrated through curated video clips rather than peer-reviewed benchmarks against established computer-use leaderboards
•Released February 23, 2026, so production track record, reliability, and safety guardrails are unproven at scale
•Inference at 30 FPS on minute-long video contexts implies significant GPU cost not disclosed publicly
•No documentation of supported operating systems, integrations, or developer tooling beyond the research blog post

Who Should Use FDM-1?

✓Long-horizon CAD modeling workflows in tools like Blender where an agent must perform tens of continuous mouse movements and operations such as extrude, select, and transform without losing context
✓Automated website exploration and fuzzing for QA and security research, where the agent must navigate complex multi-step flows beyond the few-second context of screenshot-based agents
✓Enterprise research partnerships exploring computer-use coworkers for finance, engineering, or ML research where minute-scale context and 30 FPS control are required
✓Embodied or physical-world tasks demonstrated by the team, such as driving a car, where continuous-time video understanding outperforms discrete screenshot reasoning
✓Internal R&D teams building on top of a foundation model rather than orchestrating a VLM with screenshot tools and per-task RL environments
✓Dataset and infrastructure teams that want to leverage internet-scale unlabeled video (livestreams, gameplay, tutorials) instead of paying for contractor-annotated computer-use data

Who Should Skip FDM-1?

×You need a public API, a pricing page, or self-serve access; FDM-1 is gated to enterprise and research partners
×You need peer-reviewed benchmarks against established computer-use leaderboards; capabilities are so far demonstrated only through curated video clips
×You need a proven production track record; FDM-1 was released on February 23, 2026, so its reliability and safety guardrails are unproven at scale

Our Verdict

✅

FDM-1 is a solid choice

FDM-1 delivers on its promises as an automation tool. It has limitations, notably gated access and no independent benchmarks, but the benefits outweigh the drawbacks for most users in its target market of research labs, engineering teams, and enterprises.


Frequently Asked Questions

What is FDM-1?

FDM-1 is a foundation model for computer use, trained on a portion of an 11-million-hour video dataset, that can perform complex computer actions such as CAD modeling, website navigation, and real-world tasks at 30 FPS.

Is FDM-1 good?

Yes, FDM-1 is good for automation work. Its main strength is that it is the first computer-use foundation model trained on internet-scale video (11M hours), versus under 20 hours of 30 FPS video in the largest open computer-use dataset. However, keep in mind that there is no public API, pricing page, or self-serve access; the model is gated to enterprise and research partners.

How much does FDM-1 cost?

FDM-1 has no public pricing. The only listed plan is Enterprise, with custom pricing available through an enterprise engagement with Standard Intelligence (contact sales).

Who should use FDM-1?

FDM-1 is best for two kinds of work. The first is long-horizon CAD modeling in tools like Blender, where an agent must perform tens of continuous mouse movements and operations such as extrude, select, and transform without losing context. The second is automated website exploration and fuzzing for QA and security research, where the agent must navigate complex multi-step flows beyond the few-second context of screenshot-based agents. It is particularly useful for automation professionals who need a model trained on internet-scale video rather than on contractor-annotated screenshots.

What are the best FDM-1 alternatives?

There are several automation tools available. Compare features, pricing, and user reviews to find the best option for your needs.

Last verified March 2026
