No free plan. The only listed tier is Enterprise with custom pricing (contact sales). Consider free alternatives in the automation category if budget is tight.
FDM-1 is a foundation model for general computer use built by Standard Intelligence (Standard Intelligence PBC), announced February 23, 2026. Unlike prior computer-use agents that fine-tune a vision-language model on screenshots, FDM-1 trains and infers directly on video at 30 FPS. It was trained on a portion of an 11-million-hour screen recording dataset labeled by a custom inverse dynamics model. The team positions it as the first fully general computer action model.
Traditional computer-use agents fine-tune a VLM on contractor-annotated screenshots, which limits them to a few seconds of context, low framerates, and short-horizon tasks. FDM-1 instead trains directly on 30 FPS video and uses a video encoder that compresses ~2 hours into 1M tokens, giving it minute-scale context. It also avoids per-task reinforcement learning environments, learning unsupervised from the open internet's video corpus. Based on our analysis of 870+ AI tools, this is the only Automation entry that trains a custom video foundation model end-to-end for computer use.
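To put the encoder's compression figure in perspective, the short calculation below works out the implied token budget per frame. It uses only the numbers quoted above (about 2 hours, 30 FPS, 1M tokens); the variable names are ours, and the result is a back-of-the-envelope average, not a published specification.

```python
# Back-of-the-envelope arithmetic from the figures quoted in the listing.
# These are averages implied by "~2 hours into 1M tokens", not official specs.
FPS = 30                  # frames per second
HOURS = 2                 # approximate video length per context window
TOKEN_BUDGET = 1_000_000  # tokens per context window

seconds = HOURS * 3600                       # 7,200 seconds
frames = FPS * seconds                       # 216,000 frames
tokens_per_frame = TOKEN_BUDGET / frames     # about 4.6 tokens per frame
tokens_per_second = TOKEN_BUDGET / seconds   # about 139 tokens per second

print(f"{frames:,} frames, {tokens_per_frame:.2f} tokens/frame, "
      f"{tokens_per_second:.1f} tokens/second")
```

At roughly 4 to 5 tokens per frame on average, the encoder has to be far more aggressive than typical per-image tokenization, which is why long stretches of video can fit in a single context.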
Standard Intelligence demonstrated FDM-1 performing multi-action CAD sequences in Blender (including extruding faces on an n-gon to make a gear), exploring and fuzzing complex websites, and driving a car in the real world, all at 30 FPS. The CAD demo uses OS checkpoints created at successful operations (extrude, select, etc.) to enable test-time compute via a forking VM. The blog post emphasizes that capabilities consistently improve with scale, and the team frames the current model as the first step toward CAD, finance, engineering, and ML-research coworker agents.
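The checkpoint-and-fork idea can be illustrated with a toy search loop. This is a minimal sketch under our own assumptions, not Standard Intelligence's implementation: `copy.deepcopy` stands in for forking a VM snapshot, and `run_with_checkpoints`, `propose`, and `is_success` are hypothetical names introduced here for illustration.

```python
import copy

def run_with_checkpoints(initial_state, steps, propose, is_success, attempts=8):
    """Retry each step from the last good snapshot until one attempt succeeds.

    Hypothetical sketch: deepcopy stands in for forking a VM from a checkpoint.
    """
    checkpoint = copy.deepcopy(initial_state)
    for step in steps:
        for _ in range(attempts):
            fork = copy.deepcopy(checkpoint)  # fork from the last successful state
            propose(fork, step)               # the agent acts inside the fork
            if is_success(fork, step):
                checkpoint = fork             # promote the fork to the new checkpoint
                break
        else:
            raise RuntimeError(f"no successful attempt for step {step!r}")
    return checkpoint

def make_flaky_proposer(succeed_every=3):
    """Toy agent: only every `succeed_every`-th call performs the requested op."""
    calls = {"n": 0}
    def propose(state, step):
        calls["n"] += 1
        state["ops"].append(step if calls["n"] % succeed_every == 0 else "miss")
    return propose, calls

def last_op_matches(state, step):
    """Toy success check: the most recent operation is the one requested."""
    return state["ops"][-1] == step
```

Failed attempts are discarded along with their forks, so mistakes never accumulate in the surviving state. Spending more attempts per step is one simple way to trade extra compute at test time for a higher chance of completing a long sequence.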
FDM-1 has no published pricing or self-serve access as of the February 23, 2026 announcement. Standard Intelligence describes it as a research milestone in a blog post at si.inc/posts/fdm1/, and access appears to be limited to enterprise or research partnerships. Compared to other Automation tools in our directory that publish $20 to $200/month tiers, FDM-1 sits firmly in the enterprise / contact-sales segment with no free or developer tier announced.
The training recipe has three core components, all described in the launch post. First, a video encoder that compresses approximately 2 hours of 30 FPS video into 1 million tokens, enabling long-context training. Second, an inverse dynamics model that labels raw screen recordings with the actions that produced them, removing the need for contractor annotation. Third, a forward dynamics model that predicts future frames conditioned on actions, which is the component used to drive the agent at inference time.
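The relationship between the inverse and forward dynamics roles can be shown with a deliberately tiny toy. In this sketch a "frame" is just a cursor position and an "action" is a cursor movement; the real models are learned networks operating on video, and every name below (`toy_inverse_dynamics`, `toy_forward_dynamics`, `label_recording`) is a hypothetical stand-in rather than anything from the launch post.

```python
from typing import List, Tuple

Frame = Tuple[int, int]   # toy "frame": cursor position (x, y)
Action = Tuple[int, int]  # toy "action": cursor movement (dx, dy)

def toy_inverse_dynamics(prev: Frame, nxt: Frame) -> Action:
    """Infer the action that explains the change between two consecutive frames."""
    return (nxt[0] - prev[0], nxt[1] - prev[1])

def toy_forward_dynamics(frame: Frame, action: Action) -> Frame:
    """Predict the next frame given the current frame and an action."""
    return (frame[0] + action[0], frame[1] + action[1])

def label_recording(frames: List[Frame]) -> List[Tuple[Frame, Action]]:
    """Turn an unlabeled recording into (frame, action) training pairs."""
    return [
        (frames[i], toy_inverse_dynamics(frames[i], frames[i + 1]))
        for i in range(len(frames) - 1)
    ]

recording = [(0, 0), (3, 0), (3, 4)]
labeled = label_recording(recording)

# Sanity check: replaying each inferred action reproduces the next frame.
for (frame, action), expected_next in zip(labeled, recording[1:]):
    assert toy_forward_dynamics(frame, action) == expected_next
```

The point of the toy is the data flow: unlabeled recordings go in, the inverse model supplies the missing action labels, and the resulting pairs are what a forward model can then be trained on without contractor annotation.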
Last verified March 2026