Stay free if you only need basic features. Upgrade if you need advanced features. Most solo builders can start free.
Regular prompting sends text and hopes the model responds in the right format. Guidance interleaves fixed template text with constrained generation steps, enforcing structure at the token level so invalid outputs are impossible rather than merely unlikely.
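The idea can be sketched in plain Python (this is not Guidance's actual API, just a toy model of token-level masking): at each step the sampler may only choose tokens that keep the partial output a prefix of some valid completion, so an invalid result can never be produced.

```python
# Toy sketch of token-level constrained decoding. VOCAB and CHOICES are
# illustrative stand-ins for a model vocabulary and a grammar.
VOCAB = ["yes", "no", "may", "be", "maybe", "!", " "]   # toy vocabulary
CHOICES = {"yes", "no", "maybe"}                        # allowed final outputs

def allowed_tokens(prefix: str) -> list[str]:
    """Tokens the sampler may pick: those keeping the prefix extendable to a choice."""
    return [t for t in VOCAB
            if any(c.startswith(prefix + t) for c in CHOICES)]

def constrained_generate(pick) -> str:
    """pick(options) stands in for the model's sampler, restricted to options."""
    out = ""
    while out not in CHOICES:
        out += pick(allowed_tokens(out))
    return out

# Even a sampler that always grabs the first option can only emit a valid choice:
result = constrained_generate(lambda opts: opts[0])
assert result in CHOICES
```

Because masking happens before sampling, no post-hoc validation or retry is needed; the output is valid by construction.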
The Rust-based grammar engine (llguidance) was further optimized for performance, with improved JSON schema support including oneOf/allOf/anyOf, better error messages, and expanded model backend compatibility.
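To make the oneOf semantics concrete, here is a hypothetical schema and a small pure-Python checker. The checker only illustrates what "exactly one branch" means; llguidance enforces it during decoding rather than after the fact.

```python
# Hypothetical JSON schema using oneOf: output must match exactly one branch.
schema = {
    "oneOf": [
        {"type": "object", "required": ["error"]},
        {"type": "object", "required": ["result"]},
    ]
}

def matches_branch(value: dict, branch: dict) -> bool:
    # Simplified check: only the "required" keyword is considered here.
    return all(key in value for key in branch.get("required", []))

def satisfies_one_of(value: dict) -> bool:
    return sum(matches_branch(value, b) for b in schema["oneOf"]) == 1

assert satisfies_one_of({"result": 42})                    # exactly one branch
assert not satisfies_one_of({"result": 1, "error": "x"})   # both branches fail oneOf
```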
Yes. Guidance supports OpenAI GPT models and Azure OpenAI deployments. Constrained generation works through provider-supported structured output features, though the strongest token-level guarantees are available with local models.
Instructor validates structured output after generation and retries on failure. Guidance enforces constraints during generation at the token level, preventing invalid output from being produced in the first place, which eliminates retry overhead.
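The difference in cost can be sketched with a deterministic stand-in for a model (all names here are illustrative, not either library's API): the validate-and-retry path burns a generation on invalid output, while the constrained path is valid on the first try.

```python
import json
from itertools import cycle

# Deterministic stand-in for an unconstrained model: the first reply wraps the
# JSON in chatty prose, the second is clean JSON.
_replies = cycle(['Sure! {"ok": true}', '{"ok": true}'])

def flaky_model() -> str:
    return next(_replies)

def validate_and_retry(max_retries: int = 5) -> tuple[dict, int]:
    """Instructor-style: generate, validate after the fact, retry on failure."""
    for attempt in range(1, max_retries + 1):
        try:
            return json.loads(flaky_model()), attempt
        except json.JSONDecodeError:
            continue
    raise RuntimeError("no valid output")

def constrained_model() -> dict:
    """Guidance-style: invalid tokens are never sampled, so every output parses.
    Modeled here by construction rather than by actual logit masking."""
    return {"ok": True}

parsed, attempts = validate_and_retry()
assert attempts == 2                         # one wasted generation before success
assert constrained_model() == {"ok": True}   # valid on the first try, always
```

At scale, those wasted generations translate directly into extra latency and token spend, which is the overhead constrained decoding removes.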
Yes, for applications requiring guaranteed structured output. It is used in production systems where output validity is critical, particularly with local model deployments.
Yes, and local models get the strongest guarantees because Guidance can directly control token sampling. Support comes via the Transformers and llama.cpp backends.
Start with the free plan and upgrade when you need more.
Last verified March 2026