Guidance review 2026: token-level constrained LLM generation with grammars, regex, and JSON schema — MIT open source — features, pros, cons, use cases.
Guidance is an open-source library that gives developers token-level control over LLM output. Instead of asking a model nicely for JSON and praying, Guidance lets you interleave normal Python with select statements, regex patterns, context-free grammars, and JSON schema constraints. During generation the library masks the model's logits so that tokens which would violate the constraint cannot be emitted. On backends that expose logits, the result is faster, cheaper, and reliably valid structured output: a JSON-schema-constrained generation never produces invalid JSON, a regex constraint always matches, a grammar constraint always parses. Guidance also supports stateful programs that branch on what the model produced, multi-turn role messages, tool calls, image inputs (where supported), and partial caching of shared prefixes for big speedups. It works with local models via transformers, llama.cpp, and vLLM, as well as hosted OpenAI and Anthropic APIs (with reduced constraint enforcement on hosted models that don't expose logits). Originally launched inside Microsoft Research, the project now lives at github.com/guidance-ai/guidance under the MIT license and is actively maintained by an independent community. There is no managed service or pricing: it's a pure Python library you install and use with your own compute.
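To make the masking idea concrete, here is a minimal sketch that uses only the Python standard library. It is not Guidance's API or implementation: the vocabulary, the `fake_logits` stand-in for a model, and the helper names are all hypothetical, chosen only to show how disallowed tokens can be ruled out before a token is picked.

```python
import math

# Hypothetical toy vocabulary; a real tokenizer has tens of thousands of entries.
VOCAB = ["yes", "no", "maybe", "7", "42", " ", "{", "}"]


def fake_logits(prefix: str) -> list[float]:
    """Stand-in for a model forward pass (assumption): fixed scores favouring 'maybe'."""
    return [0.5, 0.2, 3.0, 0.1, 0.3, 1.0, 0.0, 0.0]


def is_viable(text: str, options: list[str]) -> bool:
    """True if `text` can still be extended into at least one allowed option."""
    return any(opt.startswith(text) for opt in options)


def unconstrained_pick() -> str:
    """Greedy pick with no constraint applied."""
    logits = fake_logits("")
    return VOCAB[max(range(len(VOCAB)), key=logits.__getitem__)]


def constrained_choice(options: list[str]) -> str:
    """Greedy decoding where tokens that would leave the allowed set are masked out."""
    out = ""
    while out not in options:
        logits = fake_logits(out)
        # Masking step: any token that breaks the constraint gets a score of -inf.
        masked = [
            score if is_viable(out + tok, options) else -math.inf
            for tok, score in zip(VOCAB, logits)
        ]
        best = max(range(len(VOCAB)), key=masked.__getitem__)
        if masked[best] == -math.inf:
            raise ValueError("no token in the vocabulary can continue toward an option")
        out += VOCAB[best]
    return out


print(unconstrained_pick())               # the toy model's favourite token
print(constrained_choice(["yes", "no"]))  # restricted to the allowed answers
```

Left alone, the toy model picks "maybe"; with the mask in place it can only land on one of the allowed answers. A production engine does the same thing against a regex or grammar state rather than a list of strings.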
Guidance, which originated at Microsoft Research, is a powerful Python library for developers who need guaranteed structured output from LLMs. It excels at enforcing JSON schemas, regex patterns, and grammars at the token level, particularly with local model backends. The learning curve is steeper than simpler alternatives, but the guarantees it provides make it ideal for production systems where output validity is non-negotiable.
Define templates that mix fixed text with generation slots. The model generates only in specified locations, with template text serving as guaranteed context that cannot be modified. Variables capture generated values for downstream use in multi-step workflows.
Implement any context-free grammar for output control. Use regex patterns, selection from predefined lists, or complex nested structures. Works with both local models (logit masking) and API models (optimized prompting).
Automatically corrects tokenization artifacts that occur when template text ends mid-token. Without this correction, the model is forced to continue from an unnatural token boundary and can produce garbled output, a common issue with standard LLM approaches.
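The mid-token problem is easier to see in a toy example. The sketch below assumes a tiny hypothetical vocabulary, a greedy longest-match tokenizer, and made-up preference scores; it illustrates the general token-healing idea (back up one token, then only allow tokens that start with the removed text) and is not Guidance's actual implementation.

```python
# Hypothetical vocabulary in which "://" is a single token.
VOCAB = ["The", " link", " is", " http", ":", "://", "//", "example", ".com", " "]

# Made-up scores standing in for what a model prefers after " http" (assumption).
NEXT_TOKEN_SCORE = {"://": 2.0, ":": 0.5}


def tokenize(text: str) -> list[str]:
    """Greedy longest-match tokenizer over VOCAB (raises ValueError on unknown text)."""
    tokens = []
    i = 0
    while i < len(text):
        match = max((t for t in VOCAB if text.startswith(t, i)), key=len)
        tokens.append(match)
        i += len(match)
    return tokens


def heal(prompt: str) -> tuple[list[str], list[str]]:
    """Drop the final prompt token and return the tokens allowed to replace it."""
    *kept, last = tokenize(prompt)
    # Only tokens that begin with the removed text keep the prompt intact.
    allowed = [t for t in VOCAB if t.startswith(last)]
    return kept, allowed


def complete_with_healing(prompt: str) -> str:
    """Regenerate the last token under the prefix constraint, using the toy scores."""
    kept, allowed = heal(prompt)
    best = max(allowed, key=lambda t: NEXT_TOKEN_SCORE.get(t, 0.0))
    return "".join(kept) + best


print(tokenize("The link is http:"))               # prompt ends on the ":" token
print(complete_with_healing("The link is http:"))  # healed to use the "://" token
```

The prompt ends on ":", a boundary the model rarely saw in training because "://" is normally one token. Backing up and constraining the next token to start with ":" lets the model choose "://" while still reproducing the prompt text exactly.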
Generate structured JSON with guaranteed schema compliance using Pydantic models. Supports complex schemas with oneOf, allOf, required properties, min/max constraints, and format validation for production data extraction.
Constraint processing runs on llguidance, a Rust library. It delivers significant speed improvements over the earlier Python implementation and fixes subtle bugs from it.
Implement conditional generation with if/else blocks and iterative generation with loops. The model can generate variable-length lists, make programmatic branching decisions, and handle complex multi-step logic.
Unified interface for local models (Transformers, llama.cpp with true constrained generation) and API models (OpenAI GPT-4, Anthropic Claude, Azure OpenAI with optimized prompting strategies).
Rich notebook visualization showing token probabilities, backtracking support, generation metrics, real-time model behavior, and dark mode support for interactive development workflows.
Grammar constraints often make some tokens predictable in advance. Guidance automatically inserts these tokens without model forward passes, reducing GPU usage and generation latency significantly.
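A rough sketch of why this saves work, again using only the standard library: the output is modelled as a sequence of steps, each with a list of allowed strings. This representation, the `run` helper, and the `choose` callback are hypothetical simplifications (a real engine works token by token against a grammar), but the accounting is the same: a step with exactly one legal continuation needs no model call.

```python
import json
from typing import Callable

# Each step lists the strings the grammar allows at that point (toy representation).
STEPS: list[list[str]] = [
    ['{"sentiment": "'],       # forced literal
    ["positive", "negative"],  # real choice: the model must decide
    ['", "score": '],          # forced literal
    ["1", "2", "3"],           # real choice
    ["}"],                     # forced literal
]


def run(steps: list[list[str]], choose: Callable[[list[str]], str]) -> tuple[str, int]:
    """Build the output, calling `choose` (the stand-in model) only when needed."""
    parts = []
    model_calls = 0
    for options in steps:
        if len(options) == 1:
            # Only one legal continuation: insert it without a forward pass.
            parts.append(options[0])
        else:
            model_calls += 1
            parts.append(choose(options))
    return "".join(parts), model_calls


def pick_first(options: list[str]) -> str:
    """Stand-in for a model decision (assumption): always take the first option."""
    return options[0]


text, calls = run(STEPS, pick_first)
print(text)   # a complete JSON object
print(calls)  # number of steps that actually needed the model
```

In this toy run, three of the five steps are inserted for free and only two require the stand-in model, which is the effect the feature description refers to.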
Create reusable @guidance-decorated functions that can be composed into complex grammars. Build libraries of generation patterns for specific domains like HTML generation, form filling, or structured interviews.
Free (MIT)
We believe in transparent reviews. Here's what Guidance doesn't handle well:
The Rust-based llguidance grammar engine replaced the Python implementation, bringing faster constraint processing and bug fixes. JSON schema coverage expanded to include oneOf, allOf, and format validation. The Jupyter notebook visualization was rewritten with token probabilities, backtracking support, and improved dark mode. Python 3.14 compatibility and Phi-4 model support were added, along with performance optimizations throughout the framework.