
Guidance


Guidance review 2026: token-level constrained LLM generation with grammars, regex, and JSON schema — MIT open source — features, pros, cons, use cases.

Starting at: Free


Overview

Guidance is an open-source Python library that gives developers token-level control over LLM output. Instead of asking a model for JSON and hoping it complies, you interleave ordinary Python with select statements, regex patterns, context-free grammars, and JSON schema constraints. During generation the library masks the logits of any token that would violate the active constraint, so the model cannot emit it. With a backend that exposes logits, the output is valid by construction: a JSON-schema-constrained generation yields schema-conformant JSON, a regex constraint always matches, and a grammar constraint always parses, provided generation is not cut off by a token limit before the structure is complete. Because there are no malformed outputs to retry, this is typically faster and cheaper than validate-and-retry approaches.

Guidance also supports stateful programs that branch on what the model produced, multi-turn role messages, tool calls, image inputs (where supported), and caching of shared prompt prefixes so they are not recomputed on every call. It works with local models via transformers, llama.cpp, and vLLM, as well as hosted OpenAI and Anthropic APIs (with reduced constraint enforcement on hosted models that don't expose logits).

Originally launched inside Microsoft Research, the project now lives at github.com/guidance-ai/guidance under the MIT license and is actively maintained by an independent community. There is no managed service or pricing: it is a Python library you install and run on your own compute.
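The core idea, ruling out constraint-violating tokens before one is chosen, can be illustrated without any model. The following is a toy sketch, not Guidance's implementation: the vocabulary, the scores, and the `allowed` set are all made up for illustration.

```python
import math

def constrained_argmax(logits: dict[str, float], allowed: set[str]) -> str:
    """Pick the highest-scoring token among those the constraint permits."""
    # Disallowed tokens get a score of -inf, so they can never win.
    masked = {
        tok: (score if tok in allowed else -math.inf)
        for tok, score in logits.items()
    }
    best = max(masked, key=masked.get)
    if masked[best] == -math.inf:
        raise ValueError("constraint allows no token in the vocabulary")
    return best

# Hypothetical next-token scores: the unconstrained model prefers chatty prose.
logits = {"Sure": 3.1, "Here": 2.7, "{": 1.2, "[": 0.4}

# A JSON-object constraint permits only an opening brace as the first token.
first_token = constrained_argmax(logits, allowed={"{"})
print(first_token)
```

Without the mask the top-scoring token would be "Sure"; with it, the only candidate left is the opening brace.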

Using with OpenClaw

Install Guidance as an OpenClaw tool dependency and use it within agent steps to guarantee structured output from LLM calls.

Use Case Example: Use OpenClaw as the coordination layer while Guidance ensures each LLM call produces valid structured data.

Vibe Coding Friendly?

Difficulty: advanced

Not Recommended

Guidance is a Python library requiring programming expertise and an understanding of grammars, schemas, and constrained generation concepts. It is not a managed platform and has no visual interface; developers must write Python code to use it.

Editorial Review

Guidance, which originated at Microsoft Research, is a powerful Python library for developers who need guaranteed structured output from LLMs. It excels at enforcing JSON schemas, regex patterns, and grammars at the token level, particularly with local model backends. The learning curve is steeper than that of simpler alternatives, but the guarantees it provides make it well suited to production systems where output validity is non-negotiable.

Key Features

Template-Based Generation Control

Define templates that mix fixed text with generation slots. The model generates only in specified locations, with template text serving as guaranteed context that cannot be modified. Variables capture generated values for downstream use in multi-step workflows.

Context-Free Grammar Constraints

Implement any context-free grammar for output control. Use regex patterns, selection from predefined lists, or complex nested structures. Local models enforce the grammar through logit masking; API models fall back to optimized prompting, which does not carry the same guarantee.
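Nested structures are where a context-free grammar goes beyond a regex, because the constraint has to track how many brackets are still open. As a minimal sketch (a hand-written toy, not how Guidance represents grammars), here is a function that reports which characters may come next for nested lists of single digits such as `[1,[2,3]]`. It assumes the prefix was itself built only from allowed characters.

```python
DIGITS = set("0123456789")

def allowed_next(prefix: str) -> set[str]:
    """Characters that keep `prefix` a valid start of a nested digit list."""
    depth = prefix.count("[") - prefix.count("]")  # brackets still open
    last = prefix[-1] if prefix else ""
    if last in ("", "[", ","):
        # A value must start here: a digit or a new nested list.
        return DIGITS | {"["}
    # A value just ended (digit or "]"): continue or close the enclosing
    # list, or stop if nothing is open.
    return {",", "]"} if depth > 0 else set()

print(sorted(allowed_next("[1,[2")))
print(allowed_next("[1,[2]]"))
```

An empty result means the output is complete. A constrained decoder would intersect this set with the model's vocabulary at every step.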

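Token Healing Technology

Automatically corrects tokenization artifacts that occur when template text ends mid-token, a common cause of garbled output with standard prompting. Guidance backs up over the end of the prompt and constrains generation to reproduce that text, so the model can choose the tokenization it would naturally use across the boundary.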
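A small illustration may help. The sketch below uses a made-up five-token vocabulary and a simplified one-token backup; it shows the general token-healing idea rather than Guidance's actual code.

```python
def heal(prompt_tokens: list[str], vocab: list[str]) -> tuple[list[str], list[str]]:
    """Drop the last prompt token and list the tokens allowed to replace it."""
    *kept, last = prompt_tokens
    # Any vocabulary token that starts with the removed text reproduces the
    # template text and may extend past it into a more natural token.
    candidates = [tok for tok in vocab if tok.startswith(last)]
    return kept, candidates

# Hypothetical vocabulary in which "://" is a single token.
vocab = ["http", ":", "://", "//", "example"]

# The template text "http:" ends with the token ":", but a model trained on
# URLs would normally have produced "://" as one token at this point.
kept, candidates = heal(["http", ":"], vocab)
print(kept, candidates)
```

With healing, the model chooses between ":" and "://" itself instead of being forced to continue after an unusual token boundary.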

JSON Schema Validation

Generate structured JSON with guaranteed schema compliance using Pydantic models. Supports complex schemas with oneOf, allOf, required properties, min/max constraints, and format validation for production data extraction.
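As a rough sketch of what this looks like in code, the function below appends a schema-constrained JSON object to a model state. It assumes `guidance.json` accepts `name` and `schema` arguments and that the capture comes back as a JSON string; the schema, the prompt wording, and the `extract_person` helper are hypothetical. A plain JSON Schema dict is used instead of a Pydantic model only to keep the example free of extra dependencies.

```python
import json

# Hypothetical schema for illustration.
PERSON_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer", "minimum": 0},
    },
    "required": ["name", "age"],
    "additionalProperties": False,
}

def extract_person(lm, text: str) -> dict:
    """Append a schema-constrained JSON object to `lm` and parse it."""
    # Imported lazily so this file can be loaded without guidance installed.
    from guidance import json as gen_json

    lm += f"Extract the person mentioned in this text: {text}\n"
    lm += gen_json(name="person", schema=PERSON_SCHEMA)
    # Assumption: the captured value is the generated JSON text.
    return json.loads(lm["person"])
```

Here `lm` would be a Guidance model object backed by a local model, which is where the schema guarantee applies.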

Rust-Based llguidance Engine

Constraint processing runs in llguidance, a Rust library that replaced the earlier Python implementation. The switch delivers significant speed improvements and fixes subtle bugs from the earlier implementation.

Advanced Control Flow

Implement conditional generation with if/else blocks and iterative generation with loops. Programs can generate variable-length lists, branch on values the model has just produced, and handle complex multi-step logic.
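A short sketch of branching and looping around generation calls follows. It assumes `gen` and `select` behave as in the Guidance README, with named captures read back via `lm[...]` and `list_append=True` collecting repeated captures. The ticket-triage scenario and the `triage` helper are invented for illustration.

```python
def triage(lm, ticket: str):
    """Classify a ticket, then branch in plain Python on the model's choice."""
    # Imported lazily so this file can be loaded without guidance installed.
    from guidance import gen, select

    lm += f"Ticket: {ticket}\nCategory: "
    lm += select(["bug", "question"], name="category")

    if lm["category"] == "bug":
        # Loop: collect exactly three reproduction steps into a list capture.
        lm += "\nSteps to reproduce:\n"
        for i in range(1, 4):
            lm += f"{i}. " + gen(name="steps", list_append=True,
                                 stop="\n", max_tokens=40) + "\n"
    else:
        lm += "\nShort answer: " + gen(name="answer", stop="\n", max_tokens=60)
    return lm
```

The branch is ordinary Python: only the text inside `gen` and `select` comes from the model, while the surrounding template text is inserted verbatim.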

Multi-Backend Architecture

Unified interface for local models (Transformers, llama.cpp with true constrained generation) and API models (OpenAI GPT-4, Anthropic Claude, Azure OpenAI with optimized prompting strategies).

Enhanced Jupyter Integration

Rich notebook visualization showing token probabilities, backtracking support, generation metrics, real-time model behavior, and dark mode support for interactive development workflows.

Fast-Forwarding Optimization

Grammar constraints often make some tokens predictable in advance. Guidance automatically inserts these tokens without model forward passes, reducing GPU usage and generation latency significantly.
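The effect can be shown with a toy decoder loop. This is a sketch of the general idea, not Guidance's engine: the three-step grammar and the stand-in `model_pick` function are hypothetical. Whenever the grammar leaves exactly one option, the token is appended without calling the model.

```python
from typing import Callable

def generate(
    allowed_next: Callable[[str], set[str]],
    model_pick: Callable[[str, set[str]], str],
    max_steps: int = 32,
) -> tuple[str, int]:
    """Return the generated text and how many times the model was called."""
    text, model_calls = "", 0
    for _ in range(max_steps):
        options = allowed_next(text)
        if not options:            # grammar is complete
            break
        if len(options) == 1:      # forced token: skip the forward pass
            text += next(iter(options))
        else:
            text += model_pick(text, options)
            model_calls += 1
    return text, model_calls

def allowed_next(text: str) -> set[str]:
    """Toy grammar: the output is {"ok": true} or {"ok": false}."""
    if text == "":
        return {'{"ok": '}
    if text == '{"ok": ':
        return {"true", "false"}
    if text in ('{"ok": true', '{"ok": false'):
        return {"}"}
    return set()

def model_pick(text: str, options: set[str]) -> str:
    """Stand-in for a real model: picks the alphabetically first option."""
    return sorted(options)[0]

text, model_calls = generate(allowed_next, model_pick)
print(text, model_calls)
```

Three pieces of text are emitted, but only the true/false choice costs a model call.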

Composable Guidance Functions

Create reusable @guidance-decorated functions that can be composed into complex grammars. Build libraries of generation patterns for specific domains like HTML generation, form filling, or structured interviews.

Pricing Plans

Open Source

Free (MIT)


Getting Started with Guidance

1. Install Guidance via pip install guidance and set up your preferred model backend (Transformers for local models, OpenAI API key for GPT-4, or Anthropic API key for Claude)
2. Start with the basic tutorial: create a simple template using gen() for text generation and select() for constrained choices, then run it in a Jupyter notebook to see the interactive visualization
3. Practice with JSON schema generation by defining a Pydantic model and using guidance.json() to generate structured data that validates against your schema
4. Explore the examples repository on GitHub for real-world use cases like HTML generation, structured interviews, and multi-step reasoning pipelines to understand advanced patterns
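The first two steps above might look roughly like the sketch below after running pip install guidance with the Transformers backend installed. It assumes `models.Transformers`, `gen`, and `select` as they appear in the Guidance README. The model name "gpt2", the prompt, and the `quick_start` wrapper are placeholders chosen for illustration, and calling the function downloads the model.

```python
def quick_start(model_name: str = "gpt2") -> tuple[str, str]:
    """Load a local model, make one constrained choice, then generate freely."""
    # Imported lazily so this file can be loaded without guidance installed.
    from guidance import gen, models, select

    lm = models.Transformers(model_name)  # placeholder model name

    # Template text is inserted verbatim; the model only fills the slots.
    lm += "Q: Is the sky blue on a clear day?\nA: "
    lm += select(["yes", "no"], name="answer")  # constrained choice
    lm += "\nOne-sentence reason: " + gen(name="reason", stop="\n", max_tokens=30)

    # Named captures are read back from the model state.
    return lm["answer"], lm["reason"]
```

Run in a Jupyter notebook, the same calls also drive the interactive visualization mentioned in step 2.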


Best Use Cases

Guaranteed-valid JSON or YAML for downstream parsers and APIs

Domain-specific languages (DSLs) where output must satisfy a grammar

Function-calling and tool-use on local open-weight models

High-throughput batch inference where retries are too expensive

Integration Ecosystem

9 integrations

Guidance works with these platforms and services:

LLM Providers: OpenAI, Anthropic, Azure, local/transformers, local/llamacpp

Cloud Platforms: Azure

Other: GitHub, Jupyter, Pydantic

Limitations & What It Can't Do

We believe in transparent reviews. Here's what Guidance doesn't handle well:

⚠ Requires learning specialized Guidance syntax that doesn't transfer to other LLM frameworks or traditional programming environments
⚠ Full constraint capabilities including logit masking only work with locally deployed models, not API-based services like OpenAI or Anthropic
⚠ Lacks built-in integrations with popular vector databases like Pinecone, Weaviate, or ChromaDB for retrieval-augmented generation workflows
⚠ Debugging complex grammar programs is challenging when generation behavior deviates from expected patterns, especially with nested control flow
⚠ Smaller ecosystem compared to LangChain or LlamaIndex results in fewer community contributions, tutorials, and third-party extensions
⚠ No native support for tool calling or function execution, requiring custom integration for agent-like capabilities
⚠ Microsoft's historical development patterns show periods of rapid updates followed by slower maintenance cycles, creating uncertainty for long-term projects

Pros & Cons

✓ Pros

✓ Provable structural guarantees: invalid JSON and grammar violations become impossible by construction
✓ Faster than retry-based structured output because invalid tokens are never sampled
✓ Free and MIT-licensed, with an active independent community after the Microsoft Research origin
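To put the retry comparison in numbers: if each unconstrained call independently yields valid output with probability p, the number of calls until the first success follows a geometric distribution with mean 1/p. The rates below are hypothetical, not measurements of any model.

```python
def expected_attempts(p_valid: float) -> float:
    """Mean number of calls until the first valid output (geometric mean 1/p)."""
    if not 0 < p_valid <= 1:
        raise ValueError("p_valid must be in (0, 1]")
    return 1 / p_valid

# Hypothetical per-call validity rates for an unconstrained model.
for p in (0.95, 0.8, 0.5):
    print(f"p={p}: {expected_attempts(p):.2f} calls on average")

# Constrained decoding is valid by construction, i.e. p = 1.
print(expected_attempts(1.0))
```

The gap widens in batch workloads, where every failed call also pays for the full prompt again.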

✗ Cons

✗ Full constraint enforcement requires logit access, so hosted-only APIs (OpenAI, Anthropic) get a watered-down experience
✗ Higher learning curve than Instructor for developers who just want Pydantic-validated outputs
✗ Local-model deployments inherit all the operational pain of running your own GPU inference

Frequently Asked Questions

How does Guidance differ from traditional prompt engineering approaches?

Traditional prompting sends text to a model and hopes it formats the response correctly, then parses the output with error-prone string manipulation. Guidance programs specify exactly where the model generates text and what constraints apply, with template text guaranteed verbatim and generation happening only in specified slots. This eliminates format parsing issues entirely.

What are the major improvements in Guidance 2026?

The Rust-based llguidance grammar engine replaced the Python implementation with faster constraint processing and bug fixes. Other updates include expanded JSON schema coverage with oneOf/allOf support, rewritten Jupyter notebook visualization with token probabilities and backtracking, Python 3.14 compatibility, and support for Phi-4 models.

Can Guidance work with OpenAI's GPT-4 and other API-based models?

Yes, Guidance supports OpenAI's chat and completion APIs, Anthropic Claude, and Azure OpenAI through optimized prompting strategies. True constrained generation with logit masking only works with local models, but API models use intelligent prompting and output validation while maintaining the same programming interface.
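The review does not describe how that validation is implemented, so the following is a generic validate-and-retry loop of the kind needed whenever logits are unavailable, not Guidance's own code. The `ask` callable stands in for any hosted-model call, and the canned replies are invented for the demonstration.

```python
import json
from typing import Callable

def call_with_validation(
    ask: Callable[[str], str],
    prompt: str,
    required: set[str],
    max_tries: int = 3,
) -> dict:
    """Call `ask` until it returns a JSON object containing the required keys."""
    for _ in range(max_tries):
        raw = ask(prompt)
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # not JSON at all: try again
        if isinstance(data, dict) and required.issubset(data):
            return data
    raise RuntimeError(f"no valid output after {max_tries} attempts")

# Hypothetical replies: the first wraps the JSON in chatty prose and fails.
replies = iter([
    'Sure! Here is the JSON: {"name": "Ada"}',
    '{"name": "Ada", "age": 36}',
])
result = call_with_validation(
    lambda _prompt: next(replies), "Extract the person.", {"name", "age"}
)
print(result)
```

Each failed attempt costs a full API call, which is the overhead that token-level masking on local models avoids.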

How does Guidance compare to Instructor, Outlines, and other structured output frameworks?

Guidance provides a full programming language for generation control with conditional logic, loops, and multi-step composition. Instructor focuses specifically on structured output via Pydantic models. Outlines specializes in grammar-constrained generation but has narrower model support. Marvin emphasizes simplicity but lacks Guidance's performance optimizations and advanced control flow.

What is token healing and why is it important?

Token healing corrects tokenization artifacts that occur when template text ends mid-token. Standard LLM approaches often produce garbled output in these situations. Guidance automatically detects and heals these boundary issues, ensuring clean transitions between fixed template text and generated content - a critical feature for production reliability.

🔒 Security & Compliance

SOC2: Unknown
GDPR: Unknown
HIPAA: Unknown
SSO: Unknown
Self-Hosted: Yes
On-Prem: Yes
RBAC: Unknown
Audit Log: Unknown
API Key Auth: Unknown
Open Source: Yes
Encryption at Rest: Unknown
Encryption in Transit: Unknown

Data Retention: configurable

Data Residency: configurable


What's New in 2026

The Rust-based llguidance grammar engine replaced the Python implementation, bringing faster constraint processing and bug fixes. JSON schema coverage expanded to include oneOf, allOf, and format validation. The Jupyter notebook visualization was rewritten with token probabilities, backtracking support, and improved dark mode. Python 3.14 compatibility and Phi-4 model support were added, along with performance optimizations throughout the framework.

Alternatives to Guidance

Outlines

AI Agent Builders

Grammar-constrained generation for deterministic model outputs.

LangChain

AI Agent Builders

The industry-standard framework for building production-ready LLM applications with comprehensive tool integration, agent orchestration, and enterprise observability through LangSmith.
