A programming language for controlling large language models with constrained generation and structured output guarantees
Control exactly how AI models generate text by enforcing structure rules during output, guaranteeing valid JSON, categories, or formatted data without retry loops
Guidance is a free, open-source Python library and constrained generation framework from Microsoft Research, belonging to the family of AI agent builder and structured output tools. It gives developers deterministic control over large language model output by enforcing JSON schemas, regex patterns, and context-free grammars at the token level during generation. The library is completely free, with no paid tiers, usage fees, or API costs.
With over 19,000 GitHub stars and more than 100 contributors, Guidance has become one of the most widely adopted structured output libraries in the LLM ecosystem. The project averages over 50,000 monthly downloads on PyPI and has accumulated more than 3,500 forks on GitHub, reflecting strong community interest across research and production use cases. Originally released in 2023, Guidance has been under continuous development with frequent releases and an active issue tracker.
Unlike traditional prompting approaches where developers send text to a model and hope it responds in the correct format, Guidance interleaves fixed template text with constrained generation steps. This means the model can only produce tokens that satisfy the specified constraints — whether those are regex patterns, JSON schemas with full support for oneOf, allOf, anyOf, and recursive definitions, or arbitrary context-free grammars. Invalid output is structurally impossible, not merely unlikely.
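The mechanism can be sketched in a few lines of plain Python. This is a toy illustration of the idea, not the Guidance API: a "model" proposes characters, but a mask vetoes any character that would not keep the output a prefix of an allowed option (the select()-style case; Guidance generalizes the same masking to regexes, JSON schemas, and full context-free grammars at the token level).

```python
# Toy illustration of constrained decoding (not the Guidance API):
# only characters that extend a valid prefix of some option are
# ever allowed, so invalid output is structurally impossible.
def constrained_pick(options, score):
    """Build an output one character at a time, restricted to
    characters that keep the text a prefix of some option."""
    out = ""
    while out not in options:
        allowed = {o[len(out)] for o in options
                   if o.startswith(out) and len(o) > len(out)}
        # The model's preference (score) only breaks ties among
        # characters the mask already permits.
        out += max(allowed, key=score)
    return out

# Even a model that loves "z" cannot escape the option set:
label = constrained_pick({"positive", "negative", "neutral"},
                         score=lambda ch: ch == "z")
print(label)  # always one of the three options
```

However strongly the scoring function prefers characters outside the grammar, the mask never exposes them, which is the sense in which invalid output is impossible rather than merely unlikely.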
At the core of Guidance is llguidance, a high-performance grammar engine written in Rust. The engine validates and masks tokens in real time, ensuring that every generated token conforms to the specified grammar. Its Rust implementation is substantially faster than pure-Python alternatives, handling complex nested schemas and large grammars with minimal latency impact.
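Per-step token masking, the mechanism llguidance implements, can be sketched as follows. This is our illustration, not llguidance's code; the grammar is a deliberately tiny one (a JSON-style array of single digits, e.g. "[3,7]"), and the vocabulary is single characters rather than real tokenizer tokens.

```python
# Toy sketch of per-step token masking: grammar-violating next
# tokens get their logits set to -inf before sampling.
import math

VOCAB = list("0123456789,[]")

def is_valid_prefix(s):
    """Hand-rolled prefix check for the digit-array grammar."""
    state = "open"            # expect '['
    for ch in s:
        if state == "open" and ch == "[":
            state = "digit"   # expect a digit
        elif state == "digit" and ch.isdigit():
            state = "sep"     # expect ',' or ']'
        elif state == "sep" and ch == ",":
            state = "digit"
        elif state == "sep" and ch == "]":
            state = "done"    # array closed; nothing more allowed
        else:
            return False
    return True

def mask_logits(logits, text):
    """Set the logit of every grammar-violating next token to -inf."""
    return [logit if is_valid_prefix(text + tok) else -math.inf
            for tok, logit in zip(VOCAB, logits)]

masked = mask_logits([0.0] * len(VOCAB), "[3")
allowed = [tok for tok, logit in zip(VOCAB, masked) if logit != -math.inf]
print(allowed)  # after "[3", only "," and "]" survive the mask
```

The real engine does the same validate-and-mask step against arbitrary grammars over a tokenizer's full vocabulary, which is where the Rust implementation's speed matters.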
Guidance supports multiple model backends through a unified programming interface. Developers can use OpenAI GPT models, Azure OpenAI deployments, local models via Hugging Face Transformers or llama.cpp, and experimental Anthropic model support. The strongest structural guarantees are available with local model backends where Guidance can directly control token sampling and masking. Cloud API backends leverage provider-specific structured output features but may have some limitations compared to local execution.
Key programming primitives include gen() for constrained text generation, select() for forcing output to match one of predefined options, and the @guidance decorator for packaging reusable generation patterns as composable Python functions. Token healing automatically corrects tokenization boundary artifacts where template text meets generated content. Conditional logic with if/else blocks and loop constructs enables dynamic multi-step generation pipelines.
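Token healing is the least obvious of these primitives, so here is a toy sketch of the concept (Guidance performs this automatically at template/generation boundaries; the function below is our own illustration, not the library's code). The last prompt token is backed out, and the next generated token is constrained to start with the removed text, so natural merges like "http" + "://" are not blocked by an awkward token boundary.

```python
# Toy sketch of token healing: back out the boundary token, then
# require the next candidate token to re-cover the removed text.
def heal_and_generate(prompt_tokens, ranked_candidates):
    healed = prompt_tokens[-1]           # back out the boundary token
    context = prompt_tokens[:-1]
    for tok in ranked_candidates:        # best-first candidate tokens
        if tok.startswith(healed):       # must re-cover the removed text
            return context + [tok]
    return prompt_tokens                 # no candidate fits; keep as-is

tokens = heal_and_generate(["The", " link", " is", " http"],
                           [" world", " http://", " https"])
print("".join(tokens))  # The link is http://
```

Without healing, the model would be forced to continue from an unnatural tokenization of "http", which is the boundary artifact the feature prevents.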
Guidance integrates naturally with Jupyter notebooks, providing token-level highlighting and visualization for debugging and rapid prototyping. The Mock model allows developers to validate grammar constraints and test generation pipelines without making any real LLM API calls, reducing development costs and iteration time. These features make Guidance particularly valuable for research teams and developers who need to experiment with complex generation strategies before deploying to production.
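The value of a mock backend can be sketched with a plain-Python stand-in. The class and method names below are ours, not Guidance's API; the point is that canned completions are still checked against the constraints the pipeline declares, so a grammar bug fails fast in tests with no API calls and no GPU.

```python
# Toy stand-in for offline pipeline testing (the idea behind a
# mock model backend; names here are hypothetical).
import re

class FakeModel:
    """Returns canned completions, but validates each one against
    the declared constraint before handing it back."""
    def __init__(self, canned):
        self.canned = list(canned)

    def generate(self, prompt, pattern):
        out = self.canned.pop(0)
        if not re.fullmatch(pattern, out):
            raise ValueError(f"{out!r} violates constraint {pattern!r}")
        return out

fake = FakeModel(["42", "yes"])
print(fake.generate("How many steps? ", r"\d+"))   # 42
print(fake.generate("Proceed? ", r"yes|no"))       # yes
```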
The library is MIT licensed with zero telemetry, no usage tracking, and complete source code available for audit. When used with local model backends, all data processing occurs entirely on-premises with no external network calls, making it suitable for sensitive or regulated environments.
Guidance from Microsoft Research is a powerful Python library for developers who need guaranteed structured output from LLMs. It excels at enforcing JSON schemas, regex patterns, and grammars at the token level, particularly with local model backends. The learning curve is steeper than simpler alternatives, but the guarantees it provides make it ideal for production systems where output validity is non-negotiable.
Interleave fixed template text with constrained generation steps, giving developers precise control over output structure while letting the model fill in variable content
Force generation to only produce tokens that satisfy regex patterns, JSON schemas, or context-free grammars — invalid output is structurally impossible
High-performance constraint processing engine written in Rust that efficiently validates and masks tokens during generation
Generate structured JSON guaranteed to conform to complex schemas including oneOf, allOf, anyOf, recursive definitions, and nested objects
Automatically corrects tokenization boundary artifacts where template text meets generated content, preventing subtle formatting errors
Control flow with if/else blocks and loop constructs for dynamic multi-step generation pipelines
Unified programming interface across OpenAI, Azure, local Transformers, and llama.cpp backends, with experimental Anthropic support
Package reusable generation patterns as decorated Python functions that can be composed into complex pipelines
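The composition pattern in the last item can be sketched in plain Python. This is only an illustration of the idea behind decorated, composable steps, not the library's @guidance decorator: each step takes the running text state and returns the new state, so steps chain into multi-stage pipelines.

```python
# Toy sketch of composable generation steps (plain Python, not the
# Guidance API). Each step: text state in, new text state out.
def ask(state, question):
    return state + f"Q: {question}\nA: "

def choose(state, options):
    # Stand-in for constrained generation: deterministically pick
    # the first option a real model would be masked down to.
    return state + options[0] + "\n"

state = ""
state = ask(state, "Is the sky blue?")
state = choose(state, ["yes", "no"])
print(state)  # Q: Is the sky blue?\nA: yes
```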
Pricing: the library itself is free. Model costs depend on the backend: pay-per-use via OpenAI or Azure OpenAI for cloud models, and free for local models (hardware costs only).
Through 2025 and into 2026, Guidance has continued to mature its Rust-based grammar engine (llguidance) with improved JSON schema support, better error reporting, and expanded model backend compatibility. The library has seen performance improvements in constraint processing and broader community adoption for structured output use cases.