Extract structured, validated data from any LLM using Pydantic models with automatic retries and multi-provider support. Most popular Python library with 3M+ monthly downloads and 11K+ GitHub stars.
Makes AI return structured, validated data instead of messy text — perfect when you need reliable, typed responses from any AI model.
Instructor is the most popular Python library for extracting structured, validated data from Large Language Models, transforming unreliable text outputs into type-safe Python objects through Pydantic model definitions. With over 3 million monthly downloads and 11,000+ GitHub stars, it has become the de facto standard for reliable LLM output processing in production applications.\n\nBuilt on Pydantic's validation framework, Instructor patches LLM client libraries to add a responsemodel parameter that defines the expected output structure. When you call client.create(responsemodel=MyModel, ...), Instructor automatically handles function-calling schema generation, response parsing, validation, and intelligent retry logic when the LLM output doesn't match the specified schema.\n\nThe library's core innovation lies in its automatic retry mechanism with validation feedback. When Pydantic validation fails, Instructor feeds specific error messages back to the LLM and retries the request. This feedback loop enables models to self-correct, achieving 99%+ success rates even with complex schemas that would otherwise fail frequently.\n\nInstructor supports 15+ LLM providers through its unified fromprovider() interface, including OpenAI, Anthropic, Google Gemini, Mistral, Cohere, DeepSeek, Ollama, and local models. This provider-agnostic approach prevents vendor lock-in and enables easy A/B testing across different models for specific extraction tasks without code changes.\n\nAdvanced features include streaming partial objects where Pydantic fields populate incrementally as the LLM generates tokens, iterable responses for extracting lists of objects, union types for classification tasks, and custom validators with arbitrary logic. Multiple extraction modes (TOOLS, JSON, MDJSON, PARALLEL) optimize for different model capabilities and use cases.\n\nThe library's focused scope as an extraction tool rather than a full agent framework is intentional. Instructor excels at the specific problem of getting reliable structured data from single LLM calls without the complexity of agent loops, tool calling, or conversation management. For complete agent workflows, the Instructor team recommends complementary tools like PydanticAI.\n\nCompared to alternatives, Instructor sits between raw function calling (which requires manual JSON parsing and error handling) and heavy agent frameworks. It provides more reliability than raw OpenAI function calls through validation and retries, but remains simpler than LangChain or other comprehensive frameworks by focusing solely on structured extraction.\n\nInstructor has expanded beyond Python with official ports to TypeScript, Go, Ruby, Elixir, and Rust, maintaining consistent APIs across languages. This multi-language support enables teams to use the same extraction patterns across different technology stacks while preserving the benefits of type safety and validation.\n\nCompanies using Instructor include teams at OpenAI, Google, Microsoft, AWS, and numerous Y Combinator startups. The library's production-ready status is evidenced by its extensive test suite, comprehensive documentation, and active community of 100+ contributors maintaining integrations and examples.
Was this helpful?
Instructor is the gold standard for structured LLM output extraction, with 3M+ monthly downloads and support for 15+ providers. Using Pydantic models for validation and automatic retry logic, it turns unreliable LLM text into guaranteed typed Python objects. Essential for any production system that needs reliable, structured responses from LLMs.
Define output structure as Pydantic models with typed fields, descriptions, and validators. Instructor converts these to function-calling schemas and returns validated Python objects automatically.
When Pydantic validation fails, Instructor provides specific error messages to the LLM and retries. Models receive context about validation failures and can self-correct, achieving 99%+ success rates.
Unified from_provider() interface works with OpenAI, Anthropic, Google, Cohere, Mistral, DeepSeek, Ollama, and 10+ more providers. Switch providers without code changes for easy A/B testing and cost optimization.
Get incremental Pydantic model updates as the LLM generates tokens. Fields populate progressively, enabling real-time UIs that show structured data appearing as extraction progresses.
TOOLS mode uses native function calling for maximum reliability, JSON mode forces JSON output for weaker models, MD_JSON extracts from markdown blocks, and PARALLEL extracts multiple objects simultaneously.
Use Union types to let the LLM select the appropriate Pydantic model for classification tasks. Supports discriminated unions and automatic routing based on input content analysis.
Free
Ready to get started with Instructor?
View Pricing Options →Instructor works with these platforms and services:
We believe in transparent reviews. Here's what Instructor doesn't handle well:
Weekly insights on the latest AI tools, features, and trends delivered to your inbox.
In 2025-2026, Instructor introduced from_provider() for automatic multi-provider detection, expanded to 15+ LLM providers including DeepSeek, surpassed 3 million monthly downloads, and clarified its positioning as complementary to PydanticAI — Instructor for extraction, PydanticAI for agent workflows. Ports now available in TypeScript, Go, Ruby, Elixir, and Rust.
AI Agent Builders
Grammar-constrained generation for deterministic model outputs.
AI Agent Builders
A programming language for controlling large language models with constrained generation and structured output guarantees
No reviews yet. Be the first to share your experience!
Get started with Instructor and see if it's the right fit for your needs.
Get Started →Take our 60-second quiz to get personalized tool recommendations
Find Your Perfect AI Stack →Explore 20 ready-to-deploy AI agent templates for sales, support, dev, research, and operations.
Browse Agent Templates →