IBM-backed open-source document parsing toolkit that converts PDFs, DOCX, PPTX, images, audio, and more into structured formats for RAG pipelines and AI agent workflows.
An open-source tool from IBM that converts documents into AI-ready formats — handles PDFs, presentations, and more.
Docling is an open-source document processing toolkit, originally developed by IBM Research, that converts documents from a wide range of formats into clean, structured representations ready for AI consumption. With permissive MIT licensing of the codebase, local execution, and integrations with the major AI frameworks, it has become one of the most practical tools for teams building RAG systems and document-understanding agents.
Format Coverage That Actually Matters
Docling handles the formats teams actually encounter: PDF (including scanned), DOCX, PPTX, XLSX, HTML, LaTeX, images (PNG, JPEG, TIFF), and even audio files (WAV, MP3) via automatic speech recognition. Recent releases added WebVTT caption parsing, XBRL financial reports, and USPTO patent documents. This breadth means you don't need a separate parser for each document type: Docling normalizes everything into its unified DoclingDocument format.
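As a rough illustration of the single-entry-point idea, the sketch below uses Docling's DocumentConverter to turn one input into Markdown. The suffix set and the looks_supported helper are hypothetical conveniences built from the formats named above, not Docling's own format registry, and audio inputs may need extra dependencies installed.

```python
from pathlib import Path

# Illustrative only: suffixes derived from the formats named in the text.
# This is an assumption for pre-filtering, not Docling's authoritative list.
LIKELY_SUPPORTED = {
    ".pdf", ".docx", ".pptx", ".xlsx", ".html", ".tex",
    ".png", ".jpg", ".jpeg", ".tiff", ".wav", ".mp3",
}


def looks_supported(path: str) -> bool:
    """Cheap pre-check on the file suffix (hypothetical helper)."""
    return Path(path).suffix.lower() in LIKELY_SUPPORTED


def convert_to_markdown(source: str) -> str:
    """Convert a local path or URL to Markdown with Docling."""
    # Imported lazily so the pre-check above works without Docling installed.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    result = converter.convert(source)
    # result.document is the unified DoclingDocument representation.
    return result.document.export_to_markdown()
```

A caller would typically run looks_supported first and pass only matching files to convert_to_markdown.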
Advanced PDF Understanding
PDF parsing is where Docling most clearly separates itself from simpler tools like PyPDF or pdfplumber. The Heron layout model (released December 2025) provides faster parsing while accurately detecting page layout, reading order, table structures, code blocks, and mathematical formulas, and classifying images. It handles multi-column layouts, headers and footers, and complex nested tables that break most other parsers. For OCR on scanned documents, Docling integrates multiple OCR engines and also supports IBM's Granite-Docling-258M, a 258M-parameter vision-language model purpose-built for document-to-text conversion that preserves complex layouts in a single inference pass.
Structured Output Formats
Every parsed document converts to the DoclingDocument unified representation, which you can then export as Markdown, HTML, JSON (lossless), WebVTT, or DocTags. The JSON export preserves the full document structure (headings, paragraphs, tables, lists, figures) with coordinates and reading order metadata. This is critical for RAG systems where chunk boundaries and document structure affect retrieval quality. See our guide on building effective RAG systems for why document structure matters.
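The export step described above might look like the following minimal sketch. It assumes a DoclingDocument obtained from a prior conversion and relies on its export_to_markdown, export_to_html, and export_to_dict methods. The render_exports name and the returned dictionary keys are hypothetical, chosen for illustration.

```python
import json


def render_exports(doc) -> dict:
    """Render one converted document into three export formats.

    `doc` is expected to be a DoclingDocument, for example the
    `result.document` returned by DocumentConverter().convert(...).
    """
    return {
        "md": doc.export_to_markdown(),
        "html": doc.export_to_html(),
        # export_to_dict() yields the structured form; serialize it as JSON.
        "json": json.dumps(doc.export_to_dict(), ensure_ascii=False),
    }
```

Keeping the JSON form next to the Markdown is a common choice when a downstream indexing step needs structure or coordinates that Markdown drops.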
AI Framework Integrations
Docling provides plug-and-play integrations with LangChain, LlamaIndex, CrewAI, and Haystack. These aren't thin wrappers; they're maintained connectors that feed parsed documents directly into each framework's document loaders and chunking pipelines. The MCP server integration (added in 2025) lets any MCP-compatible AI agent use Docling as a document parsing tool, making it accessible from Claude, Cursor, and other MCP clients.
Local Execution and Privacy
Unlike cloud-based document AI services from Google or Azure, Docling runs entirely locally. Install with pip install docling and process sensitive documents without sending data to any external server. This is essential for healthcare, legal, and financial teams with strict data governance requirements. The CLI makes batch processing straightforward for pipeline automation.
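For teams that prefer to script batch jobs in Python rather than call the CLI, a local batch run could be sketched as below. It uses DocumentConverter.convert_all with raises_on_error=False so one bad file does not stop the batch. The plan_outputs and convert_folder helpers are hypothetical names, and the sketch assumes input files have unique stems.

```python
from pathlib import Path


def plan_outputs(inputs, out_dir):
    """Map each input file to the Markdown path it would be written to."""
    out = Path(out_dir)
    return {Path(p): out / (Path(p).stem + ".md") for p in inputs}


def convert_folder(inputs, out_dir):
    """Convert files locally and return the inputs that failed."""
    # Lazy imports keep plan_outputs usable without Docling installed.
    from docling.datamodel.base_models import ConversionStatus
    from docling.document_converter import DocumentConverter

    plan = plan_outputs(inputs, out_dir)
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    converter = DocumentConverter()
    failed = []
    # raises_on_error=False lets the batch continue past a bad file.
    for result in converter.convert_all(list(plan), raises_on_error=False):
        src = result.input.file
        if result.status == ConversionStatus.SUCCESS:
            target = Path(out_dir) / (src.stem + ".md")
            target.write_text(
                result.document.export_to_markdown(), encoding="utf-8"
            )
        else:
            failed.append(src)
    return failed
```

Returning the failed inputs, instead of raising, makes it easier to retry or log them in an automated pipeline.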
Recent releases added rich metadata extraction capabilities including document language detection, page-level bounding boxes for every element, confidence scores on OCR results, and hierarchical section labeling. The TableFormer model achieves over 90% F1 on complex table structure recognition benchmarks (PubTabNet, FinTabNet), making it among the best open-source options for extracting structured data from tables embedded in PDFs. Docling's chunking utilities — HybridChunker and HierarchicalChunker — leverage this metadata to split documents at semantically meaningful boundaries rather than arbitrary token counts, which measurably improves retrieval precision in RAG systems.
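A minimal sketch of structure-aware chunking with the HybridChunker mentioned above follows. It assumes a DoclingDocument from a prior conversion and the chunker's default tokenizer settings. The chunk_records helper and its record layout are hypothetical, and the optional chunker argument exists only so a different chunker can be swapped in.

```python
def chunk_records(doc, chunker=None):
    """Split a DoclingDocument into records ready for embedding.

    Each record carries the context-enriched text plus the section
    headings the chunk falls under, when that metadata is available.
    """
    if chunker is None:
        # Lazy import; the default tokenizer is an assumption and should
        # be aligned with whatever embedding model is actually used.
        from docling.chunking import HybridChunker

        chunker = HybridChunker()

    records = []
    for chunk in chunker.chunk(dl_doc=doc):
        records.append(
            {
                # contextualize() prepends heading context to the chunk text.
                "text": chunker.contextualize(chunk=chunk),
                "headings": list(getattr(chunk.meta, "headings", None) or []),
            }
        )
    return records
```

Storing the headings alongside the text lets a retriever filter or re-rank by section, which is one way the document structure can feed into retrieval.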
Performance and Scale
On GPU hardware (e.g., a single NVIDIA A100), Docling processes approximately 10–15 pages per second for standard layout analysis, and 3–5 pages per second when the full VLM pipeline is engaged. CPU-only throughput is roughly 5–10× slower depending on document complexity. The project's GitHub repository has accumulated over 18,000 stars since its public release, reflecting strong community adoption. The SmolDocling model variant (released in early 2025) brought the VLM footprint down to roughly 256M parameters while maintaining competitive accuracy, making GPU requirements more accessible for smaller teams.
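To turn these throughput figures into a capacity estimate, a back-of-the-envelope calculation like the one below can help. The rates are the approximate numbers quoted above, not measurements, and the function names are illustrative.

```python
def estimate_seconds(pages: int, pages_per_second: float) -> float:
    """Time to process `pages` at a steady rate, in seconds."""
    if pages_per_second <= 0:
        raise ValueError("pages_per_second must be positive")
    return pages / pages_per_second


def time_range(pages: int, slow_pps: float, fast_pps: float):
    """Return (best_case, worst_case) seconds for a rate range."""
    return (
        estimate_seconds(pages, fast_pps),
        estimate_seconds(pages, slow_pps),
    )


# Approximate rates quoted in the text, in pages per second.
GPU_LAYOUT = (10, 15)
GPU_VLM = (3, 5)

if __name__ == "__main__":
    pages = 10_000
    for label, (slow, fast) in {"layout": GPU_LAYOUT, "vlm": GPU_VLM}.items():
        best, worst = time_range(pages, slow, fast)
        print(f"{label}: {best / 60:.1f} to {worst / 60:.1f} minutes")
```

At the quoted layout-analysis rates, 10,000 pages works out to roughly 11 to 17 minutes on a single GPU. Applying the 5–10× CPU slowdown stretches that to somewhere between about one and three hours.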
Docling from IBM Research provides accurate, modular document conversion with particular strength in scientific and technical documents. The layout analysis and table extraction capabilities are excellent for academic papers, reports, and structured documents. Being open-source and self-hostable is a significant advantage for data-sensitive organizations. The processing speed is slower than simpler parsers, and the focus on structured documents means it's less suited for highly visual or creative document formats.
Pricing: Free
Through late 2025 and into 2026 the project expanded well beyond its original PDF focus. Notable additions include audio file ingestion with transcription, a Model Context Protocol (MCP) server so MCP-compatible agents and IDEs can call Docling as a tool, and tighter integration with IBM's Granite-Docling and the compact SmolDocling vision-language models for image-first document understanding. The project also moved under the LF AI & Data Foundation umbrella as docling-project, broadening governance beyond IBM, and continued to add ecosystem integrations (CrewAI, Haystack, txtai) alongside maturing the layout-aware HybridChunker for RAG.