High-performance open-source tool that converts PDFs, images, PPTX, DOCX, and other documents to clean markdown, JSON, or HTML with deep learning-powered layout detection.
Converts PDFs and documents to clean markdown or JSON — fast, accurate, handles tables, equations, and complex layouts with AI.
Marker is an open-source document conversion tool built by DataLab (Vik Paruchuri) that converts PDFs, images, PPTX, DOCX, XLSX, HTML, and EPUB files into clean markdown, JSON, chunks, or HTML. It combines deep learning models for layout detection, OCR, table recognition, and equation detection into a single pipeline optimized for producing high-fidelity structured output from complex documents.
Marker's pipeline uses Surya for OCR and layout detection, identifying document regions like text blocks, headers, tables, figures, equations, code blocks, and page artifacts. Each region gets appropriate extraction — text is OCR'd, tables are structured, equations are converted to LaTeX, and images are extracted and saved separately. The output preserves document hierarchy with proper heading levels, formatted markdown tables, and reading order that handles multi-column layouts.
The tool now supports multiple output formats beyond markdown. JSON output provides a structured representation of the document with typed elements, and chunked output pre-segments documents for RAG pipelines. An optional LLM-enhanced mode (the --use_llm flag) pairs Marker with Gemini, Claude, OpenAI, or Ollama models to improve table formatting, handle inline math, merge tables across pages, and extract form values. The project's published benchmarks report that the LLM-enhanced mode outperforms both Marker alone and standalone LLM extraction.
Performance is strong: the project cites a projected throughput of 25 pages/second on an H100 GPU in batch mode. The tool runs on GPU, CPU, or Apple MPS, though a GPU is strongly recommended for any non-trivial workload. Memory requirements are moderate, at approximately 2-4 GB to load the deep learning models.
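As a rough back-of-the-envelope check on that figure, the sketch below estimates batch wall-clock time from a page count and an assumed sustained throughput. The 25 pages/second default is the projected H100 batch number quoted above; the function name and the example corpus size are made up for illustration, and real throughput will vary with hardware, document complexity, and how much OCR is needed.

```python
def estimate_batch_seconds(total_pages: int, pages_per_second: float = 25.0) -> float:
    """Estimate wall-clock seconds to convert `total_pages` at a steady rate.

    The default rate is the projected H100 batch throughput quoted in the text;
    treat it as an optimistic planning number, not a guarantee.
    """
    if total_pages < 0:
        raise ValueError("total_pages must be non-negative")
    if pages_per_second <= 0:
        raise ValueError("pages_per_second must be positive")
    return total_pages / pages_per_second


# Hypothetical corpus: 1,000 PDFs averaging 50 pages each.
seconds = estimate_batch_seconds(1_000 * 50)
print(f"{seconds:.0f} s (~{seconds / 60:.1f} min)")  # prints "2000 s (~33.3 min)"
```

The same function can be used to sanity-check other quoted figures: 250 pages in about 15 seconds corresponds to roughly 16.7 pages/second.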
Marker is available both as open-source software and as a managed API through DataLab. The code is GPL-licensed, and the model weights use a modified AI Pubs Open Rail-M license that is free for research, personal use, and startups under $2M in funding or revenue. According to DataLab, the managed API processes documents at one quarter the price of competing cloud services, with 99.99% uptime and approximately 15-second processing for a 250-page PDF.
For teams building RAG knowledge bases, search indexes, or documentation sites from document collections, Marker produces significantly cleaner output than basic text extraction tools. Its combination of layout detection, OCR, table recognition, equation handling, and extensible post-processing in a single pipeline is hard to match.
Marker is the leading open-source document conversion tool, combining deep learning layout detection with high-quality OCR to produce clean markdown, JSON, or HTML from complex documents. Its LLM-enhanced mode pushes accuracy beyond what either traditional extraction or standalone LLMs achieve. The managed API provides a cost-effective production option. Main limitations are GPL licensing restrictions and the practical need for GPU hardware for batch workloads.
Uses Surya models for detecting document regions: text blocks, headers, tables, figures, equations, code blocks, page headers, and footers. Handles multi-column layouts and complex page structures with reading order detection.
Use Case:
Converting a two-column research paper into single-column markdown with correct reading order and section hierarchy.
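To make "reading order" concrete, here is a toy geometric heuristic for a strict two-column page. It is not how Surya works (Surya uses learned models); it only illustrates the problem being solved. The `Block` type, its fields, and the coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Block:
    text: str
    x0: float  # left edge of the block
    x1: float  # right edge of the block
    y0: float  # top edge (smaller values are higher on the page)


def two_column_reading_order(blocks: list[Block], page_width: float) -> list[Block]:
    """Order blocks left column first (top to bottom), then right column.

    A block is assigned to a column by the horizontal position of its center.
    """
    mid = page_width / 2
    left = [b for b in blocks if (b.x0 + b.x1) / 2 < mid]
    right = [b for b in blocks if (b.x0 + b.x1) / 2 >= mid]
    return sorted(left, key=lambda b: b.y0) + sorted(right, key=lambda b: b.y0)


# Blocks listed in the scrambled order a naive extractor might emit.
page = [
    Block("R1", 320, 580, 100),
    Block("L2", 30, 290, 400),
    Block("L1", 30, 290, 100),
    Block("R2", 320, 580, 400),
]
print([b.text for b in two_column_reading_order(page, page_width=612)])
# prints ['L1', 'L2', 'R1', 'R2']
```

This heuristic breaks on full-width titles, figures that span both columns, and pages with three or more columns, which is the kind of case a learned layout model is meant to handle.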
Integrated Surya OCR engine optimized for document text recognition. Supports 90+ languages and handles mixed-language documents with higher accuracy than Tesseract for most document types.
Use Case:
Processing scanned technical documents in multiple languages where Tesseract OCR produces too many errors.
Detects tables and converts them to properly formatted markdown tables or structured JSON with column alignment. Handles simple and moderately complex table structures, with LLM-enhanced mode for merging tables across pages.
Use Case:
Converting a technical specification PDF with comparison tables into structured data where table relationships are preserved.
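For reference, the markdown table format that such a conversion targets can be sketched as follows. This shows only the output side (rendering already-extracted cells), not Marker's table recognition; the function name and sample data are invented for illustration.

```python
def to_markdown_table(header: list[str], rows: list[list[str]]) -> str:
    """Render a header and rows of cell strings as a pipe-delimited markdown table."""

    def fmt(cells: list[str]) -> str:
        # Escape pipes so cell text cannot break the column structure.
        return "| " + " | ".join(c.replace("|", "\\|") for c in cells) + " |"

    for row in rows:
        if len(row) != len(header):
            raise ValueError("every row must have one cell per header column")

    lines = [fmt(header), "| " + " | ".join("---" for _ in header) + " |"]
    lines.extend(fmt(row) for row in rows)
    return "\n".join(lines)


print(to_markdown_table(["Model", "Max load"], [["X1", "5 kg"], ["X2", "8 kg"]]))
# | Model | Max load |
# | --- | --- |
# | X1 | 5 kg |
# | X2 | 8 kg |
```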
Optional --use_llm flag pairs Marker with Gemini, Claude, OpenAI, or Ollama models to improve table formatting, handle inline math, extract form values, and merge tables split across pages. The project's benchmarks report higher accuracy than either Marker or an LLM alone.
Use Case:
Processing complex financial reports where tables span multiple pages and inline calculations need accurate LaTeX conversion.
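A minimal sketch of assembling a command line for the LLM-enhanced mode is shown below. The `--use_llm` flag comes from the description above; the `marker_single` entry point and the `--output_format` option are stated here from memory of the project README and should be verified with `marker_single --help` before relying on them. The helper name and the file path are hypothetical, and the code only builds the command without running it.

```python
import shlex


def build_marker_command(
    pdf_path: str, output_format: str = "markdown", use_llm: bool = False
) -> list[str]:
    """Build an argv list for a single-file conversion.

    Assumption: `marker_single` and `--output_format` match the installed
    Marker version; only `--use_llm` is confirmed by the text above.
    """
    allowed = {"markdown", "json", "html", "chunks"}
    if output_format not in allowed:
        raise ValueError(f"output_format must be one of {sorted(allowed)}")
    cmd = ["marker_single", pdf_path, "--output_format", output_format]
    if use_llm:
        cmd.append("--use_llm")
    return cmd


cmd = build_marker_command("reports/q3_financials.pdf", output_format="json", use_llm=True)
print(shlex.join(cmd))
# prints: marker_single reports/q3_financials.pdf --output_format json --use_llm
```

Returning a list rather than a single string lets the result be passed to `subprocess.run` without shell quoting issues. LLM-enhanced mode also needs credentials for whichever provider is configured; that setup is outside this sketch.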
Accepts PDF, image, PPTX, DOCX, XLSX, HTML, and EPUB files. Outputs markdown, JSON (structured), chunks (pre-segmented for RAG), or HTML. Extensible with custom processors for specialized formatting logic.
Use Case:
Building an ingestion pipeline that converts a mix of PowerPoint presentations, Word documents, and PDFs into chunked JSON for a vector database.
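The last step of such a pipeline, turning chunked output into records for a vector database, might look roughly like this. The chunk field names (`block_type`, `text`, `page`) and the skipped block types are placeholders chosen for illustration, not Marker's actual chunk schema; inspect real output and adjust before using anything like this.

```python
import hashlib


def to_vector_records(
    source: str,
    chunks: list[dict],
    skip_types: tuple[str, ...] = ("PageHeader", "PageFooter"),
) -> list[dict]:
    """Turn pre-segmented chunks into records ready for embedding and upsert.

    Assumes each chunk is a dict with hypothetical keys "block_type", "text",
    and "page". Empty chunks and page artifacts are dropped.
    """
    records = []
    for index, chunk in enumerate(chunks):
        text = chunk.get("text", "").strip()
        if not text or chunk.get("block_type") in skip_types:
            continue
        # Deterministic ID so re-ingesting the same file overwrites, not duplicates.
        digest = hashlib.sha1(f"{source}:{index}:{text}".encode("utf-8")).hexdigest()[:16]
        records.append(
            {
                "id": digest,
                "text": text,
                "metadata": {
                    "source": source,
                    "page": chunk.get("page"),
                    "block_type": chunk.get("block_type"),
                },
            }
        )
    return records


sample_chunks = [
    {"block_type": "PageHeader", "text": "ACME Corp Confidential", "page": 0},
    {"block_type": "SectionHeader", "text": "1. Overview", "page": 0},
    {"block_type": "Text", "text": "  The system has three parts.  ", "page": 0},
    {"block_type": "Text", "text": "", "page": 1},
]
records = to_vector_records("overview.pptx", sample_chunks)
print([r["text"] for r in records])
# prints ['1. Overview', 'The system has three parts.']
```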
DataLab offers a hosted API with 99.99% uptime that processes documents at 1/4th the price of competitors, handling 250-page PDFs in approximately 15 seconds. Self-serve on-premise licensing is also available for enterprise deployments.
Use Case:
A compliance team that processes thousands of regulatory PDFs monthly using the managed API to avoid maintaining GPU infrastructure.
Pricing tiers: Free (forever), Pay-per-page, and Custom.
Marker now supports multiple output formats (markdown, JSON, chunks, HTML) and multi-format input (PDF, PPTX, DOCX, XLSX, HTML, EPUB). The --use_llm mode pairs extraction with Gemini, Claude, OpenAI, or Ollama for significantly improved table and math accuracy. The repository moved to the DataLab organization with a managed API offering 99.99% uptime at 1/4th competitor pricing.
Document AI
IBM-backed open-source document parsing toolkit that converts PDFs, DOCX, PPTX, images, audio, and more into structured formats for RAG pipelines and AI agent workflows.
Document AI
LlamaParse: Extract and analyze structured data from complex PDFs and documents using LLM-powered parsing.
Document AI
Document ETL engine that converts messy PDFs, Word files, and images into AI-ready structured data with intelligent chunking.
Automation & Workflows
Enterprise-grade text extraction and document processing framework that detects and extracts content from 1,000+ file formats. Free, containerized, and battle-tested across 18 years of production deployment.
Automation & Workflows
AWS document intelligence service that extracts text, tables, forms, and handwriting from scanned documents using machine learning — with specialized APIs for invoices, IDs, and lending documents.