Document Processing
Parse and extract data from images, PDFs, Word documents, spreadsheets, and more using AI-powered document processing. This tool handles both document parsing and data extraction in a single step.
Overview
Supported formats: Images (JPG, PNG, GIF, BMP, WebP, TIFF), Documents (PDF, DOC, DOCX, TXT, RTF, ODT), Spreadsheets (XLS, XLSX, CSV, ODS), Presentations (PPT, PPTX, ODP), Web formats (HTML, HTM, MD, XML)
Built-in extraction modes:
Structured Extraction: Extract data into a predefined Zod schema using AI
Unstructured Extraction: Extract information based on prompt instructions
Raw Text Extraction: Extract plain text without AI processing
Processing Modes
Processing mode is controlled via the DOCUMENT_PROCESSING_MODE environment variable:
Managed (Default)
The backend handles processing. Benefits: secure API key storage, automatic cost tracking, and no SDK configuration required.
Local
The SDK handles processing. Set DOCUMENT_PROCESSING_MODE=local in your environment. This mode requires LLAMA_CLOUD_API_KEY to be set in the SDK's environment.
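The mode selection described above can be sketched as a small startup check. This is a minimal illustration, not the SDK's actual logic: resolveProcessingMode is a hypothetical helper, and three behaviours are assumptions (accepting the explicit value "managed", matching the value case-insensitively, and rejecting unknown values).

```typescript
// Hypothetical helper, not part of the SDK: resolves the processing mode
// from an environment-like object and checks the local-mode requirement.
type ProcessingMode = "managed" | "local";

interface ResolvedMode {
  mode: ProcessingMode;
  llamaCloudApiKey?: string;
}

function resolveProcessingMode(
  env: Record<string, string | undefined>
): ResolvedMode {
  // Assumption: the value is matched case-insensitively.
  const raw = (env["DOCUMENT_PROCESSING_MODE"] ?? "").trim().toLowerCase();

  // Managed is the documented default when the variable is unset.
  // Assumption: an explicit "managed" value is also accepted.
  if (raw === "" || raw === "managed") {
    return { mode: "managed" };
  }

  if (raw === "local") {
    const key = env["LLAMA_CLOUD_API_KEY"];
    if (!key) {
      throw new Error(
        "DOCUMENT_PROCESSING_MODE=local requires LLAMA_CLOUD_API_KEY to be set"
      );
    }
    return { mode: "local", llamaCloudApiKey: key };
  }

  // Assumption: unknown values are rejected rather than silently ignored.
  throw new Error(`Unknown DOCUMENT_PROCESSING_MODE: ${raw}`);
}
```

In a Node.js process this could be called once at startup as resolveProcessingMode(process.env), so that a missing key surfaces before the first document is processed.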
Using in Flows
Document processing includes built-in AI extraction: use systemPrompt to specify what data to extract. No additional extraction tool is needed.
Example: Document Processing
Available parameters:
documentSource: URL or file path to the document (auto-detected)
extractRaw: Set to true for raw text without AI
schema: Zod schema for structured extraction
systemPrompt: Instructions for AI-powered extraction
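To show how these four parameters relate to the three extraction modes, here is a minimal sketch. The DocumentProcessingParams interface and selectExtractionMode function are hypothetical, written only for illustration. The precedence used (extractRaw first, then schema, then systemPrompt) is an assumption; the source does not state how the tool resolves combinations.

```typescript
// Hypothetical types for illustration; the real SDK types may differ.
interface DocumentProcessingParams {
  documentSource: string; // URL or file path
  extractRaw?: boolean;   // true for raw text without AI
  schema?: unknown;       // a Zod schema in practice; typed loosely to avoid a dependency here
  systemPrompt?: string;  // instructions for AI-powered extraction
}

type ExtractionMode = "raw" | "structured" | "unstructured";

// Assumed precedence: extractRaw, then schema, then systemPrompt.
function selectExtractionMode(params: DocumentProcessingParams): ExtractionMode {
  if (params.extractRaw === true) {
    return "raw";
  }
  if (params.schema !== undefined) {
    return "structured";
  }
  if (params.systemPrompt !== undefined && params.systemPrompt.trim() !== "") {
    return "unstructured";
  }
  throw new Error("Provide extractRaw, schema, or systemPrompt");
}
```

A check like this can catch a call that names a document but gives no extraction instructions before any processing is attempted.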
Programmatic Usage
Structured Extraction
Note: use nullable() instead of optional() for fields that are not required.
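In Zod, z.string().nullable() accepts string | null, while z.string().optional() accepts string | undefined, so a nullable field is always present in the result and holds an explicit null when no value was found. The sketch below mirrors that shape with plain TypeScript types instead of Zod so it has no dependencies. The field names and the withNulls helper are hypothetical.

```typescript
// Shape equivalent to a schema whose non-required fields use nullable():
// every key is always present, and a missing value is an explicit null.
interface InvoiceFields {
  invoiceNumber: string;
  dueDate: string | null;
  poNumber: string | null;
}

// Hypothetical helper: fills absent keys with null so the object matches
// the nullable shape above.
function withNulls(partial: {
  invoiceNumber: string;
  dueDate?: string;
  poNumber?: string;
}): InvoiceFields {
  return {
    invoiceNumber: partial.invoiceNumber,
    dueDate: partial.dueDate ?? null,
    poNumber: partial.poNumber ?? null,
  };
}
```

With this shape, consumers can test a field with a simple `=== null` check instead of distinguishing between a missing key and an undefined value.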
Unstructured Extraction
Raw Text Extraction
Using URLs
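The documentSource parameter is described as auto-detecting whether it holds a URL or a file path. The tool's actual detection rule is not documented here, so the following is only a sketch of one plausible approach. detectSourceKind is a hypothetical helper, and treating only http and https sources as URLs is an assumption.

```typescript
type SourceKind = "url" | "file";

// Hypothetical helper: anything starting with http:// or https:// is
// treated as a URL; everything else is treated as a file path.
function detectSourceKind(documentSource: string): SourceKind {
  return /^https?:\/\//i.test(documentSource.trim()) ? "url" : "file";
}
```

A prefix test is used rather than the URL constructor because the constructor parses a Windows path such as C:\docs\a.pdf as a URL with the scheme "c:".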
Configuration
Environment Variables
Document Processor Options
LLM Configuration
Example Use Cases
Invoice Processing
Identity Document Verification
Extract Names and Addresses
Troubleshooting
PDF Processing Fails
Solution (local mode only): Set the LLAMA_CLOUD_API_KEY environment variable.
Image Too Large
Solution: Reduce maxImageWidth in configuration.
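As a rough guide to choosing a value, the sketch below computes the dimensions an image would have after being scaled down to a given maximum width. It assumes that maxImageWidth caps the pixel width and that the aspect ratio is preserved; the source names the option but does not describe its exact behaviour. fitToMaxWidth is a hypothetical helper.

```typescript
interface Size {
  width: number;
  height: number;
}

// Hypothetical helper: returns the size after downscaling to at most
// maxImageWidth pixels wide, preserving the aspect ratio (assumed behaviour).
function fitToMaxWidth(size: Size, maxImageWidth: number): Size {
  if (maxImageWidth <= 0) {
    throw new Error("maxImageWidth must be positive");
  }
  // Images already within the limit are left unchanged.
  if (size.width <= maxImageWidth) {
    return { width: size.width, height: size.height };
  }
  const scale = maxImageWidth / size.width;
  return {
    width: maxImageWidth,
    height: Math.max(1, Math.round(size.height * scale)),
  };
}
```

For example, a 4000x3000 scan with a limit of 2000 comes out at 2000x1500 under these assumptions, a quarter of the original pixel count.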
Schema Validation Errors
Solution: Verify your Zod schema matches the expected data structure.
Supported File Types
Images: .jpg, .jpeg, .png, .gif, .bmp, .webp, .tiff. Direct processing.
Documents: .pdf, .doc, .docx, .txt, .rtf, .odt. Requires LlamaCloud for advanced formats.
Spreadsheets: .xls, .xlsx, .csv, .ods. Requires LlamaCloud for binary formats.
Presentations: .ppt, .pptx, .odp. Requires LlamaCloud.
Web: .html, .htm, .md, .xml. Basic text extraction.
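The extension lists above can be turned into a pre-flight check that rejects unsupported files before they are submitted. This is a minimal sketch: the categorize function and the EXTENSIONS table are hypothetical, not part of the SDK, and the table simply copies the extensions listed in this section.

```typescript
type FileCategory = "image" | "document" | "spreadsheet" | "presentation" | "web";

// Extensions copied from the supported file types listed in this section.
const EXTENSIONS: Record<FileCategory, readonly string[]> = {
  image: [".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp", ".tiff"],
  document: [".pdf", ".doc", ".docx", ".txt", ".rtf", ".odt"],
  spreadsheet: [".xls", ".xlsx", ".csv", ".ods"],
  presentation: [".ppt", ".pptx", ".odp"],
  web: [".html", ".htm", ".md", ".xml"],
};

// Hypothetical helper: returns the category for a file name or URL,
// or undefined when the extension is not in the supported list.
function categorize(fileName: string): FileCategory | undefined {
  // Drop any query string or fragment so that URLs are handled too.
  const clean = (fileName.split(/[?#]/)[0] ?? "").toLowerCase();
  const dot = clean.lastIndexOf(".");
  if (dot === -1) {
    return undefined;
  }
  const ext = clean.slice(dot);
  for (const category of Object.keys(EXTENSIONS) as FileCategory[]) {
    if (EXTENSIONS[category].indexOf(ext) !== -1) {
      return category;
    }
  }
  return undefined;
}
```

The check looks only at the file name, so it cannot detect a file whose content does not match its extension.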