Document Processing

Parse and extract data from images, PDFs, Word documents, spreadsheets, and more using AI-powered document processing. This tool handles both document parsing and data extraction in a single step.

Overview

Supported formats:

  • Images: JPG, PNG, GIF, BMP, WebP, TIFF

  • Documents: PDF, DOC, DOCX, TXT, RTF, ODT

  • Spreadsheets: XLS, XLSX, CSV, ODS

  • Presentations: PPT, PPTX, ODP

  • Web formats: HTML, HTM, MD, XML

Built-in extraction modes:

  1. Structured Extraction: Extract data into a predefined Zod schema using AI

  2. Unstructured Extraction: Extract information based on prompt instructions

  3. Raw Text Extraction: Extract plain text without AI processing

Processing Modes

Processing mode is controlled via the DOCUMENT_PROCESSING_MODE environment variable:

Managed (Default)

The backend handles processing. Benefits: secure API key storage, automatic cost tracking, and no SDK configuration.

Local

The SDK handles processing. Set DOCUMENT_PROCESSING_MODE=local in your environment. This mode requires LLAMA_CLOUD_API_KEY to be set in the SDK environment.
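The mode selection described above can be sketched as a small helper. This is only an illustration: the function name resolveProcessingMode, the explicit value "managed", and the decision to throw on unrecognized values are assumptions, not documented SDK behavior. Only the variable names DOCUMENT_PROCESSING_MODE and LLAMA_CLOUD_API_KEY and the managed default come from this page.

```typescript
// Sketch only: names and error handling here are assumptions, not the SDK's API.
type ProcessingMode = "managed" | "local";

type Env = Record<string, string | undefined>;

function resolveProcessingMode(env: Env): ProcessingMode {
  const raw = env["DOCUMENT_PROCESSING_MODE"]?.trim().toLowerCase();

  // Unset or empty falls back to the documented default (managed).
  // Accepting the literal value "managed" is an assumption.
  if (raw === undefined || raw === "" || raw === "managed") {
    return "managed";
  }

  if (raw === "local") {
    // Local mode needs the LlamaCloud key in the SDK environment.
    if (!env["LLAMA_CLOUD_API_KEY"]) {
      throw new Error(
        "DOCUMENT_PROCESSING_MODE=local requires LLAMA_CLOUD_API_KEY"
      );
    }
    return "local";
  }

  // Assumption: reject unknown values instead of guessing.
  throw new Error(`Unsupported DOCUMENT_PROCESSING_MODE: ${raw}`);
}
```

In a Node.js application the helper would typically be called with process.env, so that a missing key surfaces at startup and not on the first document.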

Using in Flows

Document processing includes built-in AI extraction: use systemPrompt to specify what data to extract. No additional extraction tool is needed.

Example: Document Processing

Available parameters:

  • documentSource: URL or file path to the document (auto-detected)

  • extractRaw: Set to true for raw text without AI

  • schema: Zod schema for structured extraction

  • systemPrompt: Instructions for AI-powered extraction
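To show how the four parameters relate to the three extraction modes, here is a minimal sketch. The interface name DocumentProcessingParams, the helper selectExtractionMode, which fields are optional, and the precedence order (extractRaw, then schema, then systemPrompt) are all assumptions made for illustration; the tool's actual types and rules may differ.

```typescript
// Hypothetical shape for the parameters listed above (not the SDK's real type).
interface DocumentProcessingParams {
  documentSource: string; // URL or file path
  extractRaw?: boolean;   // true for raw text without AI
  schema?: unknown;       // a Zod schema in real use
  systemPrompt?: string;  // instructions for AI-powered extraction
}

type ExtractionMode = "raw" | "structured" | "unstructured";

// Assumed precedence: extractRaw wins, then schema, then systemPrompt.
function selectExtractionMode(params: DocumentProcessingParams): ExtractionMode {
  if (params.extractRaw === true) {
    return "raw";
  }
  if (params.schema !== undefined) {
    return "structured";
  }
  if (params.systemPrompt !== undefined && params.systemPrompt.trim() !== "") {
    return "unstructured";
  }
  throw new Error("Provide extractRaw, schema, or systemPrompt");
}
```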

Programmatic Usage

Structured Extraction

Note: use nullable() instead of optional() for fields that are not required.
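The difference between the two Zod modifiers can be seen in a short sketch. The schema and its field names are invented for illustration and are not taken from the tool; only z.object, nullable(), and safeParse are standard Zod calls. With nullable(), the key must be present and its value may be null, whereas optional() would allow the key to be omitted.

```typescript
import { z } from "zod";

// Illustrative schema; field names are assumptions, not from the tool's docs.
const InvoiceSchema = z.object({
  invoiceNumber: z.string(),
  // Not always printed on the document: the key is required, the value may be null.
  dueDate: z.string().nullable(),
  total: z.number().nullable(),
});

type Invoice = z.infer<typeof InvoiceSchema>;

// A value that could not be found is reported as an explicit null.
const withNulls = InvoiceSchema.safeParse({
  invoiceNumber: "INV-001",
  dueDate: null,
  total: 120.5,
});

// With nullable(), leaving the key out entirely fails validation.
const missingKey = InvoiceSchema.safeParse({
  invoiceNumber: "INV-001",
  total: 120.5,
});

console.log(withNulls.success, missingKey.success);
```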

Unstructured Extraction

Raw Text Extraction

Using URLs
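The documentSource parameter is described as auto-detecting whether it received a URL or a file path. One way such detection could work is sketched below. The helper classifyDocumentSource is hypothetical, and treating only http and https sources as URLs is an assumption; the tool's real detection logic is not documented here.

```typescript
type SourceKind = "url" | "path";

// Assumption: only http(s) sources count as URLs; everything else is a file path.
function classifyDocumentSource(source: string): SourceKind {
  try {
    const parsed = new URL(source);
    return parsed.protocol === "http:" || parsed.protocol === "https:"
      ? "url"
      : "path";
  } catch {
    // new URL() throws for strings that are not absolute URLs,
    // such as relative or POSIX-style absolute paths.
    return "path";
  }
}
```

Checking the protocol explicitly keeps Windows paths such as C:\docs\file.pdf, which parse as a URL with the scheme "c:", from being mistaken for remote documents.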

Configuration

Environment Variables

Document Processor Options

LLM Configuration

Example Use Cases

Invoice Processing

Identity Document Verification

Extract Names and Addresses

Troubleshooting

PDF Processing Fails

Solution (local mode only): Set the LLAMA_CLOUD_API_KEY environment variable.

Image Too Large

Solution: Reduce maxImageWidth in configuration.

Schema Validation Errors

Solution: Verify your Zod schema matches the expected data structure.
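When checking a schema against the data you expect, it can help to list each validation issue with its field path. The sketch below uses Zod's safeParse and the issues array on the returned error; the ContactSchema fields and the describeSchemaErrors helper are hypothetical and exist only for this example.

```typescript
import { z } from "zod";

// Illustrative schema; field names are assumptions.
const ContactSchema = z.object({
  name: z.string(),
  address: z.string().nullable(),
});

// Returns one "path: message" line per validation issue, or an empty list on success.
function describeSchemaErrors(schema: z.ZodType, data: unknown): string[] {
  const result = schema.safeParse(data);
  if (result.success) {
    return [];
  }
  return result.error.issues.map(
    (issue) => `${issue.path.join(".") || "(root)"}: ${issue.message}`
  );
}

// name has the wrong type and address is missing, so two issues are reported.
const problems = describeSchemaErrors(ContactSchema, { name: 42 });
console.log(problems);
```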

Supported File Types

Category       Extensions                                   Notes
Images         .jpg, .jpeg, .png, .gif, .bmp, .webp, .tiff  Direct processing
Documents      .pdf, .doc, .docx, .txt, .rtf, .odt          Requires LlamaCloud for advanced formats
Spreadsheets   .xls, .xlsx, .csv, .ods                      Requires LlamaCloud for binary formats
Presentations  .ppt, .pptx, .odp                            Requires LlamaCloud
Web            .html, .htm, .md, .xml                       Basic text extraction
