Data Extraction

MindedJS provides an LLM-based tool for extracting structured data from unstructured text. It parses the content and returns the data in a predefined format.

Overview

The extraction tool (minded-extraction) enables you to extract specific information from text using:

  • Structured Extraction with Zod Schema: Define the exact data structure using Zod schemas

  • Prompt-based Extraction: Extract information using custom prompts

  • Validation and Retries: Automatic validation against schema with retry logic

Key Features

  • LLM Structured Output Support: When available, uses the LLM's native structured output capabilities for guaranteed schema compliance

  • Fallback JSON Parsing: Automatically falls back to JSON parsing with validation when structured output is unavailable

  • Schema Validation: Built-in Zod validation ensures extracted data matches expected structure

  • Retry Logic: Configurable retry attempts with error feedback for improved accuracy

  • Strict and Non-strict Modes: Choose between validated extraction (strict) and flexible extraction (non-strict)

Library Tool Integration

The extraction functionality is available as a library tool called minded-extraction that can be added to your flows through the Minded platform.

Configuration Options

  • content: The text to extract information from

  • schema: Optional Zod-compatible schema defining the structure

  • systemPrompt: Custom instructions for extraction

  • examples: Input/output examples to guide extraction

  • strictMode: Enable/disable schema validation (default: true)

  • maxRetries: Number of retry attempts on validation failure (default: 3)

  • defaultValue: Fallback value if extraction fails
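
To make the defaults concrete, here is a minimal sketch of how the options above could be modeled and resolved. The `ExtractionOptions` interface, the `applyDefaults` helper, and the `input`/`output` field names for examples are hypothetical names chosen for illustration and are not part of the MindedJS API. Only the option names and the two default values come from the list above.

```typescript
// Hypothetical option shape mirroring the configuration list; not the library's actual type.
interface ExtractionOptions<T> {
  content: string;                                 // text to extract information from
  schema?: unknown;                                // a Zod-compatible schema in the real tool
  systemPrompt?: string;                           // custom extraction instructions
  examples?: Array<{ input: string; output: T }>;  // assumed shape for input/output examples
  strictMode?: boolean;                            // documented default: true
  maxRetries?: number;                             // documented default: 3
  defaultValue?: T;                                // fallback if extraction fails
}

// Fill in the documented defaults for any option the caller left out.
function applyDefaults<T>(
  options: ExtractionOptions<T>,
): ExtractionOptions<T> & { strictMode: boolean; maxRetries: number } {
  return {
    ...options,
    strictMode: options.strictMode ?? true,
    maxRetries: options.maxRetries ?? 3,
  };
}

// Only maxRetries is overridden here, so strictMode falls back to its default.
const resolved = applyDefaults({ content: 'Jane Doe is 34 years old.', maxRetries: 1 });
console.log(resolved.strictMode, resolved.maxRetries);
```

In this sketch an explicit `strictMode: false` is preserved, because `??` only replaces `null` and `undefined`.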

How It Works

  1. With Structured Output Support (when available):

    • The tool uses the LLM's withStructuredOutput method for direct schema-compliant extraction

    • No manual JSON parsing or validation needed

    • Guaranteed to match the provided Zod schema

  2. Fallback Mode (JSON parsing):

    • Generates a prompt with schema description

    • Uses JSON output parser to extract structured data

    • Validates against Zod schema

    • Retries with error feedback if validation fails

  3. Non-strict Mode:

    • Skips validation for more flexible extraction

    • Useful when schema compliance is not critical
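
The fallback and non-strict paths above can be sketched as a small retry loop. This is an illustration under stated assumptions, not the MindedJS implementation: `Model`, `Validator`, `LoopOptions`, and `extractWithRetries` are hypothetical names, the model call is synchronous only to keep the sketch short, and counting `maxRetries` as retries on top of one initial attempt is an assumption. The validator throws on invalid input, in the same way Zod's `parse` does.

```typescript
// Hypothetical types for illustration only; none of these names come from MindedJS.
type Model = (prompt: string) => string;   // stand-in for an LLM call
type Validator<T> = (value: unknown) => T; // throws on invalid input

interface LoopOptions<T> {
  strictMode: boolean;
  maxRetries: number;
  defaultValue?: T;
}

function extractWithRetries<T>(
  content: string,
  model: Model,
  validate: Validator<T>,
  options: LoopOptions<T>,
): T | undefined {
  let prompt = `Return JSON for the following text:\n${content}`;
  // Assumption: one initial attempt plus up to maxRetries retries.
  for (let attempt = 0; attempt <= options.maxRetries; attempt++) {
    const raw = model(prompt);
    try {
      const parsed: unknown = JSON.parse(raw);
      // Non-strict mode skips validation and returns whatever parsed.
      return options.strictMode ? validate(parsed) : (parsed as T);
    } catch (e) {
      // Feed the error back so the next attempt can correct itself.
      const reason = e instanceof Error ? e.message : String(e);
      prompt = `${prompt}\n\nPrevious answer rejected: ${reason}\nReturn corrected JSON only.`;
    }
  }
  // All attempts failed: fall back to the default value, if any.
  return options.defaultValue;
}

// Fake model that answers incompletely once, then correctly, to exercise the retry path.
const replies = ['{"name": "Jane"}', '{"name": "Jane", "age": 34}'];
const prompts: string[] = [];
const fakeModel: Model = (prompt) => {
  prompts.push(prompt);
  return replies[Math.min(prompts.length - 1, replies.length - 1)] ?? '{}';
};

interface Person { name: string; age: number }
const validatePerson: Validator<Person> = (value) => {
  const v = value as Partial<Person> | null;
  if (!v || typeof v.name !== 'string' || typeof v.age !== 'number') {
    throw new Error('expected { name: string, age: number }');
  }
  return { name: v.name, age: v.age };
};

const person = extractWithRetries('Jane is 34.', fakeModel, validatePerson, {
  strictMode: true,
  maxRetries: 3,
});
console.log(person, prompts.length);
```

The second prompt sent to the fake model carries the validation error from the first attempt, which is the "error feedback" step described in the fallback mode.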

Standalone Usage

The extraction utility can also be used programmatically:

import { extract, createExtractor } from '@minded-ai/mindedjs';
import { z } from 'zod';

// `content` is the text to extract from; `agent.llm` is your agent's LLM instance.
// Schema shared by both examples
const personSchema = z.object({
  name: z.string(),
  age: z.number(),
});

// Direct extraction
const result = await extract(
  content,
  {
    schema: personSchema,
    systemPrompt: 'Extract person details',
  },
  agent.llm,
);

// Create a reusable extractor
const extractor = createExtractor(personSchema, { systemPrompt: 'Extract data' });
const reusedResult = await extractor(content, agent.llm);
