Data Extraction

Extract structured data from unstructured text using an LLM. The minded-extraction tool parses the content and returns data in a predefined format.

Overview

Structured Extraction with Zod Schema: Define exact data structure
Prompt-based Extraction: Extract information using custom prompts
Validation and Retries: Automatic validation with configurable retry logic
Structured Output Support: Uses LLM native structured output when available

Using in Flows

- id: extractCustomerInfo
  type: tool
  toolName: minded-extraction
  prompt: Extract customer name, email, and phone number from the message

Tool Parameters

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| content | string | Text to extract from | Yes |
| schema | object | Zod-compatible schema | |
| systemPrompt | string | Custom instructions | |
| examples | array | Input/output examples | |
| strictMode | boolean | Enable validation (default: true) | |
| maxRetries | number | Retry attempts on failure (default: 3) | |
| defaultValue | any | Fallback value | |
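
As a rough illustration of how these parameters and their documented defaults fit together, the sketch below models them as a TypeScript type. The names ExtractionParams, ResolvedParams, and withDefaults are hypothetical and not part of the library. Treating every parameter other than content as optional is an assumption inferred from the table.

```typescript
// Hypothetical shape of the tool parameters, inferred from the table above.
interface ExtractionParams {
  content: string;                  // text to extract from (the only required field)
  schema?: Record<string, unknown>; // Zod-compatible schema
  systemPrompt?: string;            // custom instructions
  examples?: unknown[];             // input/output examples (element shape not documented here)
  strictMode?: boolean;             // enable validation, default: true
  maxRetries?: number;              // retry attempts on failure, default: 3
  defaultValue?: unknown;           // fallback value
}

type ResolvedParams = ExtractionParams & { strictMode: boolean; maxRetries: number };

// Illustrative helper (not a library function): fill in the documented defaults.
function withDefaults(params: ExtractionParams): ResolvedParams {
  return {
    ...params,
    strictMode: params.strictMode ?? true,
    maxRetries: params.maxRetries ?? 3,
  };
}

const resolved = withDefaults({ content: 'Call Dana at 555-0100' });
console.log(resolved.strictMode, resolved.maxRetries); // true 3
```

Using ?? rather than || keeps an explicit maxRetries: 0 or strictMode: false from being overwritten by the defaults.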

Overriding Parameters in Flows

- name: 'Extract Customer Info'
  type: 'tool'
  toolName: 'minded-extraction'
  parameters:
    content: '{state.memory.rawText}'
    schema:
      name:
        type: 'string'
        description: 'Customer full name'
      email:
        type: 'string'
        description: 'Email address'
        required: false
      phone:
        type: 'string'
    systemPrompt: 'Extract contact information from the text'
    strictMode: true
    maxRetries: 3

Available schema field properties:

type: 'string', 'number', 'boolean', 'array', or 'object'
description: Optional field description
required: Optional boolean (defaults to true)
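
To make the field properties concrete, here is a minimal sketch of how a value could be checked against such a field specification. It is a hand-written check for illustration only, not the library's implementation (which the page says validates against a Zod schema). FieldSpec and checkAgainstSpec are hypothetical names.

```typescript
// Field spec as written in the flow YAML: type, optional description, optional required.
type FieldType = 'string' | 'number' | 'boolean' | 'array' | 'object';

interface FieldSpec {
  type: FieldType;
  description?: string;
  required?: boolean; // treated as true when omitted
}

// Hypothetical helper: returns a list of problems; an empty list means the value conforms.
function checkAgainstSpec(
  value: Record<string, unknown>,
  spec: Record<string, FieldSpec>,
): string[] {
  const errors: string[] = [];
  for (const [key, field] of Object.entries(spec)) {
    const v = value[key];
    if (v === undefined || v === null) {
      // Missing values are only an error for required fields.
      if (field.required ?? true) errors.push(`${key}: missing required field`);
      continue;
    }
    // typeof reports arrays as 'object', so distinguish them explicitly.
    const actual = Array.isArray(v) ? 'array' : typeof v;
    if (actual !== field.type) errors.push(`${key}: expected ${field.type}, got ${actual}`);
  }
  return errors;
}

// Same fields as the flow example above.
const contactSpec: Record<string, FieldSpec> = {
  name: { type: 'string', description: 'Customer full name' },
  email: { type: 'string', description: 'Email address', required: false },
  phone: { type: 'string' },
};

console.log(checkAgainstSpec({ name: 'Dana Lee', phone: '555-0100' }, contactSpec));
// []
console.log(checkAgainstSpec({ name: 'Dana Lee', email: 42 }, contactSpec));
// [ 'email: expected string, got number', 'phone: missing required field' ]
```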

Programmatic Usage

import { extract, createExtractor } from '@minded-ai/mindedjs';
import { z } from 'zod';

// `content` (the text to extract from) and `agent` are assumed to be in scope.

// Schema shared by both examples
const personSchema = z.object({
  name: z.string(),
  age: z.number(),
});

// Direct extraction
const result = await extract(
  content,
  {
    schema: personSchema,
    systemPrompt: 'Extract person details',
  },
  agent.llm,
);

// Create reusable extractor
const extractor = createExtractor(personSchema, { systemPrompt: 'Extract person details' });
const reusedResult = await extractor(content, agent.llm);

How It Works

With Structured Output Support: Uses the LLM's withStructuredOutput for direct, schema-compliant extraction
Fallback Mode: Generates a prompt with the schema description, parses the JSON response, validates it against the Zod schema, and retries with error feedback on failure
Non-strict Mode: Skips validation for flexible extraction
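
The fallback and non-strict modes described above can be sketched roughly as follows. This is a simplified illustration under several assumptions, not the library's code: the LLM is modeled as a plain async function, a hand-written validator stands in for Zod so the example runs without dependencies, and maxRetries is interpreted as the number of retries after the first attempt. All names here (Llm, Validator, extractWithFallback) are hypothetical.

```typescript
// Stand-in for an LLM call: takes a prompt, resolves to raw text (hypothetical signature).
type Llm = (prompt: string) => Promise<string>;
// Returns a list of error messages; an empty list means the data is valid.
type Validator = (data: unknown) => string[];

interface FallbackOptions {
  schemaDescription: string;
  validate: Validator;
  strictMode?: boolean;   // default: true
  maxRetries?: number;    // default: 3
  defaultValue?: unknown; // returned if every attempt fails
}

async function extractWithFallback(
  content: string,
  llm: Llm,
  opts: FallbackOptions,
): Promise<unknown> {
  const strict = opts.strictMode ?? true;
  const maxRetries = opts.maxRetries ?? 3;
  let feedback = '';
  // One initial attempt plus up to maxRetries retries (an assumed interpretation).
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const prompt =
      `Return JSON matching this schema:\n${opts.schemaDescription}\n\n` +
      `Text:\n${content}${feedback}`;
    const raw = await llm(prompt);
    let errors: string[] = [];
    try {
      const data: unknown = JSON.parse(raw);
      if (!strict) return data; // non-strict mode skips validation
      errors = opts.validate(data);
      if (errors.length === 0) return data;
    } catch {
      errors = ['response was not valid JSON'];
    }
    // Feed the errors back so the next attempt can correct them.
    feedback = `\n\nYour previous answer was rejected: ${errors.join('; ')}`;
  }
  if (opts.defaultValue !== undefined) return opts.defaultValue;
  throw new Error('extraction failed after retries');
}

// Hand-written validator standing in for a Zod schema of { name: string, age: number }.
const validatePerson: Validator = (data) => {
  const d = data as { name?: unknown; age?: unknown } | null;
  const errors: string[] = [];
  if (typeof d?.name !== 'string') errors.push('name must be a string');
  if (typeof d?.age !== 'number') errors.push('age must be a number');
  return errors;
};

// Fake LLM: bad JSON first, then a type error, then a valid answer.
const replies = ['not json', '{"name":"Dana","age":"41"}', '{"name":"Dana","age":41}'];
let calls = 0;
const fakeLlm: Llm = async () => replies[Math.min(calls++, replies.length - 1)];

extractWithFallback('Dana is 41 years old', fakeLlm, {
  schemaDescription: '{ name: string, age: number }',
  validate: validatePerson,
}).then((person) => console.log(person, calls)); // { name: 'Dana', age: 41 } 3
```

In this sketch, defaultValue is only consulted after all attempts are exhausted; whether the tool behaves the same way is not stated on this page.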

