RPA (Robotic Process Automation) tools automate browser interactions using Playwright. They provide automatic screenshot capture, action logging, and browser session management.
What are RPA Tools?
RPA tools automate tasks on the web by controlling a browser programmatically. They are ideal for:
Web scraping and data extraction
Form filling and submission
Clicking buttons and navigating websites
Automating repetitive web-based workflows
Interacting with web applications that don't have APIs
RPA Tool Structure
RPA tools must implement the RPATool interface with type: 'rpa':
// Example: RPA Tool with input and optional output schemas
interface RPATool<Input extends z.ZodSchema, Memory = any, Output extends z.ZodSchema = z.ZodTypeAny> {
  name: string; // Unique tool identifier
  description: string; // What the tool does (used by LLM)
  input: Input; // Zod schema for input validation
  output?: Output; // Optional: Zod schema for output validation (e.g., outputSchema)
  isGlobal?: boolean; // Optional: available across all LLM calls
  type: 'rpa'; // Required: marks this as an RPA tool
  proxyConfig?: ProxyConfig; // Optional: unified proxy configuration
  browserTaskMode?: BrowserTaskMode; // Optional: browser provider (local/cloud/onPrem)
  persistSession?: boolean; // Optional: persist cookies & localStorage across executions (default: true)
  execute: ({ input, state, agent, page }) => Promise<{ result? }>;
}
Execute Function Signature
RPA tools receive an additional page parameter (Playwright Page object) in their execute function:
Parameters
input: Validated data matching your Zod schema
state: Current conversation state including memory, sessionId, and other context
agent: Agent instance providing access to PII gateway, logging, and other platform features
page: Playwright Page object for browser automation - automatically provided for RPA tools
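The four parameters can be seen together in a minimal sketch of an execute function. `PageLike` and `ExecuteArgs` are local stand-ins introduced here so the snippet compiles without Playwright or MindedJS installed; in a real tool the types come from the package's `Tool` type and `page` is a full Playwright Page. The `lastVisitedUrl` memory field is hypothetical.

```typescript
// Local stand-ins (assumptions) for the objects the platform passes to execute.
// In a real tool, `page` is a Playwright Page and the types come from MindedJS.
interface PageLike {
  goto(url: string, options?: { timeout?: number }): Promise<unknown>;
  title(): Promise<string>;
}

interface ExecuteArgs<Input, Memory> {
  input: Input; // data validated against the tool's Zod schema
  state: { sessionId: string; memory: Memory }; // conversation state
  agent: unknown; // agent instance (unused in this sketch)
  page: PageLike; // supplied automatically for RPA tools
}

type Input = { url: string };
type Memory = { lastVisitedUrl?: string }; // hypothetical memory shape

// Reads from `input`, drives the browser through `page`,
// and records a value in conversation memory.
async function execute({ input, state, page }: ExecuteArgs<Input, Memory>) {
  await page.goto(input.url, { timeout: 60000 });
  const title = await page.title();
  state.memory.lastVisitedUrl = input.url;
  return { result: { title } };
}
```

Because `page` is typed structurally, the same function can be exercised with a small fake object in unit tests before running it against a real browser.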
Proxy Configuration
RPA tools can optionally specify proxy configuration using the unified ProxyConfig type. This allows you to use Minded-managed proxies or custom proxy servers:
Proxy Configuration Options:
Minded Proxy - Trusted IP: Routes through Minded proxy for IP whitelisting without region selection
Minded Proxy - Region: Routes through Minded proxy with a specific country/region (requires countryCode)
Custom Proxy: Use your own proxy server (requires server, optional username and password)
Note: Proxy configuration only applies when using cloud or on-prem browser providers. Local browser sessions ignore proxy settings.
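The three options and the local-browser caveat can be modeled as a small discriminated union. This is a local sketch for reasoning about the options, not the package's `ProxyConfig` type: the string values, the `trustedIp` mode name, and the `describeProxy` helper are all assumptions made for illustration.

```typescript
// Local model (assumption) of the three options described above.
// The string literals are hypothetical; the real enums are exported by @minded-ai/mindedjs.
type ProxyConfigSketch =
  | { provider: 'minded'; mode: 'trustedIp' }
  | { provider: 'minded'; mode: 'region'; countryCode: string }
  | { provider: 'custom'; server: string; username?: string; password?: string };

type BrowserMode = 'local' | 'cloud' | 'onPrem';

// Returns a human-readable description of how traffic would be routed.
function describeProxy(config: ProxyConfigSketch | undefined, mode: BrowserMode): string {
  if (!config) return 'no proxy configured';
  // Per the note above, local browser sessions ignore proxy settings.
  if (mode === 'local') return 'proxy ignored (local browser)';
  if (config.provider === 'custom') {
    return `custom proxy at ${config.server}`;
  }
  return config.mode === 'region'
    ? `Minded proxy, region ${config.countryCode}`
    : 'Minded proxy, trusted IP';
}
```

Modeling the options as a union lets the compiler enforce that `countryCode` accompanies the region mode and that `server` accompanies a custom proxy.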
Browser Provider Configuration
RPA tools can optionally specify which browser provider to use:
Available options:
BrowserTaskMode.LOCAL: Use a local browser instance (the browser runs on the same machine as the agent). For local development only.
BrowserTaskMode.CLOUD: Use a cloud browser provider (supports proxy configuration and automatically passes Cloudflare and other anti-bot protection).
If not specified, the tool will use the browser provider configured in your environment (BROWSER_TASK_MODE environment variable).
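The precedence described above (tool-level setting first, then the environment) can be sketched as a small resolver. The accepted string values and the `resolveBrowserMode` helper are assumptions for illustration; the platform's own lookup may differ.

```typescript
// Hypothetical string values; the real BrowserTaskMode enum comes from MindedJS.
type BrowserMode = 'local' | 'cloud' | 'onPrem';

const KNOWN_MODES: readonly BrowserMode[] = ['local', 'cloud', 'onPrem'];

// Picks the tool-level mode when present, otherwise falls back to the
// BROWSER_TASK_MODE environment variable. Returns undefined when neither
// yields a recognized mode.
function resolveBrowserMode(
  toolMode: BrowserMode | undefined,
  env: Record<string, string | undefined>,
): BrowserMode | undefined {
  if (toolMode) return toolMode;
  const fromEnv = env['BROWSER_TASK_MODE'];
  return KNOWN_MODES.find((mode) => mode === fromEnv);
}
```

In Node, `process.env` can be passed as the `env` argument; taking it as a parameter keeps the function easy to test.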
Automatic Features
When you mark a tool as type: 'rpa', the platform automatically:
Browser Session Management: Creates and manages a browser session
Screenshot Capture: Takes screenshots before and after every Playwright action (click, fill, type, goto, etc.)
Action Logging: Automatically logs all browser actions (e.g., "Navigate to: https://example.com", "Click: .submit-button")
Error Handling: Captures screenshots and HTML on errors for debugging. You don't need to add try/catch blocks to get this behavior.
All screenshots and logs are automatically displayed in the UI when the tool executes.
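To make the before/after behavior concrete, here is one way such instrumentation could be built with a JavaScript `Proxy`. This is purely illustrative and is not the platform's implementation; the `instrument` function and the `Hooks` shape are hypothetical, and `capture` stands in for taking a screenshot.

```typescript
// Illustrative only: wraps selected async methods with log and capture hooks.
type Hooks = {
  log: (message: string) => void;
  capture: (label: string) => Promise<void>; // stand-in for a screenshot
};

function instrument<T extends object>(target: T, actions: string[], hooks: Hooks): T {
  return new Proxy(target, {
    get(obj, prop, receiver) {
      const value = Reflect.get(obj, prop, receiver);
      // Pass through anything that is not one of the instrumented methods.
      if (typeof prop !== 'string' || typeof value !== 'function' || !actions.includes(prop)) {
        return value;
      }
      return async (...args: unknown[]) => {
        hooks.log(`${prop}: ${String(args[0])}`);
        await hooks.capture(`before ${prop}`);
        try {
          return await value.apply(obj, args);
        } finally {
          // Runs on success and on failure, so the end state is always captured.
          await hooks.capture(`after ${prop}`);
        }
      };
    },
  });
}
```

The `finally` block mirrors the idea that the state after an action is recorded even when the action throws, which is what makes failures debuggable without extra code in the tool.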
Basic RPA Tool Example
Here's a simple RPA tool that navigates to a website and extracts data:
RPA Development Guidelines
Prerequisites
Before creating an RPA tool, gather:
The URL of the website you need to automate
The task to be performed (e.g., "Fill out a form", "Extract order details")
Any required information (e.g., login credentials, form data)
Development Workflow
Plan the automation steps
Break down the task into discrete steps (e.g., "Navigate to URL", "Click on X button", "Type Y into Z input field")
Identify the selectors you'll need for each element
Test with Playwright MCP
Use the Playwright MCP to interact with the browser interactively
Read the HTML of the page to find correct selectors
Test each step before writing the code
Iterate up to 3 times if a step fails
Write the tool code
Use the page object provided in the execute function
Add comments to explain what each action does
Use appropriate Playwright methods (click, fill, type, goto, etc.)
Test the complete tool
Run: npm run tool <toolName> <param1>=<value1> <param2>=<value2> ...
Add the cookies=false flag to avoid reusing cookies from previous sessions (useful for login tasks)
Check screenshots and logs in the UI
If the tool fails, check the rpaTestResults folder for error screenshots and HTML
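The shape of the test command's arguments (a tool name followed by `param=value` pairs, with `cookies=false` as a flag) can be illustrated with a small parser. This is a hypothetical sketch of the argument format only, not the code that `npm run tool` actually runs; `parseToolArgs` and `ToolRun` are names introduced here.

```typescript
// Hypothetical parser: illustrates the argument shape, not the CLI's own code.
interface ToolRun {
  toolName: string;
  params: Record<string, string>;
  useCookies: boolean;
}

function parseToolArgs(args: string[]): ToolRun {
  const [toolName, ...pairs] = args;
  if (!toolName) throw new Error('Missing tool name');

  const params: Record<string, string> = {};
  let useCookies = true;

  for (const pair of pairs) {
    const eq = pair.indexOf('=');
    if (eq <= 0) throw new Error(`Expected <param>=<value>, got "${pair}"`);
    const key = pair.slice(0, eq);
    const value = pair.slice(eq + 1); // keeps any '=' inside the value, e.g. in URLs
    if (key === 'cookies') {
      useCookies = value !== 'false';
    } else {
      params[key] = value;
    }
  }
  return { toolName, params, useCookies };
}
```

Splitting on the first `=` only matters for values such as URLs with query strings, which contain further `=` characters.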
Available Data Sources
In the context of an RPA tool, you have access to:
Tool input parameters: Data extracted by the LLM from the conversation
Environment variables: process.env for configuration and secrets
State and memory: Current conversation state and agent memory
If you need additional data, ask the user whether you should add it to the input parameters.
General Rules
Avoid waitForNavigation: It can lead to hangs. Use waitForSelector instead.
Test before implementing: Use Playwright MCP to test interactions before writing code
Don't close browser sessions: The platform manages browser sessions automatically
Never assume credentials: Always ask the user for credentials if they're unavailable
Verify changes: When modifying an RPA script, re-run the complete flow to verify it works
Follow instructions: Complete only the task specified - don't add extra steps
Use Playwright MCP during development: Prefer interactive testing over running the tool directly while building
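The waitForSelector rule above can be sketched as follows. `PageLike` is a structural stand-in for the two Playwright Page methods used, so the snippet compiles without Playwright installed, and both selectors are hypothetical placeholders for elements on your target site.

```typescript
// Structural stand-in (assumption) for the two Playwright Page methods used here.
interface PageLike {
  click(selector: string): Promise<void>;
  waitForSelector(selector: string, options?: { timeout?: number }): Promise<unknown>;
}

// Clicks an element that triggers a page change, then waits for an element
// that only exists on the destination page instead of waiting for a
// navigation event. Selectors are hypothetical.
async function submitAndWait(page: PageLike): Promise<void> {
  await page.click('button[type="submit"]');
  await page.waitForSelector('.confirmation-message', { timeout: 10000 });
}
```

Waiting for a concrete element ties the step to the state the next action needs, and the explicit timeout turns a stalled page into a clear failure rather than a hang.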
CAPTCHA Handling
If a website requires a CAPTCHA:
Use the resolve_captcha MCP tool during development.
Use the resolveCaptcha function exported by the @minded-ai/mindedjs package in the RPA tool code.
CAPTCHA resolution does not always succeed. Add a retry mechanism that attempts to resolve the CAPTCHA up to 5 times.
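A retry mechanism along these lines could look like the generic helper below. The signature of resolveCaptcha is not shown in this document, so the helper takes the attempt as a callback rather than assuming any arguments; it also assumes a failed attempt rejects (throws). The `retry` name and the `delayMs` parameter are introduced here for illustration.

```typescript
// Generic retry helper. `attempt` is a callback so that no signature is
// assumed for resolveCaptcha. Assumes a failed attempt rejects.
async function retry<T>(
  attempt: () => Promise<T>,
  maxAttempts = 5,
  delayMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let i = 1; i <= maxAttempts; i++) {
    try {
      return await attempt();
    } catch (error) {
      lastError = error;
      // Pause between attempts, but not after the final one.
      if (i < maxAttempts) {
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed: surface the last error to the caller.
  throw lastError;
}
```

Inside a tool, the call to resolveCaptcha would go in the callback with whatever arguments it takes. If it signals failure through a return value instead of throwing, the callback can throw explicitly so the helper retries.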
Tool Registration
RPA tools are registered the same way as regular tools:
Tool Nodes
RPA tools can be used in tool nodes just like regular tools:
The platform will automatically:
Create a browser session
Execute the RPA tool with screenshot and log capture
Display results in the UI
Testing
Local Testing
Debugging Failed Executions
If a tool fails:
Check the rpaTestResults/ folder for:
screenshot.jpeg: Final state screenshot
content.html: HTML content at failure point
Review logs in the UI to see which step failed
Check screenshots to see the visual state at each step
Limitations
Browser Context: Each RPA tool execution uses a fresh browser context; cookies are still preserved across executions (see persistSession, which defaults to true)
Session Management: Browser sessions are managed automatically - don't manually create or destroy them
Concurrent Execution: Multiple RPA tools in the same flow share the same browser session
// Proxy configuration example: custom proxy server
import { Tool, BrowserTaskMode, ProxyConfig, ProxyProvider, MindedProxyMode } from '@minded-ai/mindedjs';

// Custom proxy server
const rpaToolCustom: Tool<typeof schema, Memory> = {
  name: 'rpa_get_order_details',
  description: 'Navigate to a website and extract order details',
  input: schema,
  type: 'rpa',
  proxyConfig: {
    provider: ProxyProvider.CUSTOM,
    server: 'http://proxy.example.com:8080',
    username: 'user', // Optional
    password: 'pass', // Optional
  },
  browserTaskMode: BrowserTaskMode.CLOUD, // Proxy settings apply to cloud/on-prem providers only
  // ... rest of tool definition
};
// Browser provider configuration example
import { Tool, BrowserTaskMode } from '@minded-ai/mindedjs';

const rpaTool: Tool<typeof schema, Memory> = {
  name: 'rpa_get_order_details',
  description: 'Navigate to a website and extract order details',
  input: schema,
  type: 'rpa',
  browserTaskMode: BrowserTaskMode.CLOUD, // Optional: local/cloud/onPrem
  // ... rest of tool definition
};
// Basic RPA tool example
import { z } from 'zod';
import { Tool, logger, BrowserTaskMode, ProxyConfig, ProxyProvider, MindedProxyMode } from '@minded-ai/mindedjs';
import memorySchema from '../schema';

type Memory = z.infer<typeof memorySchema>;

const schema = z.object({
  url: z.string().describe('The website URL to navigate to'),
});
const rpaGetOrderDetailsTool: Tool<typeof schema, Memory> = {
  name: 'rpa_get_order_details',
  description: 'Navigate to a website and extract order details',
  input: schema,
  type: 'rpa', // Mark as RPA tool
  proxyConfig: {
    provider: ProxyProvider.MINDED,
    mode: MindedProxyMode.REGION,
    countryCode: 'US',
  }, // Optional: use US proxy
  browserTaskMode: BrowserTaskMode.CLOUD, // Optional: use cloud provider
  execute: async ({ input, state, agent, page }) => {
    logger.info({
      message: 'Navigating to website',
      sessionId: state.sessionId,
      url: input.url,
    });

    // Navigate to the website
    // Screenshot and log are automatically captured
    await page.goto(input.url, { timeout: 60000 });

    // Wait for content to load
    await page.waitForSelector('.order-summary', { timeout: 10000 });

    // Extract data from the page
    const orderItems = await page.locator('.order-item').allInnerTexts();
    const totalAmount = await page.locator('.total-amount').innerText();

    logger.info({
      message: 'Extracted order details',
      sessionId: state.sessionId,
      itemCount: orderItems.length,
    });

    // Update state if needed
    state.memory.lastOrderUrl = input.url;

    return {
      result: {
        items: orderItems,
        total: totalAmount,
      },
    };
  },
};

export default rpaGetOrderDetailsTool;
// In your tools/index.ts
import rpaGetOrderDetailsTool from './rpaGetOrderDetails';

export default [
  rpaGetOrderDetailsTool,
  // ... other tools
];
nodes:
  - name: 'Extract Order Data'
    type: 'tool'
    toolName: 'rpa_get_order_details'
# Test an RPA tool locally
npm run tool rpa_get_order_details url=https://example.com
# Test without cookies (useful for login flows)
npm run tool rpa_get_order_details url=https://example.com cookies=false