
Working with External Data

Learn how you can integrate external data in your data definition


Overview

External Data allows you to pull information from sources outside the document itself. Instead of extracting data from document content, you can fetch data from APIs, databases, or other systems. This is useful when document processing requires additional context or reference data not present in the document.

What is External Data?

External Data is information retrieved from external sources:

  • REST APIs - Fetch data from web services

  • Databases - Query external databases

  • Associated Data - Use data uploaded alongside documents

  • Metadata Services - Look up enrichment data

  • Reference Systems - Fetch product catalogs, price lists, etc.

Example use cases:

  • Look up customer details from CRM using customer ID found in document

  • Fetch current pricing from product database based on SKU

  • Retrieve tax rates from external service based on location

  • Get company information from business directory using tax ID

How External Data Works

The Flow

  1. Document processed - AI extracts document data

  2. External data elements evaluated - Expressions execute

  3. External requests made - API calls or data lookups

  4. Data returned - External data added to document

  5. Combined data available - Both document and external data ready for use

Two Ways to Use External Data

1. Data Element Level

  • Individual data elements fetch external data

  • Each element makes its own external request

  • Useful for single lookups

  • Example: Fetch tax rate for a specific location

2. Data Group Level

  • Entire groups created from external data

  • Expression returns array of objects

  • Creates multiple data objects from external source

  • Example: Fetch all line items from external order system

Configuring External Data at Element Level

Steps

  1. Open your Data Definition

  2. Select the data element

  3. Navigate to the Data Source tab

  4. Set source to External

  5. Go to the Semantics tab

  6. Write a Groovy expression to fetch the data

The Semantics Tab for External Data

When data source is set to External, the Semantics tab provides:

  • Code editor - Groovy code editor with syntax highlighting

  • Expression field - Write Groovy code to fetch external data

  • Full screen mode - Large editor for complex expressions

  • Real-time validation - Syntax checking as you type

Writing External Data Expressions

External data expressions use Groovy code:

Simple value from external data:

externalData.customerName

Nested property access:

externalData.customer.address.city

Array access:

externalData.items[0].price

With default value:

externalData.taxRate ?: 0.08
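
To see how these four forms behave, the sketch below evaluates them against a hand-built map standing in for externalData. The sample values are assumptions for illustration; in a real expression the platform supplies the externalData variable.

```groovy
// Stand-in for the externalData variable (sample values are assumptions).
def externalData = [
    customerName: 'Acme Corp',
    customer    : [address: [city: 'Berlin']],
    items       : [[price: 99.99], [price: 49.99]]
    // taxRate is deliberately missing
]

def name  = externalData.customerName            // simple value
def city  = externalData.customer.address.city   // nested property
def price = externalData.items[0].price          // first array element
def rate  = externalData.taxRate ?: 0.08         // missing, so the default applies

// Caveat: ?: tests Groovy truth, so a legitimate 0 is also replaced.
def zeroRated = [taxRate: 0]
def elvis     = zeroRated.taxRate ?: 0.08
def explicit  = zeroRated.taxRate != null ? zeroRated.taxRate : 0.08
```

If zero is a valid value for a field (a zero tax rate, for example), the explicit null check in the last line is likely the safer choice.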

Configuring External Data at Group Level

Purpose

Create multiple data objects from external data:

  • Repeating groups populated from external arrays

  • Line items from external order system

  • Related records from external database

  • API responses with multiple results

Steps

  1. Create a data group element

  2. Enable "Repeating" checkbox

  3. Set data source to External

  4. Write expression that returns an array

  5. Each array element creates a data object

  6. Child elements reference the current object through the data variable

Group Expression Example

// Return array of line items from external data
externalData.lineItems

Child Element Expressions

Within the group, child elements access the current data object:

// In a child element:
data.productName  // Current line item's product name
data.quantity     // Current line item's quantity
data.price        // Current line item's price
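
The relationship between a group expression and its child expressions can be sketched in plain Groovy. This is only a simulation under stated assumptions: the sample externalData map and the childExpressions map are hypothetical, and the platform's actual evaluation mechanism may differ.

```groovy
// Stand-in for externalData; in the platform this arrives with the document.
def externalData = [lineItems: [
    [productName: 'Widget', quantity: 2, price: 99.99],
    [productName: 'Gadget', quantity: 1, price: 49.99]
]]

// Group expression: returns the array.
def groupResult = externalData.lineItems

// Hypothetical child expressions, each receiving the current item as `data`.
def childExpressions = [
    productName: { data -> data.productName },
    quantity   : { data -> data.quantity },
    lineTotal  : { data -> data.price * data.quantity }
]

// One data object per array element, with every child evaluated against it.
def dataObjects = groupResult.collect { item ->
    childExpressions.collectEntries { name, expr -> [(name): expr(item)] }
}
```

Each entry in dataObjects corresponds to one array element, which mirrors how a repeating group yields one data object per item.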

Providing External Data

Via API at Upload Time

When uploading documents via API, include external data:

POST /api/documentFamilies/{familyId}/externalData
Content-Type: application/json

{
  "customer": {
    "id": "CUST-001",
    "name": "Acme Corp",
    "taxRate": 0.08
  },
  "lineItems": [
    {"sku": "ABC-123", "price": 99.99, "quantity": 2},
    {"sku": "DEF-456", "price": 49.99, "quantity": 1}
  ]
}
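
A client-side sketch of this request in Groovy follows. The endpoint path comes from the example above; baseUrl, the bearer-token Authorization header, and both helper names are assumptions, so check them against your own environment before relying on them.

```groovy
import groovy.json.JsonOutput

// Build the JSON body from plain maps and lists.
def buildExternalDataBody(Map customer, List<Map> lineItems) {
    JsonOutput.toJson([customer: customer, lineItems: lineItems])
}

// Send the body. baseUrl and the bearer-token header are assumptions.
def postExternalData(String baseUrl, String familyId, String token, String body) {
    def conn = new URL("${baseUrl}/api/documentFamilies/${familyId}/externalData").openConnection()
    conn.requestMethod = 'POST'
    conn.doOutput = true
    conn.setRequestProperty('Content-Type', 'application/json')
    conn.setRequestProperty('Authorization', "Bearer ${token}")
    conn.outputStream.withWriter('UTF-8') { it << body }
    conn.responseCode   // HTTP status code returned by the server
}

def body = buildExternalDataBody(
    [id: 'CUST-001', name: 'Acme Corp', taxRate: 0.08],
    [[sku: 'ABC-123', price: 99.99, quantity: 2]]
)
```

Building the body with JsonOutput rather than string concatenation keeps quoting and escaping correct, which helps avoid the invalid-JSON problems listed under Common Issues.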

Via Document Upload Header

Set the externalData header when uploading:

POST /api/stores/{storeId}/upload
externalData: {"customerId": "CUST-001", "orderTotal": 249.97}
file: [document file]

Via API Lookup in Expression

Fetch external data directly in your expression:

// Make an HTTP request from a Groovy expression.
// customerIdFromDocument is a placeholder for a value extracted from the document.
def response = new URL("https://api.example.com/customer/${customerIdFromDocument}").text
def json = new groovy.json.JsonSlurper().parseText(response)
return json.taxRate
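
Reading a URL with .text uses no explicit timeout, so a slow service could hold up processing. A minimal sketch with timeouts is shown below; fetchJson and fieldOrDefault are hypothetical helper names, the 5000 ms default is an arbitrary choice, and whether the platform allows helper methods inside an expression is an assumption.

```groovy
import groovy.json.JsonSlurper

// Fetch and parse JSON with explicit timeouts so a slow service
// cannot stall document processing indefinitely.
def fetchJson(String url, int timeoutMs = 5000) {
    def text = new URL(url).getText(
        connectTimeout: timeoutMs,
        readTimeout: timeoutMs,
        requestProperties: [Accept: 'application/json']
    )
    new JsonSlurper().parseText(text)
}

// Pull one field out of a parsed response, with a fallback when the
// response is not an object or the field is absent.
def fieldOrDefault(Object json, String field, Object fallback) {
    (json instanceof Map && json[field] != null) ? json[field] : fallback
}
```

An expression could then end with something like return fieldOrDefault(fetchJson(url), 'taxRate', 0.08), wrapped in try-catch as described under Best Practices.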

Common Patterns

Customer Lookup

Scenario: Invoice contains customer ID, need full customer details

Expression:

// Assuming customerID was extracted from document
def custId = document.dataObjects.find { it.name == 'customerID' }?.value
def response = new URL("https://api.crm.com/customers/${custId}").text
def customer = new groovy.json.JsonSlurper().parseText(response)
return customer.name

Product Price Lookup

Scenario: Document has SKU, need current price from catalog

Expression:

def sku = document.dataObjects.find { it.name == 'SKU' }?.value
def response = new URL("https://api.catalog.com/products/${sku}").text
def product = new groovy.json.JsonSlurper().parseText(response)
return product.currentPrice

Tax Rate by Location

Scenario: Calculate tax based on shipping address

Expression:

def zipCode = document.dataObjects.find { it.name == 'shippingZip' }?.value
def response = new URL("https://api.tax.com/rates?zip=${zipCode}").text
def taxInfo = new groovy.json.JsonSlurper().parseText(response)
return taxInfo.rate
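
Values extracted from documents can contain spaces or reserved characters, so it is usually safer to encode them before placing them in a query string. The sketch below shows one way to do that; buildRateUrl is a hypothetical helper and the base URL is the example host used above.

```groovy
// Build a lookup URL with the query value percent-encoded.
def buildRateUrl(String baseUrl, String zip) {
    def encoded = URLEncoder.encode(zip.trim(), 'UTF-8')
    "${baseUrl}/rates?zip=${encoded}".toString()
}

def plain   = buildRateUrl('https://api.tax.com', ' 94105 ')
def spaced  = buildRateUrl('https://api.tax.com', 'SW1A 1AA')
def unusual = buildRateUrl('https://api.tax.com', 'a&b=c')
```

URLEncoder applies form encoding (a space becomes +), which suits query values; values placed in the URL path, such as a customer ID, may need different handling.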

Repeating Group from External API

Scenario: Load line items from external order system

Group expression:

def orderId = document.dataObjects.find { it.name == 'orderID' }?.value
def response = new URL("https://api.orders.com/orders/${orderId}/items").text
def items = new groovy.json.JsonSlurper().parseText(response)
return items.lineItems

Child element expressions:

// Product Name element:
data.productName

// Quantity element:
data.quantity

// Price element:
data.unitPrice

Working with Review Data Source

External data is commonly paired with the Review data source:

The Pattern

  1. External data loaded with document

  2. Data elements with "Review" source display in forms

  3. User reviews and corrects both document and external data

  4. Both sets of data saved together

Use Cases

  • Show external reference data alongside document

  • Pre-populate form fields from external systems

  • Allow user to verify/correct external data

  • Combine document extraction with manual data entry

Available Variables in Expressions

Standard Variables

  • externalData - The external data object uploaded with document

  • document - The document being processed

  • metadata - Document metadata (filename, upload date, etc.)

  • family - The document family

In Group Child Elements

  • data - The current data object from the group's array

  • Access properties directly: data.propertyName

Best Practices

  • Handle missing data - Use ?: operator for defaults

  • Validate external responses - Check for null or unexpected formats

  • Consider performance - API calls add processing time

  • Cache when possible - Avoid duplicate requests for same data

  • Error handling - Use try-catch for external calls

  • Document expressions - Add comments for complex lookups

  • Test thoroughly - Verify with various external data scenarios

  • Secure credentials - Don't hardcode API keys in expressions
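
Two of these practices, caching and keeping credentials out of the expression text, can be sketched as follows. Everything here is illustrative: slowLookup stands in for a real HTTP call, EXTERNAL_API_KEY is a hypothetical environment variable name, and whether a cache survives between expression evaluations depends on the platform.

```groovy
// Cache keyed by lookup value, so repeated IDs trigger only one request.
def cache = [:]
def callCount = 0

// Stand-in for a real HTTP lookup (hypothetical; replace with your own call).
def slowLookup = { String id ->
    callCount++
    [id: id, name: "Customer ${id}".toString()]
}

// Consult the cache first and call the lookup only on a miss.
def cachedLookup = { String id ->
    if (!cache.containsKey(id)) {
        cache[id] = slowLookup(id)
    }
    cache[id]
}

// Read the API key from the environment rather than the expression text.
// EXTERNAL_API_KEY is a hypothetical variable name.
def apiKey = System.getenv('EXTERNAL_API_KEY') ?: ''
```

Checking containsKey instead of the cached value means a lookup that legitimately returned null is not repeated.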

Error Handling Example

try {
    def custId = document.dataObjects.find { it.name == 'customerID' }?.value
    if (!custId) return "No customer ID found"

    def response = new URL("https://api.example.com/customers/${custId}").text
    def customer = new groovy.json.JsonSlurper().parseText(response)
    return customer.name ?: "Unknown"
} catch (Exception e) {
    return "Error fetching customer: ${e.message}"
}

Server-Side Processing

The platform handles external data automatically:

  • Loads external data when document is processed

  • Makes it available to expressions via externalData variable

  • Evaluates external data expressions in order

  • Creates data objects based on group expressions

  • Resolves child element expressions with data context

  • Stores both document and external data together

Common Issues and Solutions

External Data Not Available

  • Verify external data was uploaded with document

  • Check API endpoint for external data upload

  • Ensure JSON format is valid

  • Verify expression references correct properties

Expression Returns Null

  • Check property path is correct

  • Use safe navigation: externalData?.property

  • Provide default value: externalData.property ?: 'default'

  • Add error handling with try-catch
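
The difference between plain and safe navigation is easiest to see on a path with a missing intermediate property. The map below is sample data standing in for externalData.

```groovy
// Sample data with no address property (an assumption for illustration).
def externalData = [customer: [name: 'Acme Corp']]

// Safe navigation stops at the first null instead of throwing.
def city = externalData?.customer?.address?.city
def cityOrDefault = externalData?.customer?.address?.city ?: 'Unknown'

// Without ?. the same path throws a NullPointerException.
def threw = false
try {
    externalData.customer.address.city
} catch (NullPointerException e) {
    threw = true
}
```

Here city ends up null, cityOrDefault falls back to 'Unknown', and the unguarded path raises the exception that safe navigation avoids.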

Group Creates No Objects

  • Verify expression returns an array

  • Check array is not empty

  • Ensure external data structure matches expectation

  • Test expression with sample data
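
When the shape of the external response is uncertain, a group expression can normalize it before returning. The helper below is a sketch, not a platform feature: toItemList is a hypothetical name, and defining helper methods inside an expression is assumed to be permitted.

```groovy
// Coerce whatever the external source returned into a list of maps.
def toItemList(Object value) {
    if (value == null) return []                      // nothing supplied
    if (value instanceof Map) return [value]          // single object, not wrapped in an array
    if (value instanceof Collection) return value.findAll { it instanceof Map }
    return []                                         // strings, numbers, and other unexpected types
}
```

A group expression might then end with return toItemList(externalData?.lineItems), so that a missing or malformed value produces an empty group rather than an error.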

Tips

  • External data uses Groovy expressions (same as fallback/serialization)

  • The Semantics tab provides a full-height code editor for complex expressions

  • Group-level external data creates multiple data objects from arrays

  • Child elements access current data object via data variable

  • External data commonly pairs with Review data source

  • Use safe navigation (?.) to prevent null pointer errors

  • Test expressions incrementally as you build them

  • Consider API rate limits when making external calls

  • Document your expressions for future maintenance
