POST /extract
const options = {
  method: 'POST',
  headers: {
    'x-api-key': '<your-api-key>',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    url: '<file-url>',
    templateId: '<template-id>',
  }),
};

fetch('https://api.getomni.ai/extract', options)
  .then((response) => response.json())
  .then((response) => console.log(response))
  .catch((err) => console.error(err));
{
  "jobId": "a9b5c4f0-8202-4375-8e2c-4383a5b0d450",
  "result": "https://api.getomni.ai/extract?jobId=a9b5c4f0-8202-4375-8e2c-4383a5b0d450",
  "status": "PENDING"
}
This is an asynchronous API endpoint. The initial request returns a jobId, a status, and a result URL. Use the jobId or the result URL to check the processing status and fetch the results.
When using templates, you can provide a templateId to load predefined configurations. Any configuration parameters (schema, extractPerPage, etc.) explicitly specified in the API request will override the corresponding template settings.
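As a minimal sketch of the override behavior described above, the request body can combine a templateId with explicitly specified parameters. The helper name `buildTemplateBody` and the chosen override values are hypothetical, used only for illustration.

```javascript
// Sketch: start from a template and override selected settings per request.
// `buildTemplateBody` is a hypothetical helper, not part of the API.
function buildTemplateBody(url, templateId, overrides = {}) {
  // Parameters in `overrides` are sent explicitly, so per the note above
  // they take precedence over the corresponding template settings.
  return JSON.stringify({ url, templateId, ...overrides });
}

const requestBody = buildTemplateBody('<file-url>', '<template-id>', {
  extractPerPage: ['balance_due'], // illustrative schema property name
  includeConfidence: true,
});
```

The resulting string can be passed as the `body` of the fetch options shown in the request example.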

Body Parameters

Provide either file or url, but not both. See Accepted File Types.
url
string
required
URL of the document to extract data from.
file
file
required
The file to extract data from. Use multipart/form-data as the Content-Type header.
templateId
string
The template ID used for extraction.
schema
object
JSON schema to define the structure of extracted data. See JSON schema examples.
excludeOCRResult
boolean
Whether to exclude OCR result from the response. Defaults to false.
maintainFormat
boolean
Whether to maintain format from the previous page. Defaults to false.
pageRange
number[]
Array of page numbers to process. Defaults to all pages.
extractPerPage
string[]
Array of schema properties to extract per page. Defaults to empty array.
enableHybridExtraction
boolean
If true, both document images and OCR result will be used to extract data. Defaults to false.
bypassCache
boolean
Whether to bypass the cache and process the document from scratch. Defaults to false.
directImageExtraction
boolean
Whether to extract directly from document images. Defaults to false.
includeConfidence
boolean
Whether to include confidence intervals. Defaults to false.
webhookId
string
Unique identifier for the webhook callback.
metadata
object
Custom JSON data to be included in the response.
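The url and file parameters call for different request shapes: JSON for a URL, multipart/form-data for an uploaded file. The sketch below shows one way to build either. The helper name `buildExtractRequest` is hypothetical, and the multipart field names are an assumption (taken to match the parameter names above), since this page does not spell out the multipart layout.

```javascript
// Sketch: build fetch options for either a URL-based or a file-based request.
// Assumption: multipart field names match the body parameter names
// ("file", "templateId"); verify against the API before relying on this.
function buildExtractRequest(apiKey, { url, file, templateId } = {}) {
  // Exactly one of `url` or `file` must be provided.
  if (Boolean(url) === Boolean(file)) {
    throw new Error('Provide exactly one of `url` or `file`.');
  }

  if (url) {
    const body = { url };
    if (templateId) body.templateId = templateId;
    return {
      method: 'POST',
      headers: { 'x-api-key': apiKey, 'Content-Type': 'application/json' },
      body: JSON.stringify(body),
    };
  }

  const form = new FormData();
  form.append('file', file); // `file` is expected to be a Blob or File
  if (templateId) form.append('templateId', templateId);
  // Content-Type is deliberately not set here: fetch generates the
  // multipart/form-data header, including the boundary, from the FormData body.
  return { method: 'POST', headers: { 'x-api-key': apiKey }, body: form };
}
```

The returned object can be passed directly as the second argument to `fetch('https://api.getomni.ai/extract', options)`.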

Example JSON Schema

This is a JSON Schema, which defines the structure and validation rules for the JSON. For more examples and details, see JSON Schema Examples.
{
  "type": "object",
  "properties": {
    "bill_to": {
      "type": "string",
      "description": "The name of person who receives the invoice"
    },
    "ship_to": {
      "type": "string",
      "description": "The location of the person who receives the invoice"
    },
    "balance_due": {
      "type": "number",
      "description": "The total balance due"
    }
  }
}

Response

jobId
string
Unique identifier for the extraction request.
status
string
Status of the extraction (success, processing, or error).
result
string
URL for polling the extraction result.
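Because the endpoint is asynchronous, a client typically polls the result URL until the job is no longer pending. The following is a rough sketch, not a definitive implementation. It assumes the result URL accepts a GET request with the same x-api-key header, and that "pending" and "processing" (compared case-insensitively) are the only non-final statuses; neither is confirmed on this page. The name `pollExtraction` and its options (`intervalMs`, `maxAttempts`, `fetchImpl`) are hypothetical.

```javascript
// Sketch: poll the `result` URL until the job leaves a pending state.
// Assumptions: GET with the x-api-key header is accepted, and these are
// the only non-final statuses.
const PENDING_STATUSES = new Set(['pending', 'processing']);

async function pollExtraction(resultUrl, apiKey, opts = {}) {
  const { intervalMs = 2000, maxAttempts = 30, fetchImpl = globalThis.fetch } = opts;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await fetchImpl(resultUrl, {
      method: 'GET',
      headers: { 'x-api-key': apiKey },
    });
    if (!response.ok) {
      throw new Error(`Polling failed with HTTP ${response.status}`);
    }

    const job = await response.json();
    const status = String(job.status).toLowerCase();
    if (!PENDING_STATUSES.has(status)) {
      return job; // final state: the caller inspects job.status
    }

    // Wait before the next attempt.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }

  throw new Error(`Job still pending after ${maxAttempts} attempts`);
}
```

A typical call passes the `result` value from the initial response, for example `pollExtraction(initial.result, '<your-api-key>')`. The fixed interval keeps the sketch short; a backoff schedule or the webhookId parameter may be preferable for long-running jobs.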