POST /extract
const options = {
  method: "POST",
  headers: {
    "x-api-key": "<your-api-key>",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    url: "<file-url>",
    templateId: "<template-id>",
  }),
};

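// Submit the extraction job; the JSON body contains jobId, result, and status.
fetch("https://api.getomni.ai/extract", options)
  .then((response) => {
    // fetch only rejects on network failure, so surface HTTP errors explicitly.
    if (!response.ok) throw new Error(`Request failed with HTTP ${response.status}`);
    return response.json();
  })
  .then((job) => console.log(job))
  .catch((err) => console.error(err));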
{
  "jobId": "a9b5c4f0-8202-4375-8e2c-4383a5b0d450",
  "result": "https://api.getomni.ai/extract?jobId=a9b5c4f0-8202-4375-8e2c-4383a5b0d450",
  "status": "PENDING"
}

This is an asynchronous API endpoint. The initial request returns a jobId, a status, and a result URL. Use the jobId (or the result URL) to check the processing status and fetch the results.
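The polling step can be sketched as follows. This is a minimal sketch, not the official client: pollExtraction, its options, and the isPending predicate are hypothetical names, and it assumes the result URL accepts a GET request with the same x-api-key header and returns JSON with a status field. Only the "PENDING" value is taken from the sample response; adjust the predicate to the statuses your jobs actually report.

```javascript
// Poll the `result` URL returned by POST /extract until the job is no longer pending.
// Assumptions (not confirmed by this page): GET + x-api-key on the result URL,
// and a JSON body that carries a `status` field.
async function pollExtraction(resultUrl, apiKey, opts = {}) {
  const {
    intervalMs = 2000, // delay between polls
    maxAttempts = 30, // give up after this many polls
    fetchImpl = globalThis.fetch, // injectable so the function can be tested offline
    // Hypothetical predicate: "PENDING" is the status shown in the sample response.
    isPending = (body) => body.status === "PENDING",
  } = opts;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await fetchImpl(resultUrl, {
      method: "GET",
      headers: { "x-api-key": apiKey },
    });
    if (!response.ok) {
      throw new Error(`Polling failed with HTTP ${response.status}`);
    }
    const body = await response.json();
    if (!isPending(body)) return body; // finished (successfully or not)
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Job still pending after ${maxAttempts} attempts`);
}

// Usage, given the `job` object returned by the initial request:
// pollExtraction(job.result, "<your-api-key>").then(console.log).catch(console.error);
```

A fixed interval with an attempt cap keeps the sketch short; for long-running documents, a webhook (see webhookId) avoids polling altogether.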

When using templates, you can provide a templateId to load predefined configurations. Any configuration parameters (schema, extractPerPage, etc.) explicitly specified in the API request will override the corresponding template settings.
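As an illustration of the override rule, the sketch below builds a request body that loads a template and overrides a single setting. buildTemplateRequestBody is a hypothetical helper, and the placeholder values are the same ones used in the example request; extractPerPage and balance_due are taken from this page.

```javascript
// Build a JSON body that references a template and overrides selected settings.
// Parameters placed in `overrides` are sent at the top level of the body, where,
// per the rule above, they take precedence over the template's values.
function buildTemplateRequestBody(url, templateId, overrides = {}) {
  return JSON.stringify({ url, templateId, ...overrides });
}

// The template supplies every setting except extractPerPage, which this request sets itself.
const body = buildTemplateRequestBody("<file-url>", "<template-id>", {
  extractPerPage: ["balance_due"],
});
```

The resulting string can be passed as the body of the fetch call shown in the example request.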

Body Parameters

Provide either file or url, but not both. See Accepted File Types.

url
string
required

URL of the document to extract data from.

file
file
required

The file to extract data from. Use multipart/form-data as the Content-Type header.

templateId
string

The template ID used for extraction.

schema
object

JSON schema to define the structure of extracted data. See JSON schema examples.

excludeOCRResult
boolean

Whether to exclude OCR result from the response. Defaults to false.

maintainFormat
boolean

Whether to maintain format from the previous page. Defaults to false.

pageRange
number[]

Array of page numbers to process. Defaults to all pages.

extractPerPage
string[]

Array of schema properties to extract per page. Defaults to empty array.

bypassCache
boolean

Whether to bypass the cache and process the document from scratch. Defaults to false.

directImageExtraction
boolean

Whether to extract directly from document images. Defaults to false.

includeConfidence
boolean

Whether to include confidence intervals. Defaults to false.

webhookId
string

Unique identifier for the webhook callback.

metadata
object

Custom JSON data to be included in the response.
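The file parameter above requires multipart/form-data instead of a JSON body. A minimal sketch of such a request follows; buildFileRequest is a hypothetical helper, and how the remaining parameters are encoded as form fields is an assumption (strings sent as-is, everything else JSON-encoded), since this page does not specify it.

```javascript
// Build fetch options for uploading a file via multipart/form-data.
// Assumption: other body parameters travel as extra form fields, with
// non-string values JSON-encoded. Verify this against the API before relying on it.
function buildFileRequest(apiKey, fileBytes, fileName, params = {}) {
  const form = new FormData();
  form.append("file", new Blob([fileBytes]), fileName);
  for (const [key, value] of Object.entries(params)) {
    form.append(key, typeof value === "string" ? value : JSON.stringify(value));
  }
  return {
    method: "POST",
    // No explicit Content-Type: fetch derives multipart/form-data, including
    // the boundary, from the FormData body.
    headers: { "x-api-key": apiKey },
    body: form,
  };
}

// Usage (Node 18+ or a browser):
// fetch("https://api.getomni.ai/extract", buildFileRequest("<your-api-key>", bytes, "invoice.pdf"));
```

Setting the Content-Type header by hand would omit the boundary, which is why the sketch leaves it to fetch.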

Example JSON Schema

The JSON Schema below defines the structure and validation rules for the extracted data. For more examples and details, see JSON Schema Examples.

{
  "type": "object",
  "properties": {
    "bill_to": {
      "type": "string",
      "description": "The name of the person who receives the invoice"
    },
    "ship_to": {
      "type": "string",
      "description": "The location of the person who receives the invoice"
    },
    "balance_due": {
      "type": "number",
      "description": "The total balance due"
    }
  }
}
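A schema like the one above is passed in the schema body parameter. The sketch below sends it inline with a document URL; extractWithSchema and the invoiceSchema constant are hypothetical names, and the fetchImpl argument exists only so the function can be exercised without network access.

```javascript
// The example schema from this section, as a JavaScript object.
const invoiceSchema = {
  type: "object",
  properties: {
    bill_to: { type: "string", description: "The name of the person who receives the invoice" },
    ship_to: { type: "string", description: "The location of the person who receives the invoice" },
    balance_due: { type: "number", description: "The total balance due" },
  },
};

// Submit an extraction job with an inline schema and return the parsed job object.
async function extractWithSchema(fileUrl, apiKey, schema, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl("https://api.getomni.ai/extract", {
    method: "POST",
    headers: { "x-api-key": apiKey, "Content-Type": "application/json" },
    body: JSON.stringify({ url: fileUrl, schema }),
  });
  if (!response.ok) {
    throw new Error(`Request failed with HTTP ${response.status}`);
  }
  return response.json(); // contains jobId, result, and status
}

// Usage:
// extractWithSchema("<file-url>", "<your-api-key>", invoiceSchema).then(console.log);
```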

Response

jobId
string

Unique identifier for the extraction request.

status
string

Status of the extraction (success, processing, or error)

result
string

URL for polling the extraction result.
