Run Extract Sync

const options = {
  method: 'POST',
  headers: {
    'x-api-key': '<your-api-key>',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    url: '<file-url>',
    templateId: '<template-id>',
  }),
};

fetch('https://api.getomni.ai/extract/sync', options)
  .then((response) => response.json())
  .then((response) => console.log(response))
  .catch((err) => console.error(err));

{
  "jobId": "550e8400-e29b-41d4-a716-446655440000",
  "result": {
    "ocr": {
      "pages": [
        {
          "page": 1,
          "content": "# Invoice ...",
          "contentLength": 698
        }
      ],
      "fileName": "7faf9e7fd6cb4b3dbb4accca979023bb",
      "inputTokens": 931,
      "outputTokens": 220,
      "completionTime": 8593
    },
    "extracted": {
      "file_type": "invoice"
    },
    "inputTokens": 292,
    "outputTokens": 7
  }
}

POST /extract/sync

This is a synchronous endpoint: the request stays open and the response is returned only once the document has been processed.

You can provide a templateId to load a template's predefined configuration. Any configuration parameter (schema, extractPerPage, etc.) explicitly specified in the API request overrides the corresponding template setting.
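The override rule can be sketched as a small request builder. This is only an illustration: the helper name `buildExtractRequest` and its `overrides` argument are hypothetical, and only the header names, `url`, `templateId`, and the parameter names come from this page.

```javascript
// Hypothetical helper: builds fetch options for POST /extract/sync.
// Anything placed in `overrides` (e.g. schema, extractPerPage) is sent
// alongside templateId, so it should take precedence over the template.
function buildExtractRequest(apiKey, url, templateId, overrides = {}) {
  return {
    method: 'POST',
    headers: {
      'x-api-key': apiKey,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ url, templateId, ...overrides }),
  };
}

// Use the template, but request balance_due per page for this call only.
const options = buildExtractRequest('<your-api-key>', '<file-url>', '<template-id>', {
  extractPerPage: ['balance_due'],
});

// fetch('https://api.getomni.ai/extract/sync', options) would send the request.
```

Parameters left out of `overrides` are simply not sent, so the template's own values should apply for them.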

Body Parameters

Either file or url is required, but not both. See Accepted File Types.

url

string

required

URL of the document to extract data from

file

required

The file to extract data from. Use multipart/form-data as the Content-Type header.

templateId

string

The template ID used for extraction.

schema

object

JSON schema to define the structure of extracted data. See JSON schema examples.

Overrides the schema from the template if provided.

excludeOCRResult

boolean

Whether to exclude OCR result from the response. Defaults to false.

maintainFormat

boolean

Whether to maintain format from the previous page. Defaults to false.

Overrides the maintainFormat from the template if provided.

pageRange

number[]

Array of page numbers to process. Defaults to all pages.

extractPerPage

string[]

Array of schema properties to extract per page. Defaults to empty array.

Overrides the extractPerPage from the template if provided.

enableHybridExtraction

boolean

If true, both document images and OCR result will be used to extract data. Defaults to false.

bypassCache

boolean

Whether to bypass the cache and process the document from scratch. Defaults to false.

directImageExtraction

boolean

Whether to extract directly from document images. Defaults to false.

includeConfidence

boolean

Whether to include confidence intervals. Defaults to false.

webhookId

string

Unique identifier for the webhook callback
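As a rough sketch of the "either file or url" rule, the helper below builds a JSON request for a URL and a multipart request for an upload. The name `buildSourceRequest`, the `fileName` default, and the way non-string parameters are encoded into form fields are assumptions made for illustration; check them against the API before relying on them.

```javascript
// Hypothetical helper: accepts exactly one of `url` or `file` (a Blob).
function buildSourceRequest(apiKey, { url, file, fileName = 'document', ...params }) {
  const hasUrl = typeof url === 'string' && url.length > 0;
  const hasFile = file !== undefined && file !== null;
  if (hasUrl === hasFile) {
    throw new Error('Provide exactly one of url or file');
  }

  if (hasUrl) {
    return {
      method: 'POST',
      headers: { 'x-api-key': apiKey, 'Content-Type': 'application/json' },
      body: JSON.stringify({ url, ...params }),
    };
  }

  // With a FormData body, fetch sends multipart/form-data and sets the
  // boundary itself, so Content-Type is deliberately not set by hand.
  const form = new FormData();
  form.append('file', file, fileName);
  for (const [key, value] of Object.entries(params)) {
    // Assumption: non-string values are sent JSON-encoded in form fields.
    form.append(key, typeof value === 'string' ? value : JSON.stringify(value));
  }
  return { method: 'POST', headers: { 'x-api-key': apiKey }, body: form };
}
```

The returned object can be passed straight to `fetch('https://api.getomni.ai/extract/sync', ...)`, for example with `{ url: '<file-url>', pageRange: [1, 2] }` or `{ file: someBlob, templateId: '<template-id>' }`.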

Example JSON Schema

The JSON Schema below defines the structure and validation rules for the extracted data. For more examples and details, see JSON Schema Examples.

{
  "type": "object",
  "properties": {
    "bill_to": {
      "type": "string",
      "description": "The name of the person who receives the invoice"
    },
    "ship_to": {
      "type": "string",
      "description": "The location of the person who receives the invoice"
    },
    "balance_due": {
      "type": "number",
      "description": "The total balance due"
    }
  }
}
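To illustrate how the `type` fields in a schema like this relate to the `extracted` object in the response, here is a minimal client-side check. It is not a JSON Schema validator and not something the API provides; `findTypeMismatches` and the sample values are hypothetical, and only primitive types are handled.

```javascript
// Same property types as the example schema (descriptions omitted).
const schema = {
  type: 'object',
  properties: {
    bill_to: { type: 'string' },
    ship_to: { type: 'string' },
    balance_due: { type: 'number' },
  },
};

// Returns the names of properties that are present but have the wrong
// primitive type. Only "string", "number" and "boolean" are meaningful here.
function findTypeMismatches(schema, extracted) {
  const mismatches = [];
  for (const [name, spec] of Object.entries(schema.properties ?? {})) {
    const value = extracted[name];
    if (value === undefined || value === null) continue; // missing is not a mismatch
    if (typeof value !== spec.type) mismatches.push(name);
  }
  return mismatches;
}

// Hypothetical extracted object, invented for illustration only.
const sample = { bill_to: 'Jane Doe', ship_to: '1 Main St', balance_due: '120.50' };
console.log(findTypeMismatches(schema, sample)); // [ 'balance_due' ]
```

Here `balance_due` is flagged because the sample holds a string where the schema asks for a number.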

Response

jobId

string

Unique identifier for the job

result

object

Response object containing OCR results and extracted data

ocr

object

OCR results from document processing

pages

array

Array of processed pages

page

number

Page number

content

string

Raw text content extracted from the page

contentLength

number

Length of the extracted content

fileName

string

Name of the processed file

inputTokens

number

Number of input tokens processed for OCR

outputTokens

number

Number of output tokens generated for OCR

completionTime

number

Processing time in milliseconds

extracted

object

Structured data extracted according to the provided schema

inputTokens

number

Total number of input tokens used in the extraction

outputTokens

number

Total number of output tokens generated for the extraction

confidence

object

Confidence intervals for OCR and extracted values

ocr

array

Confidence intervals per OCR page

page

number

Page number

value

number

Confidence interval for the page

extracted

object

Confidence intervals for extracted structured data
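A small sketch of reading the response fields described above, using the sample response from this page. The helper name `summarizeResponse` is hypothetical, and treating `ocr` as optional is an assumption about what excludeOCRResult does to the payload.

```javascript
// Collects page text and token counts from an /extract/sync response body.
function summarizeResponse(response) {
  const { jobId, result } = response;
  // Assumption: `ocr` may be absent when excludeOCRResult is true.
  const pages = result.ocr?.pages ?? [];
  return {
    jobId,
    text: pages.map((p) => p.content).join('\n\n'),
    ocrTokens: result.ocr
      ? { input: result.ocr.inputTokens, output: result.ocr.outputTokens }
      : null,
    extractionTokens: { input: result.inputTokens, output: result.outputTokens },
    extracted: result.extracted,
  };
}

// Sample response copied from this page.
const sampleResponse = {
  jobId: '550e8400-e29b-41d4-a716-446655440000',
  result: {
    ocr: {
      pages: [{ page: 1, content: '# Invoice ...', contentLength: 698 }],
      fileName: '7faf9e7fd6cb4b3dbb4accca979023bb',
      inputTokens: 931,
      outputTokens: 220,
      completionTime: 8593,
    },
    extracted: { file_type: 'invoice' },
    inputTokens: 292,
    outputTokens: 7,
  },
};

const summary = summarizeResponse(sampleResponse);
console.log(summary.extracted.file_type); // invoice
```

The OCR and extraction token counts are kept separate rather than summed, since the page reports them as distinct figures.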
