POST /extract/sync
const options = {
  method: "POST",
  headers: {
    "x-api-key": "<your-api-key>",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    url: "<file-url>",
    templateId: "<template-id>",
  }),
};

fetch("https://api.getomni.ai/extract/sync", options)
  .then((response) => response.json())
  .then((response) => console.log(response))
  .catch((err) => console.error(err));
{
  "result": {
    "ocr": {
      "pages": [
        {
          "page": 1,
          "content": "# Invoice ...",
          "contentLength": 698
        }
      ],
      "fileName": "7faf9e7fd6cb4b3dbb4accca979023bb",
      "inputTokens": 931,
      "outputTokens": 220,
      "completionTime": 8593
    },
    "extracted": {
      "file_type": "invoice"
    },
    "inputTokens": 292,
    "outputTokens": 7
  }
}

This is a synchronous API endpoint: the request stays open and the response is returned only once the document has been processed.

When using templates, you can provide a templateId to load predefined configurations. Any configuration parameters (schema, extractPerPage, etc.) explicitly specified in the API request will override the corresponding template settings.
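The override rule above can be pictured as a shallow merge in which request parameters win over template settings. The sketch below is only an illustrative local model, not the service's implementation; `resolveConfig` and the template values are hypothetical.

```javascript
// Illustrative model only: request parameters take precedence over
// the settings loaded from a template. Not the service's actual code.
function resolveConfig(templateSettings, requestParams) {
  // The later spread wins, so any key present in the request
  // replaces the template's value for that key.
  return { ...templateSettings, ...requestParams };
}

// Hypothetical template settings, made up for this example.
const templateSettings = {
  maintainFormat: false,
  extractPerPage: [],
};

// Only extractPerPage is specified in the request.
const requestParams = { extractPerPage: ["balance_due"] };

const effective = resolveConfig(templateSettings, requestParams);
console.log(effective);
```

Here `extractPerPage` comes from the request, while `maintainFormat` keeps the template's value because the request did not specify it.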

Body Parameters

Provide either file or url, but not both. See Accepted File Types.

url
string
required

URL of the document to extract data from

file
file
required

The file to extract data from. Use multipart/form-data as the Content-Type header.
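A minimal sketch of a file upload, assuming an environment with global `FormData` and `Blob` (browsers, or Node.js 18 and later). The helper name `buildUploadOptions` is hypothetical. When the body is a `FormData` object, `fetch` generates the `multipart/form-data` header together with its boundary, so the sketch does not set `Content-Type` by hand.

```javascript
// Hypothetical helper: builds fetch options for uploading a document.
function buildUploadOptions(fileBlob, fileName, apiKey) {
  const form = new FormData();
  // "file" matches the body parameter name documented above.
  form.append("file", fileBlob, fileName);

  return {
    method: "POST",
    // Content-Type is left out on purpose: fetch adds
    // multipart/form-data with the correct boundary for FormData bodies.
    headers: { "x-api-key": apiKey },
    body: form,
  };
}

// Example with an in-memory placeholder blob instead of a real document.
const uploadOptions = buildUploadOptions(
  new Blob(["placeholder"], { type: "application/pdf" }),
  "invoice.pdf",
  "<your-api-key>"
);

// Send with:
// fetch("https://api.getomni.ai/extract/sync", uploadOptions)
```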

templateId
string

The template ID used for extraction.

schema
object

JSON schema to define the structure of extracted data. See JSON schema examples.

Overrides the schema from the template if provided.

excludeOCRResult
boolean

Whether to exclude OCR result from the response. Defaults to false.

maintainFormat
boolean

Whether to maintain format from the previous page. Defaults to false.

Overrides the maintainFormat from the template if provided.

pageRange
number[]

Array of page numbers to process. Defaults to all pages.

extractPerPage
string[]

Array of schema properties to extract per page. Defaults to empty array.

Overrides the extractPerPage from the template if provided.

bypassCache
boolean

Whether to bypass the cache and process the document from scratch. Defaults to false.

directImageExtraction
boolean

Whether to extract directly from document images. Defaults to false.

includeConfidence
boolean

Whether to include confidence intervals. Defaults to false.

webhookId
string

Unique identifier for the webhook callback
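As a sketch of how the optional parameters above might be combined in a URL-based request, the hypothetical helper `buildBody` below drops any option left undefined so that the documented defaults apply. The 1-based page numbers are an assumption based on the sample response, where the first page is reported as `"page": 1`.

```javascript
// Hypothetical helper: builds a JSON body for a URL-based request and
// leaves out options the caller did not set.
function buildBody(url, opts = {}) {
  const body = { url };
  for (const [key, value] of Object.entries(opts)) {
    if (value !== undefined) body[key] = value;
  }
  return body;
}

const requestBody = buildBody("<file-url>", {
  pageRange: [1, 2], // assumption: page numbers are 1-based
  excludeOCRResult: true, // omit the OCR result from the response
  bypassCache: true, // process the document from scratch
  includeConfidence: undefined, // not sent, so the default (false) applies
});

const requestOptions = {
  method: "POST",
  headers: {
    "x-api-key": "<your-api-key>",
    "Content-Type": "application/json",
  },
  body: JSON.stringify(requestBody),
};

// Send with:
// fetch("https://api.getomni.ai/extract/sync", requestOptions)
```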

Example JSON Schema

The JSON Schema below defines the structure and validation rules for the extracted data. For more examples and details, see JSON Schema Examples.

{
  "type": "object",
  "properties": {
    "bill_to": {
      "type": "string",
      "description": "The name of the person who receives the invoice"
    },
    "ship_to": {
      "type": "string",
      "description": "The location of the person who receives the invoice"
    },
    "balance_due": {
      "type": "number",
      "description": "The total balance due"
    }
  }
}
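A minimal sketch of sending this schema in the `schema` body parameter, followed by a small local check of which schema properties are absent from an extracted object. The helper `missingProperties` is hypothetical, and the assumption that `result.extracted` uses the schema's property names as keys is not confirmed by this page.

```javascript
// The example schema from above.
const invoiceSchema = {
  type: "object",
  properties: {
    bill_to: {
      type: "string",
      description: "The name of the person who receives the invoice",
    },
    ship_to: {
      type: "string",
      description: "The location of the person who receives the invoice",
    },
    balance_due: { type: "number", description: "The total balance due" },
  },
};

const schemaOptions = {
  method: "POST",
  headers: {
    "x-api-key": "<your-api-key>",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ url: "<file-url>", schema: invoiceSchema }),
};

// Hypothetical helper: lists schema properties missing from an extracted
// object (assumes extracted keys mirror the schema's property names).
function missingProperties(schema, extracted) {
  return Object.keys(schema.properties).filter((key) => !(key in extracted));
}

// Send with:
// fetch("https://api.getomni.ai/extract/sync", schemaOptions)
//   .then((response) => response.json())
//   .then((data) => console.log(missingProperties(invoiceSchema, data.result.extracted)));
```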

Response

result
object

Response object containing OCR results and extracted data
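A sketch of reading the fields that appear in the sample response. `summarizeResult` is a hypothetical helper. Two assumptions are made: the `ocr` object may be absent when `excludeOCRResult` is true, and the token counts under `ocr` are separate from the top-level ones, so they can be added.

```javascript
// Hypothetical helper: pulls the main fields out of a response.
function summarizeResult(response) {
  const { result } = response;
  // Assumption: ocr may be missing when excludeOCRResult is true.
  const ocr = result.ocr;

  return {
    extracted: result.extracted,
    pageCount: ocr ? ocr.pages.length : 0,
    // Assumption: OCR and extraction token counts are reported separately.
    inputTokens: result.inputTokens + (ocr ? ocr.inputTokens : 0),
    outputTokens: result.outputTokens + (ocr ? ocr.outputTokens : 0),
  };
}

// The sample response from this page.
const sampleResponse = {
  result: {
    ocr: {
      pages: [{ page: 1, content: "# Invoice ...", contentLength: 698 }],
      fileName: "7faf9e7fd6cb4b3dbb4accca979023bb",
      inputTokens: 931,
      outputTokens: 220,
      completionTime: 8593,
    },
    extracted: { file_type: "invoice" },
    inputTokens: 292,
    outputTokens: 7,
  },
};

const summary = summarizeResult(sampleResponse);
console.log(summary.extracted.file_type, summary.inputTokens, summary.outputTokens);
```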
