Node.js
Zerox Node.js SDK
Supports vision models from different providers like OpenAI, Azure OpenAI, Anthropic, AWS Bedrock, Google Gemini, etc.
Installation
Zerox uses graphicsmagick
and ghostscript
for the PDF to image processing step. These should be pulled automatically, but you may need to manually install.
Usage
With file URL
From local path
Params
Name | Type | Description |
---|---|---|
filePath | string | File URL or local path |
credentials | Credentials | Provider-specific credentials |
cleanup | boolean (optional) | Clear images from tmp after run. Defaults to true |
concurrency | number (optional) | Number of pages to run at a time. Defaults to 10 |
errorMode | enum (optional) | Supported enums: THROW , IGNORE . Defaults to IGNORE |
extractOnly | boolean (optional) | Set to true to only extract structured data using a schema. Defaults to false |
extractPerPage | string[] (optional) | Extract data per page instead of the entire document |
imageDensity | number (optional) | DPI for image conversion. Defaults to 300 |
imageHeight | number (optional) | Maximum height for converted images. Defaults to 2048 |
llmParams | object (optional) | Additional parameters to pass to the LLM |
maintainFormat | boolean (optional) | Slower but helps maintain consistent formatting. Defaults to false |
maxRetries | number (optional) | Number of retries to attempt on a failed page. Defaults to 1 |
maxTesseractWorkers | number (optional) | Maximum number of Tesseract workers. Defaults to -1 |
model | enum (optional) | Model to use (see Supported Models). Defaults to OPENAI_GPT_4O |
modelProvider | enum (optional) | Supported enums: OPENAI , BEDROCK , GOOGLE , AZURE . Defaults to OPENAI |
customModelFunction | function (optional) | Runs custom model function |
outputDir | string (optional) | Save combined result.md to a file |
pagesToConvertAsImages | number[] (optional) | Page numbers to convert to image as array |
schema | object (optional) | Schema for structured data extraction |
tempDir | string (optional) | Directory to use for temporary files. Defaults to /os/tmp |
trimEdges | boolean (optional) | Trims pixels from all edges that contain values similar to the given background color, which defaults to that of the top-left pixel. Defaults to true |
The maintainFormat
option tries to return the markdown in a consistent format by passing the output of a prior page in as additional context for the next page. This requires the requests to run synchronously, so it’s a lot slower. But valuable if your documents have a lot of tabular data, or frequently have tables that cross pages.
Supported Models
Zerox supports a wide range of models across different providers:
Data Extraction
Zerox supports structured data extraction from documents using a schema. This allows you to pull specific information from documents in a structured format instead of getting the full markdown conversion.
Set extractOnly: true
and provide a schema
to extract structured data. The schema follows the JSON Schema standard.
Use extractPerPage
to extract data per page instead of from the whole document at once.
You can also set extractionModel
, extractionModelProvider
, and extractionCredentials
to use a different model for extraction than OCR. By default, the same model is used.
Response
Example