Zerox Python SDK
The SDK supports vision models from multiple providers, including OpenAI, Azure OpenAI, Anthropic, and AWS Bedrock.
Installation
Install poppler on the system and make sure it is available on the PATH. See the pdf2image documentation for platform-specific instructions.
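A quick way to confirm the poppler requirement is to look for its command-line tools on the PATH. The sketch below assumes the relevant executables are pdftoppm and pdfinfo, the tools pdf2image calls; the helper name `missing_poppler_tools` is hypothetical and not part of the SDK.

```python
import shutil

# Assumption: these are the poppler executables that pdf2image invokes.
POPPLER_TOOLS = ("pdftoppm", "pdfinfo")


def missing_poppler_tools(tools=POPPLER_TOOLS):
    """Return the names of the given executables that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

An empty list from `missing_poppler_tools()` suggests poppler is reachable; any names it returns point to a missing or mislocated install.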
The pyzerox.zerox function is an asynchronous API that uses vision models to perform OCR (optical character recognition) on PDF files and convert them to markdown. Set the environment variables for the model and the model provider before calling it.
Refer to the LiteLLM Documentation for setting up the environment and passing the correct model name.
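As an illustration of the environment setup, the sketch below checks that provider credentials are present before any OCR call is made. It assumes OpenAI is the provider, in which case LiteLLM reads the key from `OPENAI_API_KEY`; other providers use different variables, so adjust the list per the LiteLLM documentation. The helper name `missing_env_vars` is hypothetical.

```python
import os

# Assumption: OpenAI is the provider, so LiteLLM expects OPENAI_API_KEY.
REQUIRED_ENV = ("OPENAI_API_KEY",)


def missing_env_vars(required=REQUIRED_ENV, environ=None):
    """Return the required variable names that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name)]
```

Failing early on a non-empty result tends to give a clearer error than letting the provider reject the request mid-run.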
Usage
Params
Name | Type | Description |
---|---|---|
file_path | str | Path to the PDF file to process |
cleanup | bool (optional) | Whether to clean up temporary files after processing. Defaults to True |
concurrency | int (optional) | Number of concurrent processes to run. Defaults to 10 |
custom_system_prompt | str (optional) | System prompt to use for the model. Defaults to None |
kwargs | dict (optional) | Additional keyword arguments to pass to the litellm.completion method |
maintain_format | bool (optional) | Whether to maintain the format from the previous page. Defaults to False |
model | str (optional) | Model to use for generating completions. Defaults to gpt-4o-mini |
output_dir | str (optional) | Directory to save the markdown output. Defaults to None |
select_pages | int or Iterable[int] (optional) | Pages to process, can be a single page number or an iterable of page numbers. Defaults to None |
tmp_dir | str (optional) | Directory to store temporary files |
Response
Example
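The following is a minimal sketch of a call, not the official example. It uses only parameter names listed in the Params table, assumes the package exposes `zerox` via `from pyzerox import zerox`, and assumes provider credentials are already set in the environment. The helpers `build_zerox_args` and `run_ocr` are hypothetical names introduced for illustration.

```python
import asyncio


def build_zerox_args(file_path, pages=None, model="gpt-4o-mini"):
    """Collect keyword arguments using parameter names from the Params table."""
    return {
        "file_path": file_path,
        "model": model,
        "select_pages": pages,     # None, a single int, or an iterable of ints
        "maintain_format": False,  # documented default
        "concurrency": 10,         # documented default
        "cleanup": True,           # documented default
    }


async def run_ocr(file_path, pages=None):
    """Await pyzerox.zerox on one PDF and return whatever it returns."""
    # Imported lazily so this module still loads where pyzerox is not installed.
    from pyzerox import zerox

    return await zerox(**build_zerox_args(file_path, pages))
```

Because `zerox` is asynchronous, it has to be awaited, for instance with `asyncio.run(run_ocr("document.pdf", pages=[1, 2]))`, where `document.pdf` is a placeholder path.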