Compute Requirements
AWS Instance Type Recommendations for Model Deployment in Private Cloud
The compute requirements below should be sufficient for deploying OmniAI via a Docker container. The recommended configuration may vary with total data throughput:
Compute Requirements (For API + Workers + Frontend)
Recommended: t3.large or t3.xlarge, offering 2–4 vCPUs and 8–16 GiB of memory. This is sufficient to run a Node.js API, a few background workers, and a lightweight frontend without overprovisioning resources.
Database
Use a mid-tier Amazon RDS PostgreSQL instance (e.g., db.m6g.large or db.m5.large), which offers a good balance of performance and cost for moderate query volumes and storage needs.
GPU Requirements (For Document Processing)
Use a single g5.12xlarge instance (approx. $5.672/hr on-demand). This configuration is capable of processing approximately 1 million pages per month, making it suitable for high-throughput document extraction workloads.
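As a sanity check on the GPU budget, the figures above imply a per-page cost. The $5.672/hr rate and ~1 million pages/month throughput are taken from this document; the 730 hours/month (average month) is an assumption for a single always-on instance:

```python
# Back-of-envelope cost check for the GPU tier. The hourly rate and
# monthly page throughput come from the recommendation above; the
# hours-per-month figure assumes one instance running continuously.
HOURLY_RATE = 5.672          # g5.12xlarge on-demand, USD/hr
HOURS_PER_MONTH = 730        # average hours in a month (assumption)
PAGES_PER_MONTH = 1_000_000  # stated processing throughput

monthly_cost = HOURLY_RATE * HOURS_PER_MONTH
cost_per_page = monthly_cost / PAGES_PER_MONTH

print(f"Monthly GPU cost: ${monthly_cost:,.2f}")  # ~ $4,140.56
print(f"Cost per page:    ${cost_per_page:.4f}")  # ~ $0.0041
```

At roughly $4,100/month for the GPU tier, spot or reserved pricing is worth evaluating if the workload tolerates interruption or is long-lived.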
Storage
EBS-optimized instances with Provisioned IOPS SSD (io1 or io2) volumes are recommended. For most deployments, start with at least 500 GB of storage provisioned at 3,000 IOPS; this supports high-performance read/write operations and accommodates the storage demands of the Docker images and data.
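To budget the recommended starting point (500 GB at 3,000 provisioned IOPS), the cost can be sketched as below. Note the per-GB and per-IOPS rates are illustrative assumptions, not quoted AWS pricing; check the current price list for your region and volume type:

```python
# Rough monthly EBS cost estimate for the recommended starting volume.
# The 500 GB / 3,000 IOPS figures come from the recommendation above;
# the unit prices are assumed placeholders, not official AWS rates.
GB_PROVISIONED = 500
IOPS_PROVISIONED = 3_000
PRICE_PER_GB_MONTH = 0.125    # assumed io2 storage rate, USD/GB-month
PRICE_PER_IOPS_MONTH = 0.065  # assumed io2 IOPS rate, USD/IOPS-month

monthly_storage_cost = (GB_PROVISIONED * PRICE_PER_GB_MONTH
                        + IOPS_PROVISIONED * PRICE_PER_IOPS_MONTH)
print(f"Estimated EBS cost: ${monthly_storage_cost:,.2f}/month")
```

With these placeholder rates, provisioned IOPS dominate the volume cost, so size the IOPS to observed workload rather than defaulting higher.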
Networking
Choose instances with at least 10 Gbps network connectivity to ensure high throughput and low latency, which is particularly important for maintaining performance and responsiveness in network-intensive operations.