Deploy models in your own environment
See the `DeploymentConfig` reference for the full definition and defaults. The table below lists the accelerators you can request; for a usage example, see the sketch after the table.
| Accelerator | ID | Predibase Tiers | GPUs | SKU |
|---|---|---|---|---|
| 1 A10G 24GB | a10_24gb_100 | All | 1 | A10G |
| 1 L40S 48GB | l40s_48gb_100 | All | 1 | L40S |
| 1 L4 24GB | l4_24gb_100 | Enterprise VPC | 1 | L4 |
| 1 A100 80GB | a100_80gb_100 | Enterprise SaaS | 1 | A100 |
| 2 A100 80GB | a100_80gb_200 | Enterprise SaaS | 2 | A100 |
| 4 A10G 24GB | a10_24gb_400 | Enterprise VPC | 4 | A10G |
| 1 H100 80GB PCIe | h100_80gb_pcie_100 | Enterprise SaaS and VPC | 1 | H100 |
| 1 H100 80GB SXM | h100_80gb_sxm_100 | Enterprise SaaS and VPC | 1 | H100 |
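
As a minimal sketch of putting the pieces together, assuming the Predibase Python SDK's `Predibase` client, `pb.deployments.create`, and `DeploymentConfig` (the API token, deployment name, and base model below are illustrative placeholders), a deployment pinned to one of the accelerator IDs above could be created like this:

```python
from predibase import Predibase, DeploymentConfig

# Illustrative placeholders: substitute your own API token,
# deployment name, and base model.
pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

pb.deployments.create(
    name="my-private-llm",  # hypothetical deployment name
    config=DeploymentConfig(
        base_model="mistral-7b-instruct-v0-2",  # illustrative base model
        accelerator="a10_24gb_100",  # an ID from the table above
    ),
)
```

Note that, per the table, some accelerators are only available on specific tiers, so pick an ID your plan supports.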
For VPC deployments, traffic reaches your environment through an AWS PrivateLink VPC endpoint whose DNS name looks like:

`vpce-0123456789-01abcde.vpce-svc-012345abc.us-west-2.vpce.amazonaws.com`
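
As a quick connectivity check from inside your VPC, here is a minimal sketch assuming the endpoint terminates TLS on port 443 (the hostname is the example above; substitute your own):

```python
import socket
import ssl

# Example endpoint from above; substitute the DNS name from your
# environment. This check must run from inside the connected VPC.
ENDPOINT = "vpce-0123456789-01abcde.vpce-svc-012345abc.us-west-2.vpce.amazonaws.com"

# Confirm the private DNS name resolves to an in-VPC address.
addr = socket.getaddrinfo(ENDPOINT, 443, proto=socket.IPPROTO_TCP)[0][4][0]
print(f"{ENDPOINT} resolves to {addr}")

# Confirm the endpoint accepts a TLS handshake on port 443.
ctx = ssl.create_default_context()
with socket.create_connection((ENDPOINT, 443), timeout=5) as sock:
    with ctx.wrap_socket(sock, server_hostname=ENDPOINT) as tls:
        print("TLS OK:", tls.version())
```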
Set `min_replicas=0` to avoid paying for GPU hours when your deployment isn't receiving requests.
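
For example, here is a sketch of enabling scale-to-zero on an existing deployment, assuming the SDK exposes an `UpdateDeploymentConfig` that mirrors the create-time config (the deployment name is hypothetical):

```python
from predibase import Predibase, UpdateDeploymentConfig

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

# Assumption: `UpdateDeploymentConfig` mirrors `DeploymentConfig`
# for the mutable autoscaling fields.
pb.deployments.update(
    deployment_ref="my-private-llm",  # hypothetical deployment name
    config=UpdateDeploymentConfig(
        min_replicas=0,  # scale to zero when idle; no GPU hours accrue
        max_replicas=1,
    ),
)
```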