Available language models and their capabilities
Predibase supports two categories of language models for deployment: officially supported models that are pre-tested and optimized (listed below), and custom base models deployed directly from Hugging Face.
First, install the Predibase Python SDK:
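```shell
pip install -U predibase
```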
Get started quickly with our pre-deployed shared endpoints:
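As a sketch, prompting the always-on `llama-3-1-8b-instruct` shared endpoint looks like this (exact client methods may differ by SDK version; see the SDK reference):

```python
import os
from predibase import Predibase

# Assumes PREDIBASE_API_TOKEN is set in your environment.
pb = Predibase(api_token=os.environ["PREDIBASE_API_TOKEN"])

# "llama-3-1-8b-instruct" is one of the always-on shared endpoints listed below.
client = pb.deployments.client("llama-3-1-8b-instruct")
response = client.generate("What is machine learning?", max_new_tokens=128)
print(response.generated_text)
```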
For production use cases, create your own dedicated deployment:
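A minimal sketch of creating a private deployment of a supported model (the deployment name `my-qwen3-8b` is a placeholder, and config fields may vary by SDK version):

```python
import os
from predibase import Predibase, DeploymentConfig

pb = Predibase(api_token=os.environ["PREDIBASE_API_TOKEN"])

# Deploy a supported base model from the tables below on dedicated hardware.
pb.deployments.create(
    name="my-qwen3-8b",  # placeholder name for your deployment
    config=DeploymentConfig(
        base_model="qwen3-8b",
        min_replicas=0,  # scale to zero when idle
        max_replicas=1,
    ),
)
```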
These models are fully tested, optimized, and supported by Predibase:
DeepSeek

Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
deepseek-r1-distill-qwen-32b | 32.8B | 8K | Qwen | MIT | A100 | ❌ |
Additional DeepSeek R1 and V3 models (original or distilled) are available upon request; contact sales@predibase.com to deploy other DeepSeek models.
Mistral

Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
mistral-7b-instruct-v0-2 | 7B | 32K | Mistral | Apache 2.0 | A100 | ❌ |
mixtral-8x7b-instruct-v0-1 | 47B | 8K | Mistral MoE | Apache 2.0 | A100 | ❌ |
Llama 3

Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
llama-3-3-70b-instruct | 70B | 32K | Llama-3 | Meta | A100 | ❌ |
llama-3-2-1b | 1B | 64K | Llama-3 | Meta | A100 | ❌ |
llama-3-2-1b-instruct | 1B | 64K | Llama-3 | Meta | A100 | ❌ |
llama-3-2-3b | 3B | 32K | Llama-3 | Meta | A100 | ❌ |
llama-3-2-3b-instruct | 3B | 32K | Llama-3 | Meta | A100 | ❌ |
llama-3-1-8b | 8B | 64K | Llama-3 | Meta | A100 | ❌ |
llama-3-1-8b-instruct | 8B | 64K | Llama-3 | Meta | A100 | ✅ |
llama-3-8b | 8B | 8K | Llama-3 | Meta | A10G+ | ❌ |
llama-3-8b-instruct | 8B | 8K | Llama-3 | Meta | A10G+ | ❌ |
llama-3-70b | 70B | 8K | Llama-3 | Meta | A100 | ❌ |
llama-3-70b-instruct | 70B | 8K | Llama-3 | Meta | A100 | ❌ |
Llama 2

Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
llama-2-7b | 7B | 4K | Llama-2 | Meta | A10G+ | ❌ |
llama-2-7b-chat | 7B | 4K | Llama-2 | Meta | A10G+ | ❌ |
llama-2-13b | 13B | 4K | Llama-2 | Meta | A100 | ❌ |
llama-2-13b-chat | 13B | 4K | Llama-2 | Meta | A100 | ❌ |
llama-2-70b | 70B | 4K | Llama-2 | Meta | A100 | ❌ |
llama-2-70b-chat | 70B | 4K | Llama-2 | Meta | A100 | ❌ |
Code Llama

Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
codellama-7b | 7B | 4K | Llama-2 | Meta | A10G+ | ❌ |
codellama-7b-instruct | 7B | 4K | Llama-2 | Meta | A10G+ | ❌ |
codellama-13b-instruct | 13B | 4K | Llama-2 | Meta | A100 | ❌ |
codellama-70b-instruct | 70B | 4K | Llama-2 | Meta | A100 | ❌ |
Qwen

Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
qwen3-8b | 8.19B | 64K | Qwen | Tongyi Qianwen | A100 | ✅ |
qwen3-14b | 14.8B | 16K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen3-32b | 32.8B | 16K | Qwen | Tongyi Qianwen | A100 | ✅ |
qwen3-30b-a3b | 30.5B | 16K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-coder-3b-instruct | 3.09B | 32K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-coder-7b-instruct | 7.62B | 32K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-coder-32b-instruct | 32.8B | 16K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-1-5b | 1.5B | 64K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-1-5b-instruct | 1.5B | 64K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-7b | 7B | 32K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-7b-instruct | 7B | 32K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-14b | 14B | 32K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-14b-instruct | 14B | 32K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-32b | 32B | 16K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-32b-instruct | 32B | 16K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-72b | 72.7B | 32K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-72b-instruct | 72.7B | 32K | Qwen | Tongyi Qianwen | A100 | ❌ |
Solar

Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
solar-1-mini-chat-240612 | 10.7B | 32K | Llama | Custom License | A100 | ❌ |
solar-pro-preview-instruct-v2 | 22.1B | 4K | Solar | Custom License | A100 | ❌ |
solar-pro-241126 | 22.1B | 32K | Solar | Custom License | A100 | ❌ |
solar-pro-preview-instruct (deprecated) | 22.1B | 4K | Solar | Custom License | A100 | ❌ |
Gemma

Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
gemma-2b | 2.5B | 8K | Gemma | Gemma | A10G+ | ❌ |
gemma-2b-instruct | 2.5B | 8K | Gemma | Gemma | A10G+ | ❌ |
gemma-7b | 8.5B | 8K | Gemma | Gemma | A100 | ❌ |
gemma-7b-instruct | 8.5B | 8K | Gemma | Gemma | A100 | ❌ |
gemma-2-9b | 9.24B | 8K | Gemma | Gemma | A100 | ❌ |
gemma-2-9b-instruct | 9.24B | 8K | Gemma | Gemma | A100 | ❌ |
gemma-2-27b | 27.2B | 8K | Gemma | Gemma | A100 | ❌ |
gemma-2-27b-instruct | 27.2B | 8K | Gemma | Gemma | A100 | ❌ |
Other models

Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
zephyr-7b-beta | 7B | 32K | Mistral | MIT | A100 | ❌ |
phi-2 | 2.7B | 2K | Phi-2 | MIT | A10G+ | ❌ |
phi-3-mini-4k-instruct | 3.8B | 4K | Phi-3 | MIT | A10G+ | ❌ |
phi-3-5-mini-instruct | 3.8B | 64K | Phi-3 | MIT | A100 | ❌ |
openhands-lm-32b-v0.1 | 32.8B | 16K | Qwen | Tongyi Qianwen | A100 | ❌ |
For detailed information about how to properly prompt each model, see our Chat Templates guide.
Predibase allows you to deploy custom (public or private) base models from Hugging Face.
Before deploying a custom model, verify these requirements:

- Architecture Compatibility: the model must use an architecture supported by the serving engine, such as the Llama, Mistral, Qwen, and Gemma families listed above.
- Format Requirements: the model weights must be in a Hugging Face-compatible format such as `safetensors`.
Deploy a custom model from Hugging Face Hub:
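As a sketch (the repository name is a placeholder, and the exact config fields may differ by SDK version; consult the SDK reference), deploying a Hugging Face model looks like this:

```python
import os
from predibase import Predibase, DeploymentConfig

pb = Predibase(api_token=os.environ["PREDIBASE_API_TOKEN"])

pb.deployments.create(
    name="my-custom-model",  # placeholder deployment name
    config=DeploymentConfig(
        base_model="hf://my-org/my-model",  # placeholder Hugging Face repo
        hf_token=os.environ.get("HF_TOKEN"),  # needed for private or gated repos
    ),
)
```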