- Officially Supported LLMs - Models with first-class support, meaning they have been verified and are known to work well. They are also available as Shared Endpoints for SaaS customers.
- Custom Base Models - Predibase offers best-effort support for deploying custom base models from Hugging Face.
Quick Start
First, install the Predibase Python SDK:
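A one-line install from PyPI (the package name is predibase; `-U` upgrades an existing install):

```bash
pip install -U predibase
```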
Using Shared Endpoints
Get started quickly with our pre-deployed shared endpoints:
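A minimal sketch of prompting a shared endpoint through the SDK. The `pb.deployments.client(...)` / `generate(...)` pattern follows the SDK docs, but exact method names and return fields may vary by SDK version; replace the token placeholder with your own:

```python
from predibase import Predibase

# Authenticate with your Predibase API token (placeholder shown here).
pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

# Target an always-on shared endpoint by its model name.
client = pb.deployments.client("llama-3-1-8b-instruct")

# Generate a completion from the shared endpoint.
resp = client.generate("What is machine learning?", max_new_tokens=128)
print(resp.generated_text)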
Creating Private Deployments
For production use cases, create your own dedicated deployment:
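A sketch of creating a dedicated deployment via `pb.deployments.create`. The `DeploymentConfig` scaling fields shown here are illustrative assumptions; check the SDK reference for your version:

```python
from predibase import Predibase, DeploymentConfig

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

# Spin up a dedicated deployment of an officially supported model.
pb.deployments.create(
    name="my-llama-3-1-8b",
    config=DeploymentConfig(
        base_model="llama-3-1-8b-instruct",
        min_replicas=0,   # assumed: scale to zero when idle
        max_replicas=1,
    ),
)

# Query the private deployment the same way as a shared endpoint.
client = pb.deployments.client("my-llama-3-1-8b")
print(client.generate("Hello!", max_new_tokens=32).generated_text)
```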
Officially Supported Models
These models are fully tested, optimized, and supported by Predibase:
DeepSeek Models
Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
deepseek-r1-distill-qwen-32b | 32.8B | 8K | Qwen | MIT | A100 | ❌ |
Additional DeepSeek R1 and V3 models (original or distilled) are available upon request. Contact sales@predibase.com to deploy other DeepSeek models.
Mistral & Mixtral Models
Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
mistral-7b-instruct-v0-2 | 7B | 32K | Mistral | Apache 2.0 | A100 | ❌ |
mixtral-8x7b-instruct-v0-1 | 47B | 8K | Mistral MoE | Apache 2.0 | A100 | ❌ |
Llama 3 Models
Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
llama-3-3-70b-instruct | 70B | 32K | Llama-3 | Meta | A100 | ❌ |
llama-3-2-1b | 1B | 64K | Llama-3 | Meta | A100 | ❌ |
llama-3-2-1b-instruct | 1B | 64K | Llama-3 | Meta | A100 | ❌ |
llama-3-2-3b | 3B | 32K | Llama-3 | Meta | A100 | ❌ |
llama-3-2-3b-instruct | 3B | 32K | Llama-3 | Meta | A100 | ❌ |
llama-3-1-8b | 8B | 64K | Llama-3 | Meta | A100 | ❌ |
llama-3-1-8b-instruct | 8B | 64K | Llama-3 | Meta | A100 | ✅ |
llama-3-8b | 8B | 8K | Llama-3 | Meta | A10G+ | ❌ |
llama-3-8b-instruct | 8B | 8K | Llama-3 | Meta | A10G+ | ❌ |
llama-3-70b | 70B | 8K | Llama-3 | Meta | A100 | ❌ |
llama-3-70b-instruct | 70B | 8K | Llama-3 | Meta | A100 | ❌ |
Llama 2 Models
Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
llama-2-7b | 7B | 4K | Llama-2 | Meta | A10G+ | ❌ |
llama-2-7b-chat | 7B | 4K | Llama-2 | Meta | A10G+ | ❌ |
llama-2-13b | 13B | 4K | Llama-2 | Meta | A100 | ❌ |
llama-2-13b-chat | 13B | 4K | Llama-2 | Meta | A100 | ❌ |
llama-2-70b | 70B | 4K | Llama-2 | Meta | A100 | ❌ |
llama-2-70b-chat | 70B | 4K | Llama-2 | Meta | A100 | ❌ |
Code Llama Models
Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
codellama-7b | 7B | 4K | Llama-2 | Meta | A10G+ | ❌ |
codellama-7b-instruct | 7B | 4K | Llama-2 | Meta | A10G+ | ❌ |
codellama-13b-instruct | 13B | 4K | Llama-2 | Meta | A100 | ❌ |
codellama-70b-instruct | 70B | 4K | Llama-2 | Meta | A100 | ❌ |
Qwen Models
Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
qwen3-8b | 8.19B | 64K | Qwen | Tongyi Qianwen | A100 | ✅ |
qwen3-14b | 14.8B | 16K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen3-32b | 32.8B | 16K | Qwen | Tongyi Qianwen | A100 | ✅ |
qwen3-30b-a3b | 30.5B | 16K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-coder-3b-instruct | 3.09B | 32K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-coder-7b-instruct | 7.62B | 32K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-coder-32b-instruct | 32.8B | 16K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-1-5b | 1.5B | 64K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-1-5b-instruct | 1.5B | 64K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-7b | 7B | 32K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-7b-instruct | 7B | 32K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-14b | 14B | 32K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-14b-instruct | 14B | 32K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-32b | 32B | 16K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-32b-instruct | 32B | 16K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-72b | 72.7B | 32K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-72b-instruct | 72.7B | 32K | Qwen | Tongyi Qianwen | A100 | ❌ |
Solar Models
Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
solar-1-mini-chat-240612 | 10.7B | 32K | Llama | Custom License | A100 | ❌ |
solar-pro-preview-instruct-v2 | 22.1B | 4K | Solar | Custom License | A100 | ❌ |
solar-pro-241126 | 22.1B | 32K | Solar | Custom License | A100 | ❌ |
solar-pro-preview-instruct (deprecated) | 22.1B | 4K | Solar | Custom License | A100 | ❌ |
Gemma Models
Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
gemma-2b | 2.5B | 8K | Gemma | Gemma | A10G+ | ❌ |
gemma-2b-instruct | 2.5B | 8K | Gemma | Gemma | A10G+ | ❌ |
gemma-7b | 8.5B | 8K | Gemma | Gemma | A100 | ❌ |
gemma-7b-instruct | 8.5B | 8K | Gemma | Gemma | A100 | ❌ |
gemma-2-9b | 9.24B | 8K | Gemma | Gemma | A100 | ❌ |
gemma-2-9b-instruct | 9.24B | 8K | Gemma | Gemma | A100 | ❌ |
gemma-2-27b | 27.2B | 8K | Gemma | Gemma | A100 | ❌ |
gemma-2-27b-instruct | 27.2B | 8K | Gemma | Gemma | A100 | ❌ |
Other Models
Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
zephyr-7b-beta | 7B | 32K | Mistral | MIT | A100 | ❌ |
phi-2 | 2.7B | 2K | Phi-2 | MIT | A10G+ | ❌ |
phi-3-mini-4k-instruct | 3.8B | 4K | Phi-3 | MIT | A10G+ | ❌ |
phi-3-5-mini-instruct | 3.8B | 64K | Phi-3 | MIT | A100 | ❌ |
openhands-lm-32b-v0.1 | 32.8B | 16K | Qwen | Tongyi Qianwen | A100 | ❌ |
Custom Base Models
Predibase allows you to deploy custom (public or private) base models from Hugging Face.
Model Requirements
Before deploying a custom model, verify these requirements (see the snippet after this list for a quick programmatic check):
- Architecture Compatibility
  - Uses one of the supported vLLM architectures
  - Has the “Text Generation” and “Transformers” tags
  - Does not have a “custom_code” tag
- Format Requirements
  - Complete model weights
  - Proper configuration files
  - Compatible tokenizer implementation
  - Correct metadata and tags
  - Clear licensing for commercial use or private models
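To sanity-check the tag requirements above before deploying, you can inspect a repo’s Hub metadata. A minimal sketch using the huggingface_hub library (the repo id is just an example):

```python
from huggingface_hub import model_info

# Fetch Hub metadata for a candidate base model (example repo id).
info = model_info("mistralai/Mistral-7B-Instruct-v0.2")

# The pipeline tag should be "text-generation"; "transformers" should
# appear in the tags and "custom_code" should not.
print(info.pipeline_tag)            # expect: text-generation
print("transformers" in info.tags)  # expect: True
print("custom_code" in info.tags)   # expect: False
```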
Deploying Custom Models
Deploy a custom model from Hugging Face Hub:
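A sketch of deploying a Hugging Face model by its repo id. The `hf://` prefix and the `hf_token` parameter follow the documented pattern but should be verified against your SDK version; the repo id below is illustrative:

```python
from predibase import Predibase, DeploymentConfig

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

# Deploy a custom base model directly from the Hugging Face Hub.
# hf_token is only needed for private or gated repos.
pb.deployments.create(
    name="my-custom-model",
    config=DeploymentConfig(
        base_model="hf://BioMistral/BioMistral-7B",  # example public repo
        hf_token="<HUGGING_FACE_TOKEN>",
    ),
)
```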
Next Steps
- View chat templates for each model
- Try shared endpoints for quick testing
- Set up private deployments for production
- Use Fine-tuned Models to customize models
- Explore vision models for image tasks