Language Models
Available language models and their capabilities
Predibase supports two categories of language models for deployment:
- Officially Supported LLMs - Models for which we provide first-class support, meaning they have been verified and are ensured to work well. They are also available as Shared Endpoints for SaaS customers.
- Custom Base Models - Predibase offers best-effort support for deploying custom base models from Hugging Face.
Quick Start
First, install the Predibase Python SDK:
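```bash
pip install -U predibase
```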
Using Shared Endpoints
Get started quickly with our pre-deployed shared endpoints:
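A minimal sketch, assuming the SDK's `Predibase` client and its `deployments.client(...)` interface (substitute your own API token):

```python
from predibase import Predibase

# Authenticate with your Predibase API token
pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

# Connect to an always-on shared endpoint by model name
client = pb.deployments.client("llama-3-1-8b-instruct")

# Generate a completion
response = client.generate("What is machine learning?", max_new_tokens=256)
print(response.generated_text)
```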
Creating Private Deployments
For production use cases, create your own dedicated deployment:
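For example, a dedicated deployment of `qwen3-8b` that scales to zero when idle. This is a sketch assuming the `DeploymentConfig` fields shown here; see the SDK reference for the full option set:

```python
from predibase import Predibase, DeploymentConfig

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

# Create a dedicated deployment of an officially supported model
pb.deployments.create(
    name="my-qwen3-8b",
    config=DeploymentConfig(
        base_model="qwen3-8b",
        min_replicas=0,  # scale to zero when idle
        max_replicas=1,
    ),
)

# Prompt it the same way as a shared endpoint
client = pb.deployments.client("my-qwen3-8b")
print(client.generate("Hello!", max_new_tokens=64).generated_text)
```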
Officially Supported Models
These models are fully tested, optimized, and supported by Predibase:
DeepSeek Models
Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
deepseek-r1-distill-qwen-32b | 32.8B | 8K | Qwen | MIT | A100 | ❌ |
Additional DeepSeek R1 and V3 models (original or distilled) are available upon request; contact sales@predibase.com to deploy them.
Mistral & Mixtral Models
Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
mistral-7b-instruct-v0-2 | 7B | 32K | Mistral | Apache 2.0 | A100 | ❌ |
mixtral-8x7b-instruct-v0-1 | 47B | 8K | Mistral MoE | Apache 2.0 | A100 | ❌ |
Llama 3 Models
Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
llama-3-3-70b-instruct | 70B | 32K | Llama-3 | Meta | A100 | ❌ |
llama-3-2-1b | 1B | 64K | Llama-3 | Meta | A100 | ❌ |
llama-3-2-1b-instruct | 1B | 64K | Llama-3 | Meta | A100 | ❌ |
llama-3-2-3b | 3B | 32K | Llama-3 | Meta | A100 | ❌ |
llama-3-2-3b-instruct | 3B | 32K | Llama-3 | Meta | A100 | ❌ |
llama-3-1-8b | 8B | 64K | Llama-3 | Meta | A100 | ❌ |
llama-3-1-8b-instruct | 8B | 64K | Llama-3 | Meta | A100 | ✅ |
llama-3-8b | 8B | 8K | Llama-3 | Meta | A10G+ | ❌ |
llama-3-8b-instruct | 8B | 8K | Llama-3 | Meta | A10G+ | ❌ |
llama-3-70b | 70B | 8K | Llama-3 | Meta | A100 | ❌ |
llama-3-70b-instruct | 70B | 8K | Llama-3 | Meta | A100 | ❌ |
Llama 2 Models
Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
llama-2-7b | 7B | 4K | Llama-2 | Meta | A10G+ | ❌ |
llama-2-7b-chat | 7B | 4K | Llama-2 | Meta | A10G+ | ❌ |
llama-2-13b | 13B | 4K | Llama-2 | Meta | A100 | ❌ |
llama-2-13b-chat | 13B | 4K | Llama-2 | Meta | A100 | ❌ |
llama-2-70b | 70B | 4K | Llama-2 | Meta | A100 | ❌ |
llama-2-70b-chat | 70B | 4K | Llama-2 | Meta | A100 | ❌ |
Code Llama Models
Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
codellama-7b | 7B | 4K | Llama-2 | Meta | A10G+ | ❌ |
codellama-7b-instruct | 7B | 4K | Llama-2 | Meta | A10G+ | ❌ |
codellama-13b-instruct | 13B | 4K | Llama-2 | Meta | A100 | ❌ |
codellama-70b-instruct | 70B | 4K | Llama-2 | Meta | A100 | ❌ |
Qwen Models
Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
qwen3-8b | 8.19B | 64K | Qwen | Tongyi Qianwen | A100 | ✅ |
qwen3-14b | 14.8B | 16K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen3-32b | 32.8B | 16K | Qwen | Tongyi Qianwen | A100 | ✅ |
qwen3-30b-a3b | 30.5B | 16K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-coder-3b-instruct | 3.09B | 32K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-coder-7b-instruct | 7.62B | 32K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-coder-32b-instruct | 32.8B | 16K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-1-5b | 1.5B | 64K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-1-5b-instruct | 1.5B | 64K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-7b | 7B | 32K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-7b-instruct | 7B | 32K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-14b | 14B | 32K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-14b-instruct | 14B | 32K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-32b | 32B | 16K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-5-32b-instruct | 32B | 16K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-72b | 72.7B | 32K | Qwen | Tongyi Qianwen | A100 | ❌ |
qwen2-72b-instruct | 72.7B | 32K | Qwen | Tongyi Qianwen | A100 | ❌ |
Solar Models
Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
solar-1-mini-chat-240612 | 10.7B | 32K | Llama | Custom License | A100 | ❌ |
solar-pro-preview-instruct-v2 | 22.1B | 4K | Solar | Custom License | A100 | ❌ |
solar-pro-241126 | 22.1B | 32K | Solar | Custom License | A100 | ❌ |
solar-pro-preview-instruct (deprecated) | 22.1B | 4K | Solar | Custom License | A100 | ❌ |
Gemma Models
Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
gemma-2b | 2.5B | 8K | Gemma | Gemma | A10G+ | ❌ |
gemma-2b-instruct | 2.5B | 8K | Gemma | Gemma | A10G+ | ❌ |
gemma-7b | 8.5B | 8K | Gemma | Gemma | A100 | ❌ |
gemma-7b-instruct | 8.5B | 8K | Gemma | Gemma | A100 | ❌ |
gemma-2-9b | 9.24B | 8K | Gemma | Gemma | A100 | ❌ |
gemma-2-9b-instruct | 9.24B | 8K | Gemma | Gemma | A100 | ❌ |
gemma-2-27b | 27.2B | 8K | Gemma | Gemma | A100 | ❌ |
gemma-2-27b-instruct | 27.2B | 8K | Gemma | Gemma | A100 | ❌ |
Other Models
Model Name | Parameters | Context | Architecture | License | GPU | Always On Shared Endpoint |
---|---|---|---|---|---|---|
zephyr-7b-beta | 7B | 32K | Mistral | MIT | A100 | ❌ |
phi-2 | 2.7B | 2K | Phi-2 | MIT | A10G+ | ❌ |
phi-3-mini-4k-instruct | 3.8B | 4K | Phi-3 | MIT | A10G+ | ❌ |
phi-3-5-mini-instruct | 3.8B | 64K | Phi-3 | MIT | A100 | ❌ |
openhands-lm-32b-v0.1 | 32.8B | 16K | Qwen | Tongyi Qianwen | A100 | ❌ |
For detailed information about how to properly prompt each model, see our Chat Templates guide.
Custom Base Models
Predibase allows you to deploy custom (public or private) base models from Hugging Face.
Model Requirements
Before deploying a custom model, verify these requirements (a quick programmatic check follows the list):
- Architecture Compatibility
  - Uses one of the supported vLLM architectures
  - Has the "Text Generation" and "Transformers" tags
  - Does not have a "custom_code" tag
- Format Requirements
  - Complete model weights
  - Proper configuration files
  - Compatible tokenizer implementation
  - Correct metadata and tags
  - Clear licensing for commercial use (or a private model)
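One way to sanity-check the tag requirements is to inspect the model card with the `huggingface_hub` package before deploying (an illustrative check, not part of the Predibase SDK; the repo ID is a placeholder):

```python
from huggingface_hub import model_info

# Substitute the repo ID of the model you want to deploy
info = model_info("mistralai/Mistral-7B-Instruct-v0.2")

print(info.pipeline_tag)            # should be "text-generation"
print("transformers" in info.tags)  # should be True
print("custom_code" in info.tags)   # should be False
```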
Deploying Custom Models
Deploy a custom model from Hugging Face Hub:
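A sketch, assuming `DeploymentConfig` accepts a Hugging Face repo ID as `base_model` and a hypothetical `hf_token` argument for private or gated repos (check the SDK reference before relying on it):

```python
from predibase import Predibase, DeploymentConfig

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

# Deploy a custom base model directly from the Hugging Face Hub
pb.deployments.create(
    name="my-custom-model",
    config=DeploymentConfig(
        base_model="mistralai/Mistral-7B-Instruct-v0.2",  # any compatible HF repo
        hf_token="<HF_TOKEN>",  # assumed parameter; needed for private/gated repos
        min_replicas=0,
        max_replicas=1,
    ),
)
```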
Next Steps
- View chat templates for each model
- Try shared endpoints for quick testing
- Set up private deployments for production
- Use Fine-tuned Models to customize models
- Explore vision models for image tasks