Supported Models
Well-supported LLMs
Large Language Models (Text)
You can fine-tune LoRA, Turbo LoRA, and Turbo adapters on any of these base LLMs, but note that for Turbo LoRA and Turbo adapters, some models may require additional deployment configuration. A sketch of selecting the adapter type at training time follows the table.
Model Name | Parameters | Architecture | License | Context Window | Supported Fine-Tuning Context Window | Adapter Pre-load Not Required |
---|---|---|---|---|---|---|
solar-1-mini-chat-240612 | 10.7 billion | Llama | Custom License | 32768 | 32768 | ✅ |
solar-pro-preview-instruct-v2 | 22.1 billion | Solar | Custom License | 4096 | 4096 | ❌ |
solar-pro-241126 | 22.1 billion | Solar | Custom License | 32768 | 16384 | ❌ |
mistral-7b | 7 billion | Mistral | Apache 2.0 | 32768 | 32768 | ✅ |
mistral-7b-instruct | 7 billion | Mistral | Apache 2.0 | 32768 | 32768 | ✅ |
mistral-7b-instruct-v0-2 | 7 billion | Mistral | Apache 2.0 | 32768 | 32768 | ✅ |
mistral-7b-instruct-v0-3 | 7 billion | Mistral | Apache 2.0 | 32768 | 32768 | ✅ |
mistral-nemo-12b-2407 | 12 billion | Mistral | Apache 2.0 | 131072 | 32768 | ❌ |
mistral-nemo-12b-instruct-2407 | 12 billion | Mistral | Apache 2.0 | 131072 | 32768 | ❌ |
zephyr-7b-beta | 7 billion | Mistral | MIT | 32768 | 32768 | ✅ |
llama-3-3-70b | 70.6 billion | Llama-3 | Meta (request for commercial use) | 131072 | 8192 | ❌ |
llama-3-2-1b | 1.24 billion | Llama-3 | Meta (request for commercial use) | 32768 | 32768 | ❌ |
llama-3-2-1b-instruct | 1.24 billion | Llama-3 | Meta (request for commercial use) | 32768 | 32768 | ❌ |
llama-3-2-3b | 3.21 billion | Llama-3 | Meta (request for commercial use) | 32768 | 32768 | ❌ |
llama-3-2-3b-instruct | 3.21 billion | Llama-3 | Meta (request for commercial use) | 32768 | 32768 | ❌ |
llama-3-1-8b | 8 billion | Llama-3 | Meta (request for commercial use) | 62999 | 32768 | ❌ |
llama-3-1-8b-instruct | 8 billion | Llama-3 | Meta (request for commercial use) | 62999 | 32768 | ✅ |
llama-3-8b | 8 billion | Llama-3 | Meta (request for commercial use) | 8192 | 8192 | ❌ |
llama-3-8b-instruct | 8 billion | Llama-3 | Meta (request for commercial use) | 8192 | 8192 | ❌ |
llama-3-70b | 70 billion | Llama-3 | Meta (request for commercial use) | 8192 | 8192 | ❌ |
llama-3-70b-instruct | 70 billion | Llama-3 | Meta (request for commercial use) | 8192 | 8192 | ❌ |
llama-2-7b | 7 billion | Llama-2 | Meta (request for commercial use) | 4096 | 4096 | ❌ |
llama-2-7b-chat | 7 billion | Llama-2 | Meta (request for commercial use) | 4096 | 4096 | ❌ |
llama-2-13b | 13 billion | Llama-2 | Meta (request for commercial use) | 4096 | 4096 | ❌ |
llama-2-13b-chat | 13 billion | Llama-2 | Meta (request for commercial use) | 4096 | 4096 | ❌ |
llama-2-70b | 70 billion | Llama-2 | Meta (request for commercial use) | 4096 | 4096 | ❌ |
llama-2-70b-chat | 70 billion | Llama-2 | Meta (request for commercial use) | 4096 | 4096 | ❌ |
codellama-7b | 7 billion | Llama-2 | Meta (request for commercial use) | 4096 | 4096 | ❌ |
codellama-7b-instruct | 7 billion | Llama-2 | Meta (request for commercial use) | 4096 | 4096 | ❌ |
codellama-13b-instruct | 13 billion | Llama-2 | Meta (request for commercial use) | 16384 | 16384 | ❌ |
codellama-70b-instruct | 70 billion | Llama-2 | Meta (request for commercial use) | 4096 | 4096 | ❌ |
mixtral-8x7b-instruct-v0-1 | 46.7 billion | Mixtral | Apache 2.0 | 32768 | 7168 | ❌ |
phi-2 | 2.7 billion | Phi-2 | Microsoft | 2048 | 2048 | Turbo not supported |
phi-3-mini-4k-instruct | 3.8 billion | Phi-3 | Microsoft | 4096 | 4096 | Turbo LoRA not supported |
phi-3-5-mini-instruct | 3.8 billion | Phi-3 | Microsoft | 131072 | 16384 | ❌ |
gemma-2b | 2.5 billion | Gemma | Google | 8192 | 8192 | ❌ |
gemma-2b-instruct | 2.5 billion | Gemma | Google | 8192 | 8192 | ❌ |
gemma-7b | 8.5 billion | Gemma | Google | 8192 | 8192 | ❌ |
gemma-7b-instruct | 8.5 billion | Gemma | Google | 8192 | 8192 | ❌ |
gemma-2-9b | 9.24 billion | Gemma | Google | 8192 | 8192 | ❌ |
gemma-2-9b-instruct | 9.24 billion | Gemma | Google | 8192 | 8192 | ❌ |
gemma-2-27b | 27.2 billion | Gemma | Google | 8192 | 4096 | ❌ |
gemma-2-27b-instruct | 27.2 billion | Gemma | Google | 8192 | 4096 | ❌ |
qwen2-5-1-5b | 1.54 billion | Qwen | Tongyi Qianwen | 131072 | 32768 | ❌ |
qwen2-5-1-5b-instruct | 1.54 billion | Qwen | Tongyi Qianwen | 32768 | 32768 | ❌ |
qwen2-5-7b | 7.62 billion | Qwen | Tongyi Qianwen | 131072 | 32768 | ❌ |
qwen2-5-7b-instruct | 7.62 billion | Qwen | Tongyi Qianwen | 32768 | 32768 | ❌ |
qwen2-5-14b | 14.8 billion | Qwen | Tongyi Qianwen | 131072 | 32768 | ❌ |
qwen2-5-14b-instruct | 14.8 billion | Qwen | Tongyi Qianwen | 32768 | 32768 | ❌ |
qwen2-5-32b | 32.8 billion | Qwen | Tongyi Qianwen | 131072 | 16384 | ❌ |
qwen2-5-32b-instruct | 32.8 billion | Qwen | Tongyi Qianwen | 32768 | 16384 | ❌ |
qwen2-1-5b | 1.54 billion | Qwen | Tongyi Qianwen | 131072 | 32768 | ❌ |
qwen2-1-5b-instruct | 1.54 billion | Qwen | Tongyi Qianwen | 131072 | 32768 | ❌ |
qwen2-7b | 7.62 billion | Qwen | Tongyi Qianwen | 131072 | 32768 | ❌ |
solar-pro-preview-instruct (deprecated) | 22.1 billion | Solar | Custom License | 4096 | 4096 | ❌ |
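The adapter type is chosen at training time. Here is a minimal sketch of that choice, assuming the SDK's `FinetuningConfig` accepts an `adapter` field with values such as `"lora"`, `"turbo_lora"`, or `"turbo"` (check the fine-tuning reference for the exact field name and values):

```python
from predibase import Predibase, FinetuningConfig

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

# Sketch: train a Turbo LoRA adapter on a well-supported base model.
# The `adapter` field and its values are assumptions; consult the SDK reference.
adapter = pb.adapters.create(
    config=FinetuningConfig(
        base_model="mistral-7b-instruct-v0-2",  # supports the full 32768-token fine-tuning context
        adapter="turbo_lora",
    ),
    dataset="my-dataset",  # hypothetical dataset name
    repo="my-adapters",    # hypothetical repository name
)
```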
Many of the latest OSS models are released in two variants:
- Base models (e.g., llama-2-7b): trained primarily on the text-completion objective.
- Instruction-tuned models (e.g., llama-2-7b-chat, mistral-7b-instruct): further trained on (instruction, output) pairs to better respond to human instruction-styled inputs. The instruction tuning effectively constrains the model's output to align with the expected response characteristics or domain knowledge; see the sketch after this list.
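To make the distinction concrete, here is a sketch of how the two variants are typically prompted. The exact template differs by model family; the `[INST]` markers below are Mistral's instruction format:

```python
# Base model (e.g., mistral-7b): plain text completion -- the model continues the text.
base_prompt = "The mitochondria is the powerhouse of"

# Instruction-tuned model (e.g., mistral-7b-instruct): the input is wrapped in the
# model's instruction template. Mistral uses [INST] ... [/INST]; other families differ.
instruct_prompt = "[INST] Summarize the role of mitochondria in one sentence. [/INST]"
```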
Read more about the different types of LoRA adapters here.
(New) Visual Language Models
Model Name | Parameters | Architecture | License | LoRA | Turbo LoRA | Turbo | Context Window | Supported Fine-Tuning Context Window |
---|---|---|---|---|---|---|---|---|
llama-3-2-11b-vision | 10.7 billion | Mllama | Meta (request for commercial use) | ✅ | ❌ | ❌ | 131072 | 32768 |
llama-3-2-11b-vision-instruct | 10.7 billion | Mllama | Meta (request for commercial use) | ✅ | ❌ | ❌ | 131072 | 32768 |
To get started with VLM fine-tuning, check out the user guide on how to format your data for training and test inference. A rough sketch of kicking off a VLM fine-tune follows.
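The flow is the same `pb.adapters.create` call shown for custom LLMs below; the dataset and repository names here are placeholders, and the required multimodal data format is covered in the user guide:

```python
from predibase import Predibase, FinetuningConfig

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

# Sketch: fine-tune a LoRA adapter on a vision-language base model.
# Dataset and repository names are hypothetical.
adapter = pb.adapters.create(
    config=FinetuningConfig(
        base_model="llama-3-2-11b-vision-instruct"
    ),
    dataset="my-vlm-dataset",
    repo="vlm-adapters",
)
```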
Best-Effort LLMs (via HuggingFace)
Best-effort fine-tuning is also offered for any Huggingface LLM meeting the following criteria:
- Has the "Text Generation" and "Transformer" tags
- Does not have a "custom_code" tag
- Are not post-quantized (ex. model containing a quantization method such as "AWQ" in the name)
- Has text inputs and outputs
"Best-effort" means we will try to support these models but it is not guaranteed.
Fine-tuning a custom LLM
- Get the Huggingface ID for your model by clicking the copy icon on the custom base model's page, e.g. "BioMistral/BioMistral-7B".
- Pass the Huggingface ID as the `base_model` in your `FinetuningConfig`.
```python
from predibase import Predibase, FinetuningConfig

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

# Create an adapter repository
repo = pb.repos.create(name="bio-summarizer", description="Bio News Summarizer", exists_ok=True)

# Start a fine-tuning job; blocks until training is finished
adapter = pb.adapters.create(
    config=FinetuningConfig(
        base_model="BioMistral/BioMistral-7B"
    ),
    dataset="bio-dataset",
    repo=repo,
    description="initial model with defaults"
)
```
Predibase streams training metrics to stdout automatically. To view additional metrics in TensorBoard, pass `show_tensorboard=True` to the `create` call:
```python
adapter = pb.adapters.create(
    config=FinetuningConfig(
        base_model="BioMistral/BioMistral-7B"
    ),
    dataset="bio-dataset",
    repo=repo,
    description="initial model with defaults",
    show_tensorboard=True
)
```
Note that TensorBoard data may take some time to refresh.
If you fine-tune a custom model that is not on our shared deployments list, you'll need to deploy the custom base model as a private serverless deployment to run inference on your newly trained adapter.
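A minimal sketch of that flow, reusing the `pb` client from the examples above and assuming the SDK exposes `DeploymentConfig` and a `pb.deployments` interface as shown (the deployment name, prompt, and adapter version are placeholders; check the deployments reference for exact fields):

```python
from predibase import DeploymentConfig

# Sketch: serve the custom base model privately, then query it with the adapter.
pb.deployments.create(
    name="bio-mistral-7b",  # hypothetical deployment name
    config=DeploymentConfig(base_model="BioMistral/BioMistral-7B"),
)

client = pb.deployments.client("bio-mistral-7b")
response = client.generate(
    "Summarize the following article: ...",  # placeholder prompt
    adapter_id="bio-summarizer/1",           # repo-name/version of the trained adapter
)
print(response.generated_text)
```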