
Supported Models

Well-supported LLMs

Large Language Models (Text)

You may fine-tune LoRA, Turbo LoRA, and Turbo adapters on any of these base LLMs, but note that for Turbo LoRA and Turbo adapters, some models may require additional deployment configurations.

| Model Name | Parameters | Architecture | License | Context Window (tokens) | Supported Fine-Tuning Context Window (tokens) | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| solar-1-mini-chat-240612 | 10.7 billion | Solar | Custom License | 32768 | 32768 | |
| solar-pro-preview-instruct | 22.1 billion | Solar | Custom License | 4096 | 4096 | |
| mistral-7b | 7 billion | Mistral | Apache 2.0 | 32768 | 32768 | |
| mistral-7b-instruct | 7 billion | Mistral | Apache 2.0 | 32768 | 32768 | |
| mistral-7b-instruct-v0-2 | 7 billion | Mistral | Apache 2.0 | 32768 | 32768 | |
| mistral-7b-instruct-v0-3 | 7 billion | Mistral | Apache 2.0 | 32768 | 32768 | |
| mistral-nemo-12b-2407 | 12 billion | Mistral | Apache 2.0 | 131072 | 32768 | |
| mistral-nemo-12b-instruct-2407 | 12 billion | Mistral | Apache 2.0 | 131072 | 32768 | |
| zephyr-7b-beta | 7 billion | Mistral | MIT | 32768 | 32768 | |
| llama-3-2-1b | 1.24 billion | Llama-3 | Meta (request for commercial use) | 32768 | 32768 | |
| llama-3-2-1b-instruct | 1.24 billion | Llama-3 | Meta (request for commercial use) | 32768 | 32768 | |
| llama-3-2-3b | 3.21 billion | Llama-3 | Meta (request for commercial use) | 32768 | 32768 | |
| llama-3-2-3b-instruct | 3.21 billion | Llama-3 | Meta (request for commercial use) | 32768 | 32768 | |
| llama-3-1-8b | 8 billion | Llama-3 | Meta (request for commercial use) | 62999 | 32768 | |
| llama-3-1-8b-instruct | 8 billion | Llama-3 | Meta (request for commercial use) | 62999 | 32768 | |
| llama-3-8b | 8 billion | Llama-3 | Meta (request for commercial use) | 8192 | 8192 | |
| llama-3-8b-instruct | 8 billion | Llama-3 | Meta (request for commercial use) | 8192 | 8192 | |
| llama-3-70b | 70 billion | Llama-3 | Meta (request for commercial use) | 8192 | 8192 | |
| llama-3-70b-instruct | 70 billion | Llama-3 | Meta (request for commercial use) | 8192 | 8192 | |
| llama-2-7b | 7 billion | Llama-2 | Meta (request for commercial use) | 4096 | 4096 | |
| llama-2-7b-chat | 7 billion | Llama-2 | Meta (request for commercial use) | 4096 | 4096 | |
| llama-2-13b | 13 billion | Llama-2 | Meta (request for commercial use) | 4096 | 4096 | |
| llama-2-13b-chat | 13 billion | Llama-2 | Meta (request for commercial use) | 4096 | 4096 | |
| llama-2-70b | 70 billion | Llama-2 | Meta (request for commercial use) | 4096 | 4096 | |
| llama-2-70b-chat | 70 billion | Llama-2 | Meta (request for commercial use) | 4096 | 4096 | |
| codellama-7b | 7 billion | Llama-2 | Meta (request for commercial use) | 4096 | 4096 | |
| codellama-7b-instruct | 7 billion | Llama-2 | Meta (request for commercial use) | 4096 | 4096 | |
| codellama-13b-instruct | 13 billion | Llama-2 | Meta (request for commercial use) | 16384 | 16384 | |
| codellama-70b-instruct | 70 billion | Llama-2 | Meta (request for commercial use) | 4096 | 4096 | |
| mixtral-8x7b-instruct-v0-1 | 46.7 billion | Mixtral | Apache 2.0 | 32768 | 7168 | |
| phi-2 | 2.7 billion | Phi-2 | Microsoft | 2048 | 2048 | Turbo not supported |
| phi-3-mini-4k-instruct | 3.8 billion | Phi-3 | Microsoft | 4096 | 4096 | Turbo LoRA not supported |
| phi-3-5-mini-instruct | 3.8 billion | Phi-3 | Microsoft | 131072 | 16384 | |
| gemma-2b | 2.5 billion | Gemma | Google | 8192 | 8192 | |
| gemma-2b-instruct | 2.5 billion | Gemma | Google | 8192 | 8192 | |
| gemma-7b | 8.5 billion | Gemma | Google | 8192 | 8192 | |
| gemma-7b-instruct | 8.5 billion | Gemma | Google | 8192 | 8192 | |
| gemma-2-9b | 9.24 billion | Gemma | Google | 8192 | 8192 | |
| gemma-2-9b-instruct | 9.24 billion | Gemma | Google | 8192 | 8192 | |
| gemma-2-27b | 27.2 billion | Gemma | Google | 8192 | 4096 | |
| gemma-2-27b-instruct | 27.2 billion | Gemma | Google | 8192 | 4096 | |
| qwen2-5-1-5b | 1.54 billion | Qwen | Tongyi Qianwen | 131072 | 32768 | |
| qwen2-5-1-5b-instruct | 1.54 billion | Qwen | Tongyi Qianwen | 32768 | 32768 | |
| qwen2-5-7b | 7.62 billion | Qwen | Tongyi Qianwen | 131072 | 32768 | |
| qwen2-5-7b-instruct | 7.62 billion | Qwen | Tongyi Qianwen | 32768 | 32768 | |
| qwen2-5-14b | 14.8 billion | Qwen | Tongyi Qianwen | 131072 | 32768 | |
| qwen2-5-14b-instruct | 14.8 billion | Qwen | Tongyi Qianwen | 32768 | 32768 | |
| qwen2-5-32b | 32.8 billion | Qwen | Tongyi Qianwen | 131072 | 16384 | |
| qwen2-5-32b-instruct | 32.8 billion | Qwen | Tongyi Qianwen | 32768 | 16384 | |
| qwen2-1-5b | 1.54 billion | Qwen | Tongyi Qianwen | 131072 | 32768 | |
| qwen2-1-5b-instruct | 1.54 billion | Qwen | Tongyi Qianwen | 131072 | 32768 | |
| qwen2-7b | 7.62 billion | Qwen | Tongyi Qianwen | 131072 | 32768 | |
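
For example, a minimal fine-tuning job against one of the models above might look like the following. This is a sketch using the Predibase Python SDK; the adapter field shown for selecting between LoRA, Turbo LoRA, and Turbo is an assumption, so check the FinetuningConfig reference for the exact option name.

# A minimal sketch, assuming the Predibase Python SDK; the `adapter` field for
# choosing the adapter type is an assumption based on the types named above.
from predibase import Predibase, FinetuningConfig

pb = Predibase()  # reads PREDIBASE_API_TOKEN from the environment

repo = pb.repos.create(name="demo-adapters", exists_ok=True)
adapter = pb.adapters.create(
    config=FinetuningConfig(
        base_model="mistral-7b-instruct-v0-3",  # any base model from the table above
        adapter="turbo_lora",                   # assumed values: "lora", "turbo_lora", "turbo"
    ),
    dataset="my-dataset",  # illustrative dataset name
    repo=repo,
)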

Many of the latest OSS models are released in two variants:

  • Base model (llama-2-7b, etc.): These models are primarily trained on the objective of text completion.
  • Instruction-tuned (llama-2-7b-chat, mistral-7b-instruct, etc.): These models have been further trained on (instruction, output) pairs to better respond to human instruction-styled inputs. Instruction tuning effectively constrains the model's output to align with the desired response characteristics or domain knowledge; the sketch below shows the resulting difference in expected prompt format.
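
As a rough illustration of the practical difference: an instruction-tuned model expects its input wrapped in the model family's chat template, while a base model is given raw text to complete. The snippet below uses the transformers tokenizer to render Mistral's template; the model ID is illustrative.

# Sketch: inspect the chat template an instruction-tuned model expects.
# Requires the transformers library; the model ID is illustrative.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3")
messages = [{"role": "user", "content": "Summarize this article: ..."}]
print(tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
# -> "<s>[INST] Summarize this article: ... [/INST]"
# A base model such as mistral-7b would instead be prompted with plain text
# to complete, e.g. "Article summary:"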

Read more about the different types of LoRA adapters here.

(New) Visual Language Models

| Model Name | Parameters | Architecture | License | LoRA | Turbo LoRA | Turbo | Context Window (tokens) | Supported Fine-Tuning Context Window (tokens) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| llama-3-2-11b-vision | 10.7 billion | Mllama | Meta (request for commercial use) | | | | 131072 | 32768 |
| llama-3-2-11b-vision-instruct | 10.7 billion | Mllama | Meta (request for commercial use) | | | | 131072 | 32768 |

To get started with VLM fine-tuning, check out the user guide on formatting your data for training and test inference.
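
Once your dataset follows that format, kicking off a VLM fine-tune uses the same adapters.create call as text models; the dataset and repo names below are illustrative.

# A minimal sketch, assuming a dataset already formatted per the VLM user guide.
from predibase import Predibase, FinetuningConfig

pb = Predibase()

repo = pb.repos.create(name="vlm-demo", exists_ok=True)
adapter = pb.adapters.create(
    config=FinetuningConfig(base_model="llama-3-2-11b-vision-instruct"),
    dataset="my-vlm-dataset",  # illustrative; must follow the VLM data format guide
    repo=repo,
)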

Best-Effort LLMs (via HuggingFace)

Best-effort fine-tuning is also offered for any Huggingface LLM meeting the following criteria:

  • Has the "Text Generation" and "Transformer" tags
  • Does not have a "custom_code" tag
  • Are not post-quantized (ex. model containing a quantization method such as "AWQ" in the name)
  • Has text inputs and outputs

"Best-effort" means we will try to support these models but it is not guaranteed.

Fine-tuning a custom LLM

  1. Get the Huggingface ID for your model by clicking the copy icon on the custom base model's page, e.g. "BioMistral/BioMistral-7B".

(Huggingface screenshot)

  2. Pass the Huggingface ID as the base_model.
# Assumes the Predibase Python SDK; reads PREDIBASE_API_TOKEN from the
# environment if api_token is not passed explicitly.
from predibase import Predibase, FinetuningConfig

pb = Predibase()

# Create an adapter repository
repo = pb.repos.create(name="bio-summarizer", description="Bio News Summarizer", exists_ok=True)

# Start a fine-tuning job; blocks until training is finished
adapter = pb.adapters.create(
    config=FinetuningConfig(
        base_model="BioMistral/BioMistral-7B"
    ),
    dataset="bio-dataset",
    repo=repo,
    description="initial model with defaults",
)

Predibase training metrics will be automatically streamed to stdout. To view additional metrics via Tensorboard, pass show_tensorboard=True to the create call:

adapter = pb.adapters.create(
    config=FinetuningConfig(
        base_model="BioMistral/BioMistral-7B"
    ),
    dataset="bio-dataset",
    repo=repo,
    description="initial model with defaults",
    show_tensorboard=True,
)

Note that TensorBoard data may take some time to refresh.

Private serverless deployment needed for inference

Note that if you fine-tune a custom model not on our shared deployments list, you'll need to deploy the custom base model as a private serverless deployment in order to run inference on your newly trained adapter.
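
As a sketch of that flow (the deployment name, config options, and generate call below are illustrative; see the deployments documentation for exact parameters):

# A minimal sketch, assuming the Predibase Python SDK: deploy the custom base
# model privately, then prompt it with the newly trained adapter.
from predibase import Predibase, DeploymentConfig

pb = Predibase()

pb.deployments.create(
    name="bio-mistral-7b",  # illustrative deployment name
    config=DeploymentConfig(base_model="BioMistral/BioMistral-7B"),
)

# Prompt the private deployment with the adapter trained above
client = pb.deployments.client("bio-mistral-7b")
response = client.generate(
    "Summarize the following article: ...",
    adapter_id="bio-summarizer/1",  # adapter repo name / version
    max_new_tokens=256,
)
print(response.generated_text)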