llm.finetune

# Pick a Hugging Face LLM to fine-tune.
# The URI must look something like "hf://meta-llama/Llama-2-7b-hf"
llm = pc.LLM(uri)

# Asynchronous fine-tuning
llm.finetune(prompt_template=None, target=None, dataset=None, epochs=None, learning_rate=None, train_steps=None, config=None, repo=None)

# Synchronous (blocking) fine-tuning
llm.finetune(...).get()

This method trains a finetuned LLM without deploying it. To learn more about finetuning, see our primer.

Parameters:

Where possible, Predibase will use sensible defaults for your fine-tuning job (including generating a training config, selecting an engine for you, etc.).

prompt_template: Optional[Union[str, PromptTemplate]]

The prompt, as either a template string or PromptTemplate object, to be used for finetuning the LLM. If using a template string, the name of the feature from the dataset to use as input should be surrounded by curly brackets ({feature_name}). If instantiating a PromptTemplate object, feature names should be preceded by a dollar sign ($feature_name).
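For illustration, the curly-bracket placeholders in a template string behave like Python's `str.format` substitution (the `ref` column name here is hypothetical):

```python
# Template string form: dataset feature names go in curly brackets.
# (`ref` is a hypothetical column name used for illustration.)
template_str = "Given a target sentence: {ref}, say something."

# At training time, the placeholder is filled with each row's value;
# the substitution itself behaves like Python's str.format:
filled = template_str.format(ref="The game is fun.")
```

The `PromptTemplate` object form instead prefixes feature names with a dollar sign, e.g. `$ref`.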

target: Required[str]

The name of the column or feature in the dataset to finetune against.

dataset: Required[Union[str, Dataset]]

The dataset to use for finetuning (and which should contain the target above). This can either be a Predibase Dataset object, or a raw string mapping to the name of one of your Predibase Datasets.

epochs: Optional[int]

The number of epochs to train for. An epoch is a single complete pass through the entire training dataset. Defaults to 3.

learning_rate: Optional[float]

The learning rate controls how much the model's weights are updated at each training step. Defaults to 0.0002.

train_steps: Optional[int]

The number of train steps to run; an alternative to specifying epochs. A train step is a single forward and backward pass through the model: the model takes in a batch of examples, computes the loss and gradients in the forward pass, and updates its parameters in the backward pass.
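To see how epochs and train steps trade off, the arithmetic below is a sketch (not a Predibase API call; the example counts and batch size of 8 are assumed values):

```python
import math

def epochs_to_train_steps(num_examples: int, batch_size: int, epochs: int) -> int:
    """Number of optimizer steps implied by a given epoch count.

    Each epoch processes every example once, one batch per step.
    """
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epochs

# e.g. 1,000 examples at batch size 8 for 3 epochs:
steps = epochs_to_train_steps(1000, 8, 3)  # 125 steps/epoch * 3 = 375
```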

config: Optional[Union[str, Dict]]

The model config to use for training.
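As a rough sketch only, a config can be passed as a dictionary of training settings. The key names below are illustrative assumptions, not the documented Predibase schema; consult the default config generated for your job for the real keys:

```python
# Illustrative config dict; key names are assumptions, not the
# documented Predibase schema.
config = {
    "base_model": "meta-llama/Llama-2-7b-hf",
    "epochs": 3,              # matches the documented default above
    "learning_rate": 0.0002,  # matches the documented default above
}
```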

repo: Optional[str]

The name of the model repo that will be created for training. If left blank, a repo name will be autogenerated using the base model and dataset name.

Returns:

llm.finetune: A ModelFuture object representing the training job kicked off by Predibase.
llm.finetune.get: A Model object that holds a trained, finetuned LLM model.

Example Usage:

fine_tuned_llm = llm.finetune(
    prompt_template="Given a target sentence: {ref}, say something.",  # `ref` is a column in `viggo`
    target="mr",  # `mr` is a column in `viggo`
    dataset="file_uploads/viggo",
)
# Model repository llama-2-7b-viggo already exists and new models will be added to it.
# Check Status of Model Training Here: https://api.predibase.com/models/version/XXXXX
# Monitoring status of model training...
# Compute summary:
# Cloud: aws
# * T4 16 GB x2
# Training job submitted to Predibase. Track progress here: https://api.predibase.com/models/version/XXXXX

Supported OSS LLMs

See the updated list of LLMs that we support for fine-tuning.