Different types of tasks supported for fine-tuning
Predibase supports several post-training fine-tuning methods, each designed for specific use cases. This guide explains the different task types and how to use them effectively.
Supervised Fine-Tuning (SFT) focuses on instruction tuning, where the model is trained specifically on the completions (outputs) to improve its instruction following capabilities. This method teaches the model to generate appropriate responses to user prompts by learning from high-quality examples of prompt-completion pairs.
SFT supports two main formats:

- Instruction Format: for task-oriented applications requiring specific instructions. The dataset requires `prompt` and `completion` columns.
- Chat Format: for conversational applications like chatbots and customer support. The dataset requires a `messages` column in a JSON-style conversation format, with at least one `user` and one `assistant` role per conversation.

Example configuration:
Reinforcement Learning through Group Relative Policy Optimization (GRPO) is an advanced fine-tuning method that applies reinforcement learning techniques to optimize model behavior without requiring labeled data. Unlike traditional Supervised Fine-Tuning, GRPO uses one or more programmable reward functions to score the correctness of generated outputs during training, allowing the model to self-improve by iteratively refining its responses.
This approach is particularly effective for:
Example configuration:
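As a sketch of what a programmable reward function might look like, consider the function below, which scores whether a generated completion wraps its final answer in `<answer>` tags. The exact signature Predibase expects is an assumption here; consult the Reinforcement Learning fine-tuning guide for the real interface.

```python
import re

# Hypothetical GRPO reward function. The (prompt, completion) signature and
# float return value are assumptions; Predibase's actual interface may differ.
def format_reward(prompt: str, completion: str) -> float:
    """Reward completions that wrap their final answer in <answer> tags."""
    return 1.0 if re.search(r"<answer>.+?</answer>", completion, re.DOTALL) else 0.0
```

During GRPO training, several completions are sampled per prompt and scored by functions like this one; each completion's score relative to the group average drives the policy update, so no labeled completions are needed.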
Learn more about how to do Reinforcement Learning fine-tuning on Predibase →
Continued Pre-Training extends the original pre-training phase of the model by training on your raw text data with the standard next-token prediction loss. This approach allows the model to further adapt to domain-specific language patterns and knowledge, improving its overall language understanding and generation capabilities in that domain.
This method is especially valuable for:
The dataset requires a single `text` column containing the training sequences.
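For illustration, such a dataset could be assembled as a CSV with one `text` column; the documents below are placeholders for your own domain text.

```python
import csv
import io

# Placeholder domain documents -- substitute your own corpus here.
documents = [
    "Section 4.2: The lessee shall maintain the premises in good repair.",
    "Section 4.3: Renewal requires written notice ninety days in advance.",
]

# Write a single-column CSV in memory; each row is one training sequence.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["text"])
writer.writeheader()
writer.writerows({"text": doc} for doc in documents)
csv_data = buffer.getvalue()
print(csv_data)
```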
Learn more about how to do continued pre-training on Predibase →
Function calling fine-tuning enables models to learn which function calls to make based on user requests. This specialized form of fine-tuning teaches models to:
This is particularly useful for:
The dataset requires a specific format with tool definitions and examples of their usage.
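To make the shape of such data concrete, here is one illustrative training row pairing a tool definition with an example of its use. It follows the common OpenAI-style `tools`/`tool_calls` schema as an assumption; the exact format Predibase expects is documented in the Function Calling fine-tuning guide.

```python
import json

# Illustrative function-calling training row (schema is an assumption;
# the tool and conversation content are made up).
row = {
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "messages": [
        {"role": "user", "content": "What's the weather in Paris?"},
        {
            "role": "assistant",
            "tool_calls": [
                {
                    "type": "function",
                    "function": {
                        "name": "get_weather",
                        "arguments": json.dumps({"city": "Paris"}),
                    },
                }
            ],
        },
    ],
}

print(json.dumps(row, indent=2))
```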
Learn more about how to do Function Calling fine-tuning on Predibase →