Fine-Tuning Tasks
Different types of tasks supported for fine-tuning
Predibase supports several post-training fine-tuning methods, each designed for specific use cases. This guide explains the different task types and how to use them effectively.
Supervised Fine-tuning
Supervised Fine-Tuning (SFT) focuses on instruction tuning, where the model is trained specifically on the completions (outputs) to improve its instruction following capabilities. This method teaches the model to generate appropriate responses to user prompts by learning from high-quality examples of prompt-completion pairs.
SFT supports two main formats:
- Instruction Format: For task-oriented applications requiring specific instructions
  - Uses `prompt` and `completion` columns in the dataset
  - Ideal for classification, translation, summarization, and creative tasks
  - Example dataset schema: see the sketch after this list
- Chat Format: For conversational applications like chatbots and customer support
  - Uses a `messages` column with JSON-style conversation format
  - Requires at least one `user` and one `assistant` role per conversation
  - Example dataset schema: see the sketch after this list
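For illustration only, the rows below sketch what each format might look like. The column names follow the formats described above, but the contents are made-up examples rather than data from a real Predibase dataset.

```python
# Illustrative rows only; column names follow the formats above, but the
# contents are invented for the example.

# Instruction format: one `prompt` and one `completion` column per row.
instruction_rows = [
    {
        "prompt": "Classify the sentiment of this review: 'The battery dies within an hour.'",
        "completion": "negative",
    },
    {
        "prompt": "Translate to French: 'Where is the train station?'",
        "completion": "Où est la gare ?",
    },
]

# Chat format: a single `messages` column holding a JSON-style conversation
# with at least one `user` and one `assistant` turn.
chat_rows = [
    {
        "messages": [
            {"role": "system", "content": "You are a helpful support agent."},
            {"role": "user", "content": "How do I reset my password?"},
            {"role": "assistant", "content": "Go to Settings > Account and click 'Reset password'."},
        ]
    },
]
```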
Example configuration:
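As a rough sketch only: the snippet below shows the general shape an SFT job could take with the Predibase Python SDK. The `SFTConfig` class name, the `pb.adapters.create` call, and the field values are assumptions rather than a verbatim API; consult the SDK reference for the exact names.

```python
# Sketch of an SFT job, assuming the Predibase Python SDK; class and
# argument names below are best-effort assumptions, not a verbatim API.
from predibase import Predibase, SFTConfig  # assumed import path

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

adapter = pb.adapters.create(
    config=SFTConfig(
        base_model="llama-3-1-8b-instruct",  # assumed model identifier
        epochs=3,
        learning_rate=2e-4,
    ),
    dataset="customer_support_chats",  # dataset previously uploaded to Predibase
    repo="support-assistant",          # adapter repository name
    description="SFT on chat-format support conversations",
)
```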
Reinforcement Fine-tuning
Reinforcement Learning through Group Relative Policy Optimization (GRPO) is an advanced fine-tuning method that applies reinforcement learning techniques to optimize model behavior without requiring labeled data. Unlike traditional Supervised Fine-Tuning, GRPO uses one or more programmable reward functions to score the correctness of generated outputs during training, allowing the model to self-improve by iteratively refining its responses.
This approach is particularly effective for:
- Reasoning tasks where Chain-of-Thought (CoT) prompting helps improve baseline performance
- Scenarios where explicit labels aren’t available but there’s an objective metric
- Optimizing model behavior without extensive human feedback
- Developing generalized strategies for solving tasks
Example configuration:
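As a rough sketch only: the snippet below shows how a GRPO job with a programmable reward function could be wired up. The `GRPOConfig` class, the `reward_fns` field, and the reward function signature are assumptions about the Predibase Python SDK rather than a verbatim API.

```python
# Sketch of a GRPO job with a programmable reward function, assuming the
# Predibase Python SDK; names below are illustrative, not a verbatim API.
from predibase import Predibase, GRPOConfig  # assumed import path

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

def exact_match_reward(prompt: str, completion: str, example: dict) -> float:
    """Score 1.0 when the generated answer matches a known-correct value.

    The (prompt, completion, example) signature is an assumption; check the
    Predibase reward-function docs for the exact interface.
    """
    expected = example.get("answer", "")
    return 1.0 if completion.strip() == expected.strip() else 0.0

adapter = pb.adapters.create(
    config=GRPOConfig(
        base_model="qwen2-5-7b-instruct",                 # assumed model identifier
        reward_fns={"exact_match": exact_match_reward},   # assumed field name
    ),
    dataset="math_word_problems",  # prompts plus a verifiable answer column
    repo="grpo-math-reasoner",
)
```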
Learn more about how to do Reinforcement Learning fine-tuning on Predibase →
Continued Pre-Training
Continued Pre-Training extends the model's original pretraining phase by training on your text data with the next-token prediction objective. This allows the model to adapt to domain-specific language patterns and knowledge, improving its overall language understanding and generation capabilities.
This method is especially valuable for:
- Domain adaptation (legal, medical, technical documentation)
- Learning new vocabulary or writing styles
- Incorporating new knowledge not present in the original training data
- Improving performance on domain-specific tasks
The dataset requires a single `text` column containing the training sequences.
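For illustration only, a few rows in that shape (the text contents are made up):

```python
# Illustrative rows for continued pre-training: a single `text` column
# holding raw domain text. Contents are invented for the example.
pretraining_rows = [
    {"text": "Section 4.2(b): The indemnifying party shall hold harmless ..."},
    {"text": "Dosage guidance: administer 5 mg/kg intravenously every 8 hours ..."},
    {"text": "Release note: the handshake MUST complete before data frames are sent."},
]
```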
Learn more about how to do continued pre-training on Predibase →
Function Calling
Function calling fine-tuning enables models to learn when and how to call external functions (tools). This specialized form of fine-tuning teaches models to:
- Make appropriate function calls based on user requests
- Format arguments correctly according to function schemas
- Handle function responses and incorporate them into replies
This is particularly useful for:
- Building tool-using agents
- API integration
- Structured output generation
The dataset requires a specific format with tool definitions and examples of their usage.
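As an illustrative sketch only, a single training example typically pairs tool definitions with a conversation that uses them. The field names below follow a common OpenAI-style tool schema and are assumptions rather than the exact format Predibase expects; see the function-calling guide linked below.

```python
# Illustrative function-calling example: tool definitions plus a conversation
# that calls one of them. Field names follow a common OpenAI-style tool
# schema and are assumptions, not the exact format Predibase requires.
function_calling_row = {
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "messages": [
        {"role": "user", "content": "What's the weather in Berlin right now?"},
        {
            "role": "assistant",
            "tool_calls": [
                {"function": {"name": "get_weather", "arguments": '{"city": "Berlin"}'}}
            ],
        },
        {"role": "tool", "content": '{"temp_c": 18, "condition": "cloudy"}'},
        {"role": "assistant", "content": "It's currently 18 °C and cloudy in Berlin."},
    ],
}
```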
Learn more about how to do Function Calling fine-tuning on Predibase →
Next Steps
- Learn about dataset preparation for each task type
- Explore different adapter types for fine-tuning
- Start evaluating your fine-tuned models