GRPO for Countdown
Fine-tune a model to play Countdown using reinforcement learning
This example shows how to use the Predibase SDK to train a model to play Countdown with reinforcement fine-tuning.
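Reinforcement fine-tuning needs a programmatic way to score each completion. As a rough illustration of what such a check could look like for Countdown, here is a minimal sketch in plain Python. It assumes the common formulation of the task (combine the given numbers with +, -, * and /, using each number exactly once, to reach a target) and a binary 0/1 reward. The names `countdown_reward` and `_evaluate` are hypothetical and are not part of the Predibase SDK; the signature a real reward function must have is defined by the SDK and is not shown here.

```python
import ast
import operator
from collections import Counter
from fractions import Fraction

# Only the four basic arithmetic operators are allowed (an assumption about the rules).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node, used):
    """Recursively evaluate an arithmetic AST node, appending each integer literal to `used`."""
    if isinstance(node, ast.Constant) and type(node.value) is int:
        used.append(node.value)
        return Fraction(node.value)  # exact arithmetic, so 8 / 3 * 3 == 8
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left = _evaluate(node.left, used)
        right = _evaluate(node.right, used)
        return _OPS[type(node.op)](left, right)
    raise ValueError("unsupported expression")


def countdown_reward(expression, numbers, target):
    """Hypothetical scorer: 1.0 if `expression` uses every number exactly once
    and evaluates to `target`, otherwise 0.0."""
    used = []
    try:
        tree = ast.parse(expression.strip(), mode="eval")
        value = _evaluate(tree.body, used)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return 0.0  # malformed, disallowed, or undefined expressions earn nothing
    if Counter(used) != Counter(numbers):
        return 0.0  # a number was skipped, repeated, or invented
    return 1.0 if value == target else 0.0


print(countdown_reward("(25 - 5) * 4", [4, 5, 25], 80))  # 1.0
print(countdown_reward("25 * 4", [4, 5, 25], 100))       # 0.0 (5 is unused)
```

Parsing the expression with `ast` rather than calling `eval` keeps arbitrary model output from being executed, and `Fraction` avoids floating-point error when a solution involves division.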