Example: GRPO Finetuning for Playing Countdown
This example shows how to use Reinforcement Finetuning with GRPO in the Predibase SDK to train a model to play Countdown.
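For orientation, Countdown is commonly posed as: given a list of numbers and a target, write an arithmetic expression that evaluates to the target. Reinforcement finetuning needs a programmatic way to score each model answer, so the following is a minimal sketch of such a scoring check in plain Python. It is not Predibase SDK code: the function name `countdown_reward`, its signature, the 0/1 scoring, and the rule that every number must be used exactly once are all assumptions made for illustration.

```python
import ast
import operator

# Binary operators allowed in a Countdown expression (assumed rule set).
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node, used):
    """Recursively evaluate a parsed expression, recording each integer literal."""
    if isinstance(node, ast.Constant) and type(node.value) is int:
        used.append(node.value)
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        left = _evaluate(node.left, used)
        right = _evaluate(node.right, used)
        return _OPS[type(node.op)](left, right)
    # Anything else (names, calls, unary operators, ...) is rejected.
    raise ValueError("unsupported expression")


def countdown_reward(expression, numbers, target):
    """Hypothetical reward: 1.0 if `expression` uses each number exactly once
    and evaluates to `target`, otherwise 0.0."""
    used = []
    try:
        tree = ast.parse(expression, mode="eval")
        value = _evaluate(tree.body, used)
    except (SyntaxError, ValueError, ZeroDivisionError):
        return 0.0
    if sorted(used) != sorted(numbers):
        return 0.0
    return 1.0 if abs(value - target) < 1e-9 else 0.0
```

For example, `countdown_reward("(25 - 5) * 3", [3, 5, 25], 60)` scores a correct answer, while an expression that reuses a number or misses the target scores zero. Parsing with `ast` instead of calling `eval` keeps arbitrary model output from executing as code.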