GRPO for Countdown

This example shows how to use the Predibase SDK to train a model to play Countdown with reinforcement fine-tuning (GRPO).
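
GRPO needs a programmatic reward for each sampled completion. As a minimal sketch of what a Countdown reward could look like, the function below assumes the model's answer has already been extracted as an arithmetic string and that the task is to reach a target using each given number exactly once. The names `countdown_reward` and `_evaluate` are hypothetical and are not part of the Predibase SDK; the signature a real reward function must have depends on the SDK.

```python
import ast
import operator
import re
from collections import Counter

# Hypothetical helpers for illustration; not part of the Predibase SDK.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _evaluate(node):
    """Evaluate a parsed expression restricted to numbers and + - * /."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body)
    if (
        isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float))
        and not isinstance(node.value, bool)
    ):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    raise ValueError("unsupported expression")


def countdown_reward(equation, numbers, target):
    """Return 1.0 if `equation` uses each number once and equals `target`, else 0.0."""
    # Assumption: every given number must appear exactly once in the equation.
    used = [int(tok) for tok in re.findall(r"\d+", equation)]
    if Counter(used) != Counter(numbers):
        return 0.0
    try:
        value = _evaluate(ast.parse(equation, mode="eval"))
    except (SyntaxError, ValueError, ZeroDivisionError):
        return 0.0
    return 1.0 if abs(value - target) < 1e-9 else 0.0


# Example: reach 60 from the numbers 3, 5, and 25.
score = countdown_reward("(25 - 5) * 3", [3, 5, 25], 60)
```

Parsing with `ast` instead of calling `eval` keeps arbitrary model output from executing as code; anything other than the four basic operations scores zero.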