Evaluation Methods
You can evaluate your fine-tuned models in two ways:
- Online Evaluation: Test your model’s performance in real time through the Predibase API
- Offline Evaluation: Batch evaluate your model’s performance on a test dataset (coming soon)
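As a rough illustration of online evaluation, the sketch below sends each test prompt to a model and scores the replies by exact match. The `generate` callable is a placeholder for whatever request you make to your deployed model. Its name and signature, along with the `exact_match_accuracy` helper and the sample data, are assumptions made for this example and are not part of the Predibase API.

```python
from typing import Callable, Iterable, Tuple


def exact_match_accuracy(
    generate: Callable[[str], str],
    examples: Iterable[Tuple[str, str]],
) -> float:
    """Return the fraction of prompts whose reply matches the reference."""
    total = 0
    correct = 0
    for prompt, reference in examples:
        reply = generate(prompt)  # hypothetical: one real-time request per prompt
        total += 1
        # Compare after trimming whitespace and ignoring case.
        if reply.strip().lower() == reference.strip().lower():
            correct += 1
    return correct / total if total else 0.0


# Stand-in for a deployed model so the sketch runs without network access.
def fake_generate(prompt: str) -> str:
    canned = {"2 + 2 =": "4", "Capital of France?": "paris"}
    return canned.get(prompt, "")


examples = [
    ("2 + 2 =", "4"),
    ("Capital of France?", "Paris"),
    ("Largest planet?", "Jupiter"),
]
accuracy = exact_match_accuracy(fake_generate, examples)
print(f"exact-match accuracy: {accuracy:.2f}")
```

To use this against a real deployment, replace `fake_generate` with a function that calls your model and returns the generated text as a string.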
Evaluation Harness
We provide an evaluation harness as part of our LoRA Bakeoff repository. This harness allows you to:
- Compare different fine-tuning approaches
- Benchmark against baseline models
- Measure performance across multiple metrics
- Evaluate on standard benchmarks
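To make the idea of comparing models across several metrics concrete, here is a minimal sketch of such a comparison loop. It is not the code from the LoRA Bakeoff repository: the `compare` function, the two metrics, the toy models, and the dataset are all hypothetical and exist only to show the general shape.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Model = Callable[[str], str]          # prompt -> generated text
Metric = Callable[[str, str], float]  # (prediction, reference) -> score


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the strings match after trimming whitespace, else 0.0."""
    return float(prediction.strip() == reference.strip())


def token_f1(prediction: str, reference: str) -> float:
    """F1 over whitespace-separated tokens shared by prediction and reference."""
    pred, ref = prediction.split(), reference.split()
    if not pred or not ref:
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def compare(
    models: Dict[str, Model],
    metrics: Dict[str, Metric],
    dataset: List[Tuple[str, str]],
) -> Dict[str, Dict[str, float]]:
    """Return the mean score of every metric for every model."""
    if not dataset:
        raise ValueError("dataset must not be empty")
    results: Dict[str, Dict[str, float]] = {}
    for model_name, model in models.items():
        # Generate once per prompt and reuse the predictions for all metrics.
        predictions = [model(prompt) for prompt, _ in dataset]
        results[model_name] = {
            metric_name: sum(
                metric(pred, ref) for pred, (_, ref) in zip(predictions, dataset)
            ) / len(dataset)
            for metric_name, metric in metrics.items()
        }
    return results


# Toy data and models (assumptions for illustration only).
dataset = [("greet", "hello world"), ("bye", "good bye")]
answers = dict(dataset)
models = {
    "baseline": lambda prompt: "hello",            # always gives the same reply
    "fine_tuned": lambda prompt: answers[prompt],  # pretends to be perfect
}
metrics = {"exact_match": exact_match, "token_f1": token_f1}

results = compare(models, metrics, dataset)
for name, scores in results.items():
    print(name, scores)
```

Keeping models and metrics as plain callables means a baseline, a fine-tuned adapter, or an additional metric can be added by inserting one more dictionary entry.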