After fine-tuning your model, it’s crucial to evaluate its performance to ensure it meets your requirements and to identify areas for improvement. Predibase provides several tools and methods for evaluating your fine-tuned adapters.
Evaluation Methods
You can evaluate your fine-tuned models in two ways:
- Online Evaluation: Test your model’s performance in real time through the Predibase API
- Offline Evaluation: Batch evaluate your model’s performance on a test dataset (coming soon)
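As a rough sketch of what online evaluation looks like, you can loop over a held-out test set, send each prompt to your deployed adapter, and score the responses. The snippet below stubs out the API call with a hypothetical `fake_generate` function (a stand-in for a real Predibase client call) and uses exact-match accuracy as an illustrative metric:

```python
def exact_match_accuracy(examples, generate):
    """Score a model by exact-match accuracy over (prompt, expected) pairs.

    `generate` is any callable mapping a prompt string to a completion,
    e.g. a thin wrapper around your Predibase deployment (stubbed here).
    """
    correct = sum(
        1
        for prompt, expected in examples
        if generate(prompt).strip() == expected.strip()
    )
    return correct / len(examples)


# Hypothetical stand-in for a real API call to a fine-tuned adapter.
def fake_generate(prompt):
    canned = {"What is 2 + 2?": "4", "Capital of France?": "Paris"}
    return canned.get(prompt, "")


test_set = [("What is 2 + 2?", "4"), ("Capital of France?", "London")]
print(exact_match_accuracy(test_set, fake_generate))  # 1 of 2 correct -> 0.5
```

In practice you would replace `fake_generate` with a call to your deployed adapter and pick a metric suited to your task (accuracy for classification, ROUGE or F1 for generation).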
Evaluation Harness
We provide an evaluation harness as part of our LoRA Bakeoff repository. This harness allows you to:
- Compare different fine-tuning approaches
- Benchmark against baseline models
- Measure performance across multiple metrics
- Evaluate on standard benchmarks
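The comparison the harness performs can be sketched in a few lines: score each model on the same test set under several metrics and tabulate the results. This is a minimal illustration, not the harness itself; the model outputs are hard-coded stand-ins for real API calls, and the two metrics (exact match and token-level F1) are assumptions chosen for the example:

```python
def exact_match(pred, ref):
    # 1.0 if the prediction matches the reference exactly, else 0.0.
    return float(pred.strip() == ref.strip())


def token_f1(pred, ref):
    # Harmonic mean of token-overlap precision and recall.
    p, r = pred.split(), ref.split()
    overlap = len(set(p) & set(r))
    if not overlap:
        return 0.0
    prec, rec = overlap / len(p), overlap / len(r)
    return 2 * prec * rec / (prec + rec)


references = ["the cat sat", "4"]
outputs = {
    "base-model": ["a cat sat down", "four"],   # hypothetical baseline outputs
    "fine-tuned": ["the cat sat", "4"],         # hypothetical adapter outputs
}

for name, preds in outputs.items():
    em = sum(exact_match(p, r) for p, r in zip(preds, references)) / len(references)
    f1 = sum(token_f1(p, r) for p, r in zip(preds, references)) / len(references)
    print(f"{name}: exact_match={em:.2f}, token_f1={f1:.2f}")
```

Reporting several metrics side by side, as above, is what makes a bakeoff meaningful: a fine-tuned adapter may win on exact match while a baseline stays competitive on softer overlap metrics.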