Fine-tuned Adapters
Learn how to use custom fine-tuned adapters with Predibase models
Predibase supports using fine-tuned adapters to customize model behavior. You can:
- Upload local PEFT adapters
- Use adapters from Hugging Face Hub
- Deploy adapters with any compatible base model
Quick Start
First, install the Predibase Python SDK:
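```bash
pip install predibase
```

Then initialize the client with your API token. A minimal sketch; the `Predibase` client shown here follows the SDK's standard initialization, but check the SDK reference if your version differs:

```python
from predibase import Predibase

# Authenticate with your Predibase API token.
pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")
```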
Adapter Sources and IDs
Adapters can come from three sources:
1. Predibase Adapters
You can fine-tune adapters on Predibase using the SDK or UI. Once trained, you can prompt them using a deployment of the base model that was fine-tuned, without needing to pre-load the adapter or recreate the deployment. All deployments support adapter inference out of the box.
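For example, here is a sketch of prompting a Predibase-trained adapter. The repo name `my-repo`, version `1`, and the `qwen3-8b` deployment are placeholders; Predibase adapter IDs take the form `<repo-name>/<version>`:

```python
from predibase import Predibase

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

# Any deployment of the matching base model can serve the adapter.
client = pb.deployments.client("qwen3-8b")

# Pass the adapter ID ("<repo-name>/<version>") at request time; the
# deployment loads the adapter on demand.
response = client.generate(
    "Write a headline for the following article: ...",
    adapter_id="my-repo/1",
    max_new_tokens=64,
)
print(response.generated_text)
```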
2. Upload Local Adapters
Import an adapter trained outside of Predibase for inference. Your local adapter directory must follow the PEFT format:
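At a minimum, PEFT saves a config file and the adapter weights, so the directory should look roughly like this (`adapter_model.bin` in older PEFT versions):

```
my-local-adapter/
├── adapter_config.json        # base model, LoRA rank, target modules, ...
└── adapter_model.safetensors  # the adapter weights
```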
Then upload and prompt the adapter:
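A sketch of the flow, assuming an upload helper along the lines of `pb.adapters.upload` (a hypothetical name; consult the SDK reference for the exact call and its parameters):

```python
from predibase import Predibase

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

# Hypothetical upload call: import the local PEFT directory into an
# adapter repo so it can be served. The exact method may differ by
# SDK version.
pb.adapters.upload("./my-local-adapter", repo="my-repo")

# Prompt through a deployment of the matching base model.
client = pb.deployments.client("qwen3-8b")
print(client.generate("...", adapter_id="my-repo/1").generated_text)
```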
3. Hugging Face Hub Adapters
When using adapters from Hugging Face:
- Adapter ID format: `"organization/adapter-name"`
- Examples: `"predibase/tldr_headline_gen"`, `"predibase/mistral-instruct"`
- Must specify `adapter_source="hub"`
Public Adapters
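A sketch using one of the public example adapters above. The `mistral-7b` deployment name is an assumption (the base model must match the one the adapter was fine-tuned from), and `adapter_source` is assumed to be accepted by `generate` as in the LoRAX API:

```python
from predibase import Predibase

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")
client = pb.deployments.client("mistral-7b")  # must match the adapter's base model

response = client.generate(
    "...",  # include the prompt template the adapter was trained with
    adapter_id="predibase/tldr_headline_gen",
    adapter_source="hub",
    max_new_tokens=64,
)
print(response.generated_text)
```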
Private Adapters
To run inference on your private adapter, you’ll additionally need:
- A Hugging Face API token with write access to the adapter repository
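A sketch, assuming your Hugging Face token is exported as `HF_TOKEN` and that `generate` forwards it through an `api_token` parameter, as the LoRAX API does:

```python
import os

from predibase import Predibase

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")
client = pb.deployments.client("qwen3-8b")  # must match the adapter's base model

response = client.generate(
    "...",
    adapter_id="my-org/my-private-adapter",  # placeholder private repo
    adapter_source="hub",
    api_token=os.environ["HF_TOKEN"],  # Hugging Face token for the private repo
    max_new_tokens=64,
)
print(response.generated_text)
```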
With the REST API
Access adapters through the REST API for language-agnostic integration. First, set up your environment variables:
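For example (`PREDIBASE_API_TOKEN` and `PREDIBASE_TENANT_ID` are the conventional names; substitute your own values):

```bash
export PREDIBASE_API_TOKEN="<your API token>"
export PREDIBASE_TENANT_ID="<your tenant ID>"
export PREDIBASE_DEPLOYMENT="qwen3-8b"
```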
For `PREDIBASE_DEPLOYMENT`, the base model must correspond to the model that was fine-tuned:
- For shared LLMs, use the model name (e.g., "qwen3-8b")
- For private serverless deployments, use your deployment name (e.g., "my-qwen3-8b")
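With those set, a minimal request sketch. The serving URL pattern below follows Predibase's LoRAX-style endpoints but is an assumption here, so verify it against the REST API reference:

```python
import os

import requests

url = (
    f"https://serving.app.predibase.com/{os.environ['PREDIBASE_TENANT_ID']}"
    f"/deployments/v2/llms/{os.environ['PREDIBASE_DEPLOYMENT']}/generate"
)

payload = {
    # Use the prompt template the adapter was fine-tuned with.
    "inputs": "...",
    "parameters": {
        "adapter_id": "my-repo/1",  # placeholder adapter ID
        "adapter_source": "pbase",  # use "hub" for Hugging Face adapters
        "max_new_tokens": 128,
    },
}

resp = requests.post(
    url,
    headers={"Authorization": f"Bearer {os.environ['PREDIBASE_API_TOKEN']}"},
    json=payload,
)
resp.raise_for_status()
print(resp.json()["generated_text"])
```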
Important Notes
- When querying fine-tuned models, include the prompt template used for fine-tuning in the `inputs` field
- For streaming responses, use the `/generate_stream` endpoint instead of `/generate` (see the sketch after this list)
- Parameters follow the same format as the LoRAX generate endpoint
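For instance, a streaming sketch under the same assumptions; LoRAX-style `/generate_stream` endpoints return server-sent events, printed raw here:

```python
import os

import requests

url = (
    f"https://serving.app.predibase.com/{os.environ['PREDIBASE_TENANT_ID']}"
    f"/deployments/v2/llms/{os.environ['PREDIBASE_DEPLOYMENT']}/generate_stream"
)

with requests.post(
    url,
    headers={"Authorization": f"Bearer {os.environ['PREDIBASE_API_TOKEN']}"},
    json={"inputs": "...", "parameters": {"adapter_id": "my-repo/1", "max_new_tokens": 128}},
    stream=True,
) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines():
        if line:
            # Each event line carries an incremental token payload.
            print(line.decode("utf-8"))
```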
Next Steps
- Train a custom speculator to improve performance
- Explore supported models
- Set up private deployments