Changelog
2025-06-23
v2025.6.1
🎯 New Features & Improvements
Vision Language Models (VLMs)
Enhanced VLM Fine-tuning Support: Improved support for Qwen2-VL and Qwen2.5-VL models with better memory management and batch size optimization
OpenAI Dataset Format Support: Now supports OpenAI-compatible image dataset formats for easier migration and data preparation
S3 URL Support: Direct support for training with S3 URLs, including both public and private image URLs
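To make the two dataset items above concrete, here is a minimal sketch of an OpenAI-compatible image dataset row using an S3 image URL. The field layout follows the standard OpenAI chat format; the bucket path and messages are hypothetical placeholders, and the exact schema Predibase accepts is described in the vision fine-tuning docs.

```python
# Hypothetical row in an OpenAI-compatible image dataset (sketch only;
# see the vision fine-tuning docs for the exact accepted schema).
example = {
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is shown in this image?"},
                # Per the release notes, both public and private S3 URLs work.
                {"type": "image_url", "image_url": {"url": "s3://my-bucket/images/0001.png"}},
            ],
        },
        {"role": "assistant", "content": "A cat sitting on a windowsill."},
    ]
}
```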
Fine-tuning Enhancements
Cost & Time Estimates: Show estimated training price and duration before starting fine-tuning jobs
Better Dataset Validation: Enhanced validation of dataset files before starting training jobs to catch issues early
Infrastructure & Platform
H200 GPU Support: Added support for H200 GPUs for improved inference performance
User Experience
Homepage UI Refresh: New streamlined homepage design focused on inference workflows
New User Onboarding: Added TODO list and guided workflows for new users getting started
🐛 Bug Fixes
Training & Fine-tuning
Sequence Length Computation: Fixed incorrect sequence length calculation for vision models
Batch Size Optimization: Corrected batch size tuning that was overestimating optimal batch sizes
Deployment & Serving
Vision Model Deployment: Resolved deployment failures for vision model examples in documentation
UI & API
Metrics Collection: Fixed missing deployment metrics in UI for VPC tenants
📚 Documentation Updates
Vision Fine-tuning Docs: Updated documentation for vision model fine-tuning workflows
Dataset Format Conversion: Added guides for converting OpenAI-compatible image datasets to Predibase format
Supported Models: Updated supported models list to include Qwen2-VL and Qwen2.5-VL models
2025-05-19
v2025.5.1
New Features 🎉
Hide utilization graph, only show spec decoding graph if speculator enabled
Classification head training support
Bring back speculative decoding chart in UI
Add num_iterations support
Add a warning when all scores are 0 for a set of completions
Add reward server logging about reward function score
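As context for the two reward-function items above, here is a minimal sketch of a GRPO reward function. The signature is an assumption for illustration only, not the exact interface Predibase expects; see the RFT docs for that.

```python
# Sketch of a GRPO reward function (assumed signature, for illustration only).
# Per the notes above, the reward server now logs each function's score, and a
# warning is emitted when every completion in a set scores 0.
def exact_answer_reward(prompt: str, completion: str, example: dict) -> float:
    expected = example.get("answer", "")
    # 1.0 if the completion contains the expected answer, else 0.0.
    return 1.0 if expected and expected in completion else 0.0
```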
Bug Fixes 🐛
Permissions checking for readonly user API tokens
Remove duplicate pass over the example dict
Send json data instead of url encoded data when creating connections from UI
Cast additional input data to Python literals before passing it to the reward function
Use CPU node for rewardserver when recreating
Recreate services when retrying after backoff for compute unavailability
2025-04-24
v2025.4.1
New Features 🎉
Initial deployment metrics frontend
Initial deployment metrics backend
Refresh GRPO logs tab upon new step or new epoch
Reward graph refinements
Add and delete reward functions in the UI
Show placeholders when there is no reward or completions data
Reward graphs cleanup and improvements
Add train_steps as parameter in SDK (see the sketch after this list)
Add step counter for GRPO jobs
Collapse timeline by default when GRPO job is training
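A hedged sketch of the new train_steps parameter mentioned above. The surrounding config fields, dataset, and repo names are illustrative assumptions based on the SDK docs, not a verbatim API reference.

```python
from predibase import Predibase, FinetuningConfig

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

# train_steps (new in this release) caps training by step count; the other
# fields here are illustrative placeholders.
adapter = pb.adapters.create(
    config=FinetuningConfig(
        base_model="llama-3-1-8b-instruct",
        train_steps=1000,
    ),
    dataset="my_dataset",    # hypothetical dataset name
    repo="my-adapter-repo",  # hypothetical adapter repo
)
```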
Bug Fixes 🐛
Fix deployment metric queries
Use debounce to prevent log refetching from DDoSing the server
Fix invite link URLs returned by API
Enable adapter download in UI
GRPO multipod bugs around lora loading and unloading
Only show edit/add/delete reward function buttons during training
Load medusa weights from checkpoint separately
Set max height on reward graph legends when legend takes up >30% of tile height
Use right query key for reward functions query
Stop Completions View from DDoSing the server for long-running GRPO jobs
Make GRPO completions viewer data collection robust to restarts
More reward graph refinements
Limit to 2 concurrent batch jobs per tenant
Log metrics once per step in GRPO and Turbo fine-tuning jobs
Use better colors for reward graphs
Show reward graphs even if GRPO job was cancelled
Documentation 📝
Fix mistral chat template
Add new turbo example to docs
Add QwQ back to docs
Clarify that turbo doesn’t use LoRA parameters
Add RFT pricing info in docs
2025-03-18
v2025.3.1
New Features 🎉
Expose EMA slider for reward graphs
Add synchronization to reward graphs
Remove y-axis label from reward graphs
Hide tabs when GRPO job is queued
SDK function for getting config via adapters API (see the sketch after this list)
Show total_reward and total_reward_std charts
Rewards Function View
Reward function graphs tab
Add timestamps to deployment/GRPO logs views
Add dropdown to deployment modal for configuring speculator options
GRPO pod logs UI
Automatic frontend proxy
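A sketch of the new SDK call for fetching a config through the adapters API, as noted above. The repo/version string is a placeholder, and the accessor on the returned adapter is an assumption.

```python
from predibase import Predibase

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

# Fetch an adapter by "repo/version" and inspect its training config.
adapter = pb.adapters.get("my-adapter-repo/1")  # hypothetical repo/version
print(adapter.config)  # assumed attribute exposing the fine-tuning config
```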
Bug Fixes 🐛
Treat reward function version history as ascending
Remove extraneous comma from completions comparison viewer
Properly reuse old reward function info if no code has changed
Add virtualenv to rewardserver deps
Remove extraneous parameter from metrics push
Do not push duplicate completions metadata
Show all versions of the same reward function on the same graph
Print GRPO metrics properly in SDK
Don’t require strong types for pushing payload to metrics stream
Remove inaccurate cost display for VPC inference
Skip normalizing the split column name if it is named “split”
Don’t set base model in config when in continued training flow
Documentation 📝
Wordsmith GRPO UI usage docs
Change order of fine-tuning pages, rename GRPO page to RFT and add info msg about tiers
GRPO feature docs and UI links
Add example instructions for updating reward functions via SDK
Add apply_chat_template to finetuning config docs
Improvements to GRPO training docs
Change int to float in reward function docs
2025-02-27
v2025.2.27
New Features 🎉
Add GRPO to Predibase
Bug Fixes 🐛
Allow uploaded adapters (which have no types) to pass through speculator validation
Ignore missing servingParams in speculator selection during updates for private models
Fix return_to URL routing
Return lowercase literals for auto/disabled speculator, perform case-insensitive checks for create/update
When preloading adapters, skip adapter status validation if falling back to medusa adapter as speculator
Get base model name from local cache when continuing function calling training
Refine the list of supported models for function calling
If using a medusa adapter as the speculator, skip type validation
Speculator param bug bash
Extend timeout for post-preprocessing multi-GPU barrier
VPC UI fixes
Show “disabled”, not empty string for an empty speculator field in API responses
Active batch jobs count against an independent limit from normal deployments
Documentation 📝
Update GRPO docs with new reward functions payload
Add docs for GRPO
Document new model@revision syntax for specifying base models (see the sketch after this list)
Add deepseek to models list and extra info msg
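A sketch of the model@revision syntax documented above. The model slug and revision shown are hypothetical placeholders.

```python
from predibase import FinetuningConfig

# Pin the base model to a specific revision with the new model@revision syntax.
config = FinetuningConfig(
    base_model="qwen2-5-7b-instruct@main",  # hypothetical model and revision
)
```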
2025-02-05
v2025.1.1
New Features 🎉
Add top-level speculator param to deployment config (see the sketch after this list)
Raise batch inference model size restriction to 32B
Add Preloaded Adapters and Guaranteed Capacity to Create Deployment UI
Add delete deployment button to config tab
Show tooltip for prompting keyboard shortcut in prompt UI
Add similar time ranges to Grafana for deployment monitoring
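A hedged sketch of the new top-level speculator parameter. The deployment name and model are placeholders; the "auto" and "disabled" literals appear elsewhere in this changelog and are shown here as assumed valid values.

```python
from predibase import Predibase, DeploymentConfig

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

# The new top-level `speculator` field on the deployment config.
pb.deployments.create(
    name="my-llama-deployment",  # hypothetical deployment name
    config=DeploymentConfig(
        base_model="llama-3-1-8b-instruct",
        speculator="auto",  # or "disabled", per literals noted in this changelog
    ),
)
```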
Bug Fixes 🐛
Don’t pass --adapter-source twice when using speculator param
If a speculator is in use, set the container arg --adapter-source to pbase
Return error if --adapter-id custom arg and speculator param are both active
Add checkpoint UUID for batch inference to avoid transaction ID collision
Pass both system and user API tokens to sidecar for batch inference
Batch inference dataset type check for file uploads
Distinguish activity names for fetching internal API token between modules
Make internal API tokens user-level and use activity to pass to llm deployer
Allow admin scope when validating user token
Make created_by_user field of auth tokens nullable
Move api token generation outside of batch deploy activity
Properly handle vault’s return values for internal API token
Automatically inject user API token into batch inference lorax container
Remove extra slash in batch inference checkpoint endpoint
Ensure default engines are assigned when moving user to new tenant
Set is_batch flag when creating batch inference deployments
Filter batch deployments in v2 API
Clean up user signup workflow
Increase size of work-volume for batch inference worker
Add redirect after deleting deployment
Adjust max_steps computation to account for multiple GPUs when fine-tuning with streaming data
Callback fixes for multi-GPU training
Clearer file and zip archive names for downloaded request logs
Manually set cooldown period for HPA when configuring KEDA scaling
Download adapters in the same tab (so correct headers are sent) instead of in new tab
Restrict post-training logic to rank 0
Leading zeros in S3 prefix formatting when searching for log files
Update Auth0 Org and User connection validation
Handle null reservations response in UI
Tighten auth boundaries at the root of the app
Separate Okta and Email Signin Buttons
Use mutually exclusive conditionals around useNavigate hook
Documentation 📝
Update config docs with new parameters
Fix update command usage on private_deployments page
Various docs fixes around getting rid of custom args
Batch inference user guide
Add timeouts and retries section to inference overview
Add quick motivating example to quickstart
Fix tool example
Add redirect from old end-to-end URL to new
Some improvements to end-to-end example documentation
Formatting improvements for function calling docs page
2024-12-17
v2024.12.1
New Features 🎉
Make L4s available and fix missing L40s
Reservations Tab V0, small reorganization of settings module
Default to lora adapter type when continuing training
Bug Fixes 🐛
Training timeline fixes
Replace word “instances” with “replicas” in reservations UI
Show L4 only for VPC
Navigate calls should be inside useEffects
Wait until environment is defined before deciding whether to redirect to VPC setup view
Fix Zapier Workflow issue
Avoid swallowing errors when enqueuing metrics to posthog client
Replace all react router Navigate component usage with useNavigate hook
Ensure LLM auth policies are re-setup when tenant upgrades
Clear invite token when user proceeds to existing tenant
Metronome transaction ID conflicts for LLM deployments
Allow readonly users to use the users/move endpoint
Fetch accelerator properties from config when returning deployment metadata
Hide new adapter version (normal training flow) option for turbo adapters
Auto-select file uploads connection in new adapter view
Fix and refine connection selector in Create Adapter View
New adapter version from latest should use latest non-imported adapter
Cast steps_per_epoch to int
Delete parent model distributed logs when continuing training on the same dataset
Documentation 📝
Add docs for function calling
Add l40s to accelerator
Update colab positioning in docs
Clarify vision docs
2024-11-20
v2024.11.1
New Features 🎉
Track and return adapter output type for finetuning jobs
Display adapter type on individual and list adapter page
Turbo and Turbo LoRA batch size tuning support
Add adapter filter to SDK function for request logs download (see the sketch after this list)
Show rank, target modules, and base model for imported adapters
Automatically infer base model when uploading adapters
Track and return base version for continued finetuning jobs
API to opt into/out of inference request logging
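Tying together the request-logging items above, here is a hedged sketch of downloading request logs filtered by adapter. The function name download_request_logs appears in the docs notes below, but its namespace and parameters as shown here are assumptions.

```python
from predibase import Predibase

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

# Download request logs for one deployment, filtered to a single adapter
# (the new filter noted above). Parameter names are assumptions.
pb.deployments.download_request_logs(
    "my-llama-deployment",           # hypothetical deployment name
    adapter_id="my-adapter-repo/1",  # new adapter filter
)
```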
Bug Fixes 🐛
Default value for REQUEST_LOGGING_ENABLED env var
Turbo UI - hide parameters form in continue mode until ready
Bump unsloth version to enable loading local adapter weights
Fix V2 account deletion endpoint
Fix continued training parameter logic
For continued adapters, only show learning rate if the parent and child datasets are different
Pass invite token to anonymous auth boundary
Don’t set request_logger_url on lorax container unless logging is enabled
Account for repos where adapters are not yet written
Manually set missing values in adapter type column up migration
Multiple Turbo UI fixes
Disable timeline polling for imported adapters
Fix missing hoisted properties
Set batch size to 1 when continuing training on a new dataset
Display of imported adapter properties on version page
Parsing of adapter config file for displaying imported adapter properties
Time bounds passing for request log download
Log key parsing for request log download
Fetching base model name and conversion to shortname
Load pretokenized batches on each worker instead of dispatching from rank 0
Missing logic to pipe requestLoggingEnabled parameter through LLM deployment path
Documentation 📝
Update turbo lora adapter id load docs
Add MLlama user guide for VLM training
Open up all base model support for turbo lora and turbo
Fix header for download_request_logs
Add docs for request logs opt-in and download
Remove specifying base model when uploading an adapter
2024-10-31
v2024.10.2
New Features 🎉
Organization/Okta/SSO support in Auth0
Allow users to click on archived adapters
Enable continued training from imported adapters and with modified adapter type
Expose usesGuaranteedCapacity in deployment config and on deployment object
Add readonly user role and disable access to appropriate operations
Move prompt UI’s tooltips to hover over labels/titles instead of over question mark icons
Add support for Turbo adapters
Enable DDP in Predibase
Bug Fixes 🐛
Fix operation blockers for readonly users and preload for tenant information in user identity DB call
Fix Org ID nil pointers
Fix V2 invite link bugs and Auth0 casing inconsistencies
Refresh page when merging accounts to get new JWT
Update Current User to handle empty roles
Store invite token so users can validate email
Use correct parameter name for adapter type
Disable delete user UI and fix wording on Sign In page
Exclusively update config_obj during fine-tuning setup
Move llm policy to create new tenant flow
Dangling readonly role UI fixes
Monospaced font
Disable turbo / turbo-lora adapter type validation in the gateway for HF models
Documentation 📝
Update pydantic model_json_schema()
Add Llama-3.2 and Qwen-2.5 models to fine-tuning models page
Consolidate two deployment models tables into one, add new models, and more
2024-10-03
v2024.10.1
New Features 🎉
Implement chat format in Predibase (see the sketch after this list)
Enable showing tensorboard with finetuning jobs
Save lorax image tag as a new column in the deployments DB
Disable update deployment button when deployment is not updatable
Add TRAIN/EVALUATION split check
Add support for fine-tuning VLMs in Predibase
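A sketch of the new chat format in a fine-tuning config. The apply_chat_template flag is referenced in the docs notes below, though its exact placement in the config is assumed here.

```python
from predibase import FinetuningConfig

# Format training examples with the base model's chat template rather than
# raw prompt/completion text (assumed field placement).
config = FinetuningConfig(
    base_model="llama-3-1-8b-instruct",
    apply_chat_template=True,
)
```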
Bug Fixes 🐛
Add adaptive tag to h100 engine
Remove possibly spurious error raise param in copy_within_bucket
Default value for checkpoint_step param in update_trainer_settings activity
Disable update deployment button popup when updatable
Replace usage of overloaded deployment model name param with new model path param
Avoid resetting engine during failure cleanup if it was never started
Another incorrect reference to target instead of base version
Avoid using protocol prefix when doing itemized model file copy
Additional migration logic fixes for continued training
Render turbo lora message in adapter builder correctly
Correctly read checkpoint from base version
Runtime issues in finetuning flow related to new checkpoint code
Fixed reading image as bytes from parquet file
Fix dayjs plugin errors with vite
Fix vite production build
Improve model matching logic (replace periods with dashes in mod…)
Documentation 📝
Small R17 docs updates
Update docs
Messages docs
Add docs and expose param for showing tensorboard
Add docs for apply chat template
Update docs for continuing to train from arbitrary checkpoint
Update Comet integration page and add force_bare_client=True
Moved headers into REST API section
2024-09-12
v2024.9.3
New Features 🎉
Add solar pro instruct model
Add convenience method for getting openai-compatible endpoint URL for deployments (see the sketch after this list)
Add deployment update and get_recommended_config functions, expose quantization on deployment object
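A sketch of the OpenAI-compatible endpoint convenience method noted above, used with the standard openai client. The accessor name openai_endpoint is an assumption, as is addressing an adapter via the model field.

```python
from openai import OpenAI
from predibase import Predibase

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")
url = pb.deployments.get("solar-pro-instruct").openai_endpoint  # assumed accessor

client = OpenAI(base_url=url, api_key="<PREDIBASE_API_TOKEN>")
resp = client.chat.completions.create(
    model="my-adapter-repo/1",  # assumed: adapters addressed as model IDs
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```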
Bug Fixes 🐛
Mark quantization optional on deployment object
Documentation 📝
Improve SDK docs
Don’t use snake case for lorax args
2024-09-05
v2024.9.1
New Features 🎉
Add guardrails for continued training with new dataset case
Load adapter weights from checkpoint without additional state
Add custom_args, Quantization and Update Deployment to UI
Ingest arrow datasets
Add support for dataset streaming for large, pretokenized datasets
Added frontend for Comet API key integration
Add support for Comet experiment tracking integration
Add Phi-3.5 model to Predibase
Invoke resume state setup activities in FinetuningWorkflow
Add tooltip on guaranteed deployment chip
Guaranteed deployment chip in UI
Adding Mistral Nemo models
Add latest gpt-4o model for augmentation
Reveal GPU utilization metrics to all, nowhere to hide
Enable turbo lora for solar-mini
Bug Fixes 🐛
Fix issue during synthesis with MoA
Set adapter weights path on config rather than config_obj
Update catch-all error message to be more general
Fix nil pointer and description usage in update deployment handler
Various fixes for continued training
Set base_model correctly in metadata when continuing training, also additional guardrails
Use fully resolved model name when creating a deployment
Handle 404s in the Prompt UI
Minor fixes for checkpoint resume
Parameter passing to new continued training migration activities
Remove unnecessary SDK-side default value for enable_early_stopping parameter
Fix several bugs found in Engines and Model Versions pages
Pydantic model for Adapter versions to have correct tag and checkpoint types
Guard import of HF transformers from the critical SDK path
Don’t swallow error when loading specific checkpoint
Model Version page loads again
Fix API call to set default engines in UI
Restart fine-tuning engine pods when the nvidia driver is not loaded
2024-08-15
v2024.8.4
New Features 🎉
Add new simple augmentation strategy
Validate WandB key in the client and in fine-tuning job
Allow users to provide a name for generated datasets
Bug Fixes 🐛
Deployment health page labels: seconds not milliseconds
Surface WandB token validation error message
Point WandB token validation to validate_wandb_token
Fix training engines table crash (due to improper null handling in engine modals)
Update the UI with the correct key when creating an LLM
Multiple deployment health fixes
Documentation 📝
Add async create finetuning job
Enable turbo lora support for Zephyr
Add name parameter to augmentation docs
2024-08-08
v2024.8.1
New Features 🎉
Add new create deployment params to the client
Added delete adapter modal to the repo UI
Deployment health tab, reorg of deployments module
Update Auth Service to allow V2 Invite Token Validation
Add User v2 API endpoints
Terminal logs use monospace font
Connect file and connect pandas dataframe calling parameters documentation update
Update CORS to allow authorization header
Clear React Query cache on signout
Bug Fixes 🐛
Set num_samples_to_generate in the augment data payload
Multiple deployment health page fixes
Small fixes for auth operations
Fix Auth0 frontend configuration
Fix login spam, change Auth0 signup link to go to signup form directly
Pass OpenAI API key to MoA through the SDK
Register augment_dataset
Augment activity tests & fixes
Augment SDK and endpoint fixes
Fix audience claim in JWT and mock out validate token function
Include dataset ID in upload workflow ID
Disable pricing details in deployment view for all enterprise customers
Timestamp fields in identity record
Missing return in Authorize middleware
API call failures due to undefined apiServer (caused by incorrect axios typings)
Hide incorrect Min/Max Replica counts
Infinite loop during pagination of GetCreditBalances
Fix debouncing and error handling on valid filename check in file connector
Fix thumbnails API call and dataset previewer rendering for object values
Add Description to Credit Card authorization
Fix datasets for connection API call (malformed string)
Documentation 📝
Synthetic data, SDK augment dataset and download dataset
Add delete adapters page
Update e2e guide and replicas defaults info and code snippets
Fix reference
Add llama-3.1 8B versions to docs
Fix comma in e2e example and colab link styling for evaluations
Fix qwen2 coming soon
2024-07-17
v2024.7.1
New Features 🎉
Implement connecting a Pandas DataFrame as the Predibase Dataset (see the sketch after this list)
Accept shared and private as deployment list filter types
Add new Turbo LoRA fine-tuning support to the Predibase App and SDK
Add dataset upload troubleshooting section
Update Auth Service on Staging
Add version list method to repo in SDK
Allow download of solar models even for non-enterprise tenants
Add wandb logging to fine-tuning engine
Add support for qwen2
Plumb WandB token into engine as env var
Use batch tensor ops to speed up medusa computation
Add Private Solar to Predibase
Enable archiving and deleting adapter versions
Train Medusa LoRA adapters in Predibase
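A sketch of connecting a pandas DataFrame as a Predibase dataset (the first item in this list). The method name follows the "connect pandas dataframe" docs item elsewhere in this changelog, but its exact signature is an assumption.

```python
import pandas as pd
from predibase import Predibase

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

df = pd.DataFrame({"prompt": ["2+2="], "completion": ["4"]})

# Connect the DataFrame as a named dataset (method name and signature assumed).
dataset = pb.datasets.from_pandas_dataframe(df, name="toy_math")
```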
Bug Fixes 🐛
Remove install package
Improve prompt UI matching when “-dequantized”/“-quantized” or “-instruct”/“-it” could be in the model’s name
Python 3.8 compatibility with Annotated
Avoid linting error during SDK installation caused by pydantic field
Clear learning curves store when websocket reconnects
Disable train button in create adapter view for expired tenants
Temporarily comment out L40S in accelerators.yaml
Update prices for accelerators
Syntax error in predibase-agent.yaml
Typo in min_replicas for SDK DeploymentConfig
Set default cooldown to 12 hours in SDK
Resolve inconsistent case sensitivity for LLM config lookup in deployment path
Make error nil on empty wandb vault value
Fix missing learning rate in medusa loss logs
Fix URL routing logic for old models
Show prompt instructions in kebab menu whenever adapter path is available
Docs link broken in main menu
Documentation 📝
Add warning about capacity
Add archiving and unarchiving adapter docs
Add notebooks
Remove vpc trial language from docs
Update docs
Added an example tutorial and modified the Models page
Add wandb