2025-08-21
v2025.8.1

🎯 New Features & Improvements

Training and Multi-LoRA Classification Heads

  • Direct Classification Training: You can now train classification heads directly within the Predibase platform by setting the task type to classification. This approach delivers higher accuracy and inference throughput for classification tasks (see the sketch after this list). 🚀
  • Broad Model Support: Our platform now supports training classification heads on any officially supported base model. We also offer best-effort support for models not officially on the platform.
  • Dynamic Inference for Classification: Just as with causal language model LoRAs, you can now create classification deployments. This allows you to dynamically query your classification adapters on the same base model, providing flexibility and efficiency for your inference needs.
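
A minimal sketch of this flow with the Python SDK is shown below. The task parameter name and the model/dataset/repo names are assumptions for illustration; consult the SDK reference for the exact spelling.

```python
# Sketch: train a classification head, then query it dynamically.
# The task="classification" parameter name is an assumption based on the
# release notes; model, dataset, and repo names are placeholders.
from predibase import Predibase, FinetuningConfig

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

adapter = pb.adapters.create(
    config=FinetuningConfig(
        base_model="qwen2-5-7b-instruct",  # any officially supported base model
        task="classification",             # assumed spelling of the task type
    ),
    dataset="my_labeled_dataset",          # placeholder dataset name
    repo="sentiment-classifier",           # placeholder adapter repo
)

# Query the classification adapter on a shared base-model deployment,
# just as with causal LM LoRAs.
client = pb.deployments.client("qwen2-5-7b-instruct")
response = client.generate("I loved this product!", adapter_id="sentiment-classifier/1")
print(response.generated_text)
```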

Support For Larger VLM Training Sets

  • VLM fine-tuning jobs can now handle larger training datasets.

Training Price Estimation

  • We now show a price estimate for training jobs before you start them, helping you budget for training costs and avoid surprises.

🐛 Bug Fixes

Query Intermediate Checkpoints During Training

  • We fixed a recent regression. You can now query intermediate checkpoints during training again!

GRPO Training

  • Fixed several related issues causing GRPO training on H200s to fail.

Dataset Uploads

  • Fixed an issue causing dataset file uploads to fail in certain VPC environments with multiple dataplanes.

Batch Inference (beta) Fixes

  • Fixed an issue causing batch inference jobs to not recognize the accelerator parameter.
  • Fixed an issue causing batch inference jobs to not properly account for the lorax backend version.

2025-06-23
v2025.6.1

🎯 New Features & Improvements

Vision Language Models (VLMs)

  • Enhanced VLM Fine-tuning Support: Improved support for Qwen2-VL and Qwen2.5-VL models with better memory management and batch size optimization
  • OpenAI Dataset Format Support: Now supports OpenAI-compatible image dataset formats for easier migration and data preparation (see the example record after this list)
  • S3 URL Support: Direct support for training with S3 URLs, including both public and private image URLs
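
For illustration, one record in an OpenAI-compatible image dataset might look like the following, written as a JSONL row. The exact schema Predibase expects is an assumption based on the standard OpenAI chat format; s3:// URLs are supported per the note above.

```python
# Illustrative OpenAI-style chat record with an image, written as one JSONL
# row. The exact schema Predibase accepts is an assumption based on the
# OpenAI chat-completions format.
import json

record = {
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is shown in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "s3://my-bucket/images/0001.png"},
                },
            ],
        },
        {"role": "assistant", "content": "A golden retriever playing fetch."},
    ]
}

with open("train.jsonl", "w") as f:
    f.write(json.dumps(record) + "\n")
```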

Fine-tuning Enhancements

  • Cost & Time Estimates: Show estimated training price and duration before starting fine-tuning jobs
  • Better Dataset Validation: Enhanced validation of dataset files before starting training jobs to catch issues early

Infrastructure & Platform

  • H200 GPU Support: Added support for H200 GPUs for improved inference performance

User Experience

  • Homepage UI Refresh: New streamlined homepage design focused on inference workflows
  • New User Onboarding: Added TODO list and guided workflows for new users getting started

🐛 Bug Fixes

Training & Fine-tuning

  • Sequence Length Computation: Fixed incorrect sequence length calculation for vision models
  • Batch Size Optimization: Corrected batch size tuning that was overestimating optimal batch sizes

Deployment & Serving

  • Vision Model Deployment: Resolved deployment failures for vision model examples in documentation

UI & API

  • Metrics Collection: Fixed missing deployment metrics in UI for VPC tenants

📚 Documentation Updates

  • Vision Fine-tuning Docs: Updated documentation for vision model fine-tuning workflows
  • Dataset Format Conversion: Added guides for converting OpenAI-compatible image datasets to Predibase format
  • Supported Models: Updated supported models list to include Qwen2-VL and Qwen2.5-VL models

2025-05-19
v2025.5.1

New Features 🎉

  • Hide the utilization graph; only show the speculative decoding graph when a speculator is enabled
  • Classification head training support
  • Bring back speculative decoding chart in UI
  • Add num_iterations support
  • Add a warning when all scores are 0 for a set of completions
  • Add reward server logging about reward function score
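
For context, a GRPO reward function along the lines these items refer to might look like the sketch below. The exact signature the reward server expects is an assumption; the float return type and the all-zero-scores warning come from the notes above.

```python
# Hypothetical GRPO reward function. The (prompt, completion, example)
# signature is an assumption; the notes above confirm only that scores are
# floats and that a warning now fires when every score in a set is 0.
def format_reward(prompt: str, completion: str, example: dict) -> float:
    """Reward completions that wrap their final answer in <answer> tags."""
    if "<answer>" in completion and "</answer>" in completion:
        return 1.0
    return 0.0  # all-zero score sets now trigger a platform warning
```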

Bug Fixes 🐛

  • Permissions checking for readonly user API tokens
  • Remove duplicate pass over the example dict
  • Send json data instead of url encoded data when creating connections from UI
  • Cast additional input data to Python literals before passing it to the reward function
  • Use CPU node for rewardserver when recreating
  • Recreate services when retrying after backoff for compute unavailability

2025-04-24
v2025.4.1

New Features 🎉

  • Initial deployment metrics frontend
  • Initial deployment metrics backend
  • Refresh GRPO logs tab upon new step or new epoch
  • Reward graph refinements
  • Add and delete reward functions in the UI
  • Show placeholders when there is no reward or completions data
  • Reward graphs cleanup and improvements
  • Add train_steps as parameter in SDK
  • Add step counter for GRPO jobs
  • Collapse timeline by default when GRPO job is training

Bug Fixes 🐛

  • Fix deployment metric queries
  • Use debouncing to prevent log refetching from DDoSing the server
  • Fix invite link URLs returned by API
  • Enable adapter download in UI
  • Fix GRPO multi-pod bugs around LoRA loading and unloading
  • Only show edit/add/delete reward function buttons during training
  • Load medusa weights from checkpoint separately
  • Set max height on reward graph legends when legend takes up >30% of tile height
  • Use right query key for reward functions query
  • Stop the Completions View from DDoSing the server for long-running GRPO jobs
  • Make GRPO completions viewer data collection robust to restarts
  • More reward graph refinements
  • Limit to 2 concurrent batch jobs per tenant
  • Log metrics once per step in GRPO and Turbo fine-tuning jobs
  • Use better colors for reward graphs
  • Show reward graphs even if GRPO job was cancelled

Documentation 📝

  • Fix mistral chat template
  • Add new turbo example to docs
  • Add QwQ back to docs
  • Clarify that turbo doesn’t use LoRA parameters
  • Add RFT pricing info in docs

2025-03-18
v2025.3.1

New Features 🎉

  • Expose EMA slider for reward graphs
  • Add synchronization to reward graphs
  • Remove y-axis label from reward graphs
  • Hide tabs when GRPO job is queued
  • SDK function for getting config via adapters API
  • Show total_reward and total_reward_std charts
  • Rewards Function View
  • Reward function graphs tab
  • Add timestamps to deployment/GRPO logs views
  • Add dropdown to deployment modal for configuring speculator options
  • GRPO pod logs UI
  • Automatic frontend proxy

Bug Fixes 🐛

  • Treat reward function version history as ascending
  • Remove extraneous comma from completions comparison viewer
  • Properly reuse old reward function info if no code has changed
  • Add virtualenv to rewardserver deps
  • Remove extraneous parameter from metrics push
  • Do not push duplicate completions metadata
  • Show all versions of the same reward function on the same graph
  • Print GRPO metrics properly in SDK
  • Don’t require strong types for pushing payload to metrics stream
  • Remove inaccurate cost figures for VPC inference
  • Skip normalizing the split column name if it is named “split”
  • Don’t set base model in config when in continued training flow

Documentation 📝

  • Wordsmith GRPO UI usage docs
  • Change order of fine-tuning pages, rename GRPO page to RFT and add info msg about tiers
  • GRPO feature docs and UI links
  • Add example instructions for updating reward functions via SDK
  • Add apply_chat_template to finetuning config docs
  • Improvements to GRPO training docs
  • Change int to float in reward function docs

2025-02-27
v2025.2.27

New Features 🎉

  • Add GRPO to Predibase

Bug Fixes 🐛

  • Allow uploaded adapters (which have no types) to pass through speculator validation
  • Ignore missing servingParams in speculator selection during updates for private models
  • Fix return_to URL routing
  • Return lowercase literals for auto/disabled speculator, perform case-insensitive checks for create/update
  • When preloading adapters, skip adapter status validation if falling back to medusa adapter as speculator
  • Get base model name from local cache when continuing function calling training
  • Refine the list of supported models for function calling
  • If using a medusa adapter as the speculator, skip type validation
  • Speculator param bug bash
  • Extend timeout for post-preprocessing multi-GPU barrier
  • VPC UI fixes
  • Show “disabled”, not empty string for an empty speculator field in API responses
  • Active batch jobs count against an independent limit from normal deployments

Documentation 📝

  • Update GRPO docs with new reward functions payload
  • Add docs for GRPO
  • Document the new model@revision syntax for specifying base models (example below)
  • Add deepseek to models list and extra info msg
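
The model@revision syntax pins a base model to a specific revision; a minimal illustration follows (the model slug and revision below are placeholders, not real pins):

```python
# Sketch: pin a base model to a specific revision with model@revision.
# The slug and revision here are placeholders.
from predibase import Predibase, FinetuningConfig

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")
adapter = pb.adapters.create(
    config=FinetuningConfig(base_model="my-base-model@<revision>"),  # model@revision
    dataset="my_dataset",
    repo="pinned-revision-run",
)
```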

2025-02-05
v2025.1.1

New Features 🎉

  • Add top-level speculator param to deployment config (sketch below)
  • Raise batch inference model size restriction to 32B
  • Add Preloaded Adapters and Guaranteed Capacity to Create Deployment UI
  • Add delete deployment button to config tab
  • Show tooltip for prompting keyboard shortcut in prompt UI
  • Add similar time ranges to Grafana for deployment monitoring
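
A hedged sketch of the new top-level speculator param: the "auto"/"disabled" literals appear in the v2025.2.27 notes above, while the other field names follow the SDK's DeploymentConfig and may differ by version.

```python
# Sketch: create a private deployment with the top-level speculator param.
# "auto" and "disabled" are the accepted literals per the v2025.2.27 notes;
# the base model name is a placeholder.
from predibase import Predibase, DeploymentConfig

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")
pb.deployments.create(
    name="my-private-llm",
    config=DeploymentConfig(
        base_model="llama-3-1-8b-instruct",  # placeholder base model
        speculator="auto",                   # or "disabled"
        min_replicas=0,
        max_replicas=1,
    ),
)
```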

Bug Fixes 🐛

  • Don’t pass --adapter-source twice when using the speculator param
  • If a speculator is in use, set the container arg --adapter-source to pbase
  • Return an error if the --adapter-id custom arg and the speculator param are both active
  • Add checkpoint UUID for batch inference to avoid transaction ID collision
  • Pass both system and user API tokens to sidecar for batch inference
  • Batch inference dataset type check for file uploads
  • Distinguish activity names for fetching internal API token between modules
  • Make internal API tokens user-level and use activity to pass to llm deployer
  • Allow admin scope when validating user token
  • Make created_by_user field of auth tokens nullable
  • Move api token generation outside of batch deploy activity
  • Properly handle vault’s return values for internal API token
  • Automatically inject user API token into batch inference lorax container
  • Remove extra slash in batch inference checkpoint endpoint
  • Ensure default engines are assigned when moving user to new tenant
  • Set is_batch flag when creating batch inference deployments
  • Filter batch deployments in v2 API
  • Clean up user signup workflow
  • Increase size of work-volume for batch inference worker
  • Add redirect after deleting deployment
  • Adjust max_steps computation to account for multiple GPUs when fine-tuning with streaming data
  • Callback fixes for multi-GPU training
  • Clearer file and zip archive names for downloaded request logs
  • Manually set cooldown period for HPA when configuring KEDA scaling
  • Download adapters in the same tab (so correct headers are sent) instead of in new tab
  • Restrict post-training logic to rank 0
  • Leading zeros in S3 prefix formatting when searching for log files
  • Update Auth0 Org and User connection validation
  • Handle null reservations response in UI
  • Tighten auth boundaries at the root of the app
  • Separate Okta and Email Signin Buttons
  • Use mutually exclusive conditionals around useNavigate hook

Documentation 📝

  • Update config docs with new parameters
  • Fix update command usage on private_deployments page
  • Various docs fixes around getting rid of custom args
  • Batch inference user guide
  • Add timeouts and retries section to inference overview
  • Add quick motivating example to quickstart
  • Fix tool example
  • Add redirect from old end-to-end URL to new
  • Some improvements to end-to-end example documentation
  • Formatting improvements for function calling docs page

2024-12-17
v2024.12.1

New Features 🎉

  • Make L4s available and fix missing L40s
  • Reservations Tab V0, small reorganization of settings module
  • Default to lora adapter type when continuing training

Bug Fixes 🐛

  • Training timeline fixes
  • Replace word “instances” with “replicas” in reservations UI
  • Show L4 only for VPC
  • Navigate calls should be inside useEffects
  • Wait until environment is defined before deciding whether to redirect to VPC setup view
  • Fixes Zapier Workflow issue
  • Avoid swallowing errors when enqueuing metrics to posthog client
  • Replace all react router Navigate component usage with useNavigate hook
  • Ensure LLM auth policies are re-setup when tenant upgrades
  • Clear invite token when user proceeds to existing tenant
  • Metronome transaction ID conflicts for LLM deployments
  • Allow readonly users to use the users/move endpoint
  • Fetch accelerator properties from config when returning deployment metadata
  • Hide new adapter version (normal training flow) option for turbo adapters
  • Auto-select file uploads connection in new adapter view
  • Fix and refine connection selector in Create Adapter View
  • New adapter version from latest should use latest non-imported adapter
  • Cast steps_per_epoch to int
  • Delete parent model distributed logs when continuing training on the same dataset

Documentation 📝

  • Add docs for function calling
  • Add L40S to accelerators docs
  • Update colab positioning in docs
  • Clarify vision docs

2024-11-20
v2024.11.1

New Features 🎉

  • Track and return adapter output type for finetuning jobs
  • Display adapter type on individual and list adapter page
  • Turbo and Turbo LoRA batch size tuning support
  • Add adapter filter to SDK function for request logs download
  • Show rank, target modules, and base model for imported adapters
  • Automatically infer base model when uploading adapters
  • Track and return base version for continued finetuning jobs
  • API to opt into/out of inference request logging

Bug Fixes 🐛

  • Default value for REQUEST_LOGGING_ENABLED env var
  • Turbo UI - hide parameters form in continue mode until ready
  • Bump unsloth version to enable loading local adapter weights
  • Fix V2 account deletion endpoint
  • Fix continued training parameter logic
  • For continued adapters, only show learning rate if the parent and child datasets are different
  • Pass invite token to anonymous auth boundary
  • Don’t set request_logger_url on lorax container unless logging is enabled
  • Account for repos where adapters are not yet written
  • Manually set missing values in adapter type column up migration
  • Multiple Turbo UI fixes
  • Disable timeline polling for imported adapters
  • Fix missing hoisted properties
  • Set batch size to 1 when continuing training on a new dataset
  • Display of imported adapter properties on version page
  • Parsing of adapter config file for displaying imported adapter properties
  • Time bounds passing for request log download
  • Log key parsing for request log download
  • Fetching base model name and conversion to shortname
  • Load pretokenized batches on each worker instead of dispatching from rank 0
  • Missing logic to pipe requestLoggingEnabled parameter through LLM deployment path

Documentation 📝

  • Update turbo lora adapter id load docs
  • Add MLlama user guide for VLM training
  • Open up all base model support for turbo lora and turbo
  • Fix header for download_request_logs
  • Add docs for request logs opt-in and download
  • Remove specifying base model when uploading an adapter

2024-10-31
v2024.10.2

New Features 🎉

  • Organization/Okta/SSO support in Auth0
  • Allow users to click on archived adapters
  • Enable continued training from imported adapters and with modified adapter type
  • Expose usesGuaranteedCapacity in deployment config and on deployment object
  • Add readonly user role and disable access to appropriate operations
  • Move prompt UI’s tooltips to hover over labels/titles instead of over question mark icons
  • Add support for Turbo adapters
  • Enable DDP in Predibase

Bug Fixes 🐛

  • Fix operation blockers for readonly users and preload for tenant information in user identity DB call
  • Fix Org ID nil pointers
  • Fix V2 invite link bugs and Auth0 casing inconsistencies
  • Refresh page when merging accounts to get new JWT
  • Update Current User to handle empty roles
  • Store invite token so users can validate email
  • Use correct parameter name for adapter type
  • Disable delete user UI and fix wording on Sign In page
  • Exclusively update config_obj during fine-tuning setup
  • Move llm policy to create new tenant flow
  • Dangling readonly role UI fixes
  • Monospaced font
  • Disable turbo / turbo-lora adapter type validation in the gateway for HF models

Documentation 📝

  • Update pydantic model_json_schema()
  • Add Llama-3.2 and Qwen-2.5 models to fine-tuning models page
  • Consolidate two deployment models tables into one, add new models, and more

2024-10-03
v2024.10.1

New Features 🎉

  • Implement chat format in Predibase
  • Enable showing tensorboard with finetuning jobs
  • Save lorax image tag as a new column in the deployments DB
  • Disable update deployment button when deployment is not updatable
  • Add TRAIN/EVALUATION split check
  • Add support for fine-tuning VLMs in Predibase

Bug Fixes 🐛

  • Add adaptive tag to h100 engine
  • Remove possibly spurious error raise param in copy_within_bucket
  • Default value for checkpoint_step param in update_trainer_settings activity
  • Disable update deployment button popup when updatable
  • Replace usage of overloaded deployment model name param with new model path param
  • Avoid resetting engine during failure cleanup if it was never started
  • Another incorrect reference to target instead of base version
  • Avoid using protocol prefix when doing itemized model file copy
  • Additional migration logic fixes for continued training
  • Render turbo lora message in adapter builder correctly
  • Correctly read checkpoint from base version
  • Runtime issues in finetuning flow related to new checkpoint code
  • Fix reading image as bytes from parquet file
  • Fix dayjs plugin errors with vite
  • Fix vite production build
  • Improve model matching logic (replace periods with dashes in mod…)

Documentation 📝

  • Small R17 docs updates
  • Update docs
  • Messages docs
  • Add docs and expose param for showing tensorboard
  • Add docs for apply chat template
  • Update docs for continuing to train from arbitrary checkpoint
  • Update Comet integration page and add force_bare_client=True
  • Moved headers into REST API section

2024-09-12
v2024.9.3

New Features 🎉

  • Add solar pro instruct model
  • Add convenience method for getting the OpenAI-compatible endpoint URL for deployments (sketch below)
  • Add deployment update and get_recommended_config functions, expose quantization on deployment object
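
A sketch of using the new convenience method with the OpenAI client; the openai_endpoint() helper name is hypothetical (the notes confirm only that such a method was added).

```python
# Sketch: point the OpenAI client at a Predibase deployment. The
# openai_endpoint() helper name is hypothetical; the notes confirm only that
# a convenience method returning the OpenAI-compatible URL exists.
from openai import OpenAI
from predibase import Predibase

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")
deployment = pb.deployments.get("solar-pro-instruct")

client = OpenAI(
    base_url=deployment.openai_endpoint(),  # hypothetical helper name
    api_key="<PREDIBASE_API_TOKEN>",
)
chat = client.chat.completions.create(
    model="solar-pro-instruct",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(chat.choices[0].message.content)
```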

Bug Fixes 🐛

  • Mark quantization optional on deployment object

Documentation 📝

  • Improve SDK docs
  • Don’t use snake case for lorax args

2024-09-05
v2024.9.1

New Features 🎉

  • Add guardrails for continued training with new dataset case
  • Load adapter weights from checkpoint without additional state
  • Add custom_args, Quantization and Update Deployment to UI
  • Ingest arrow datasets
  • Add support for dataset streaming for large, pretokenized datasets
  • Added frontend for Comet API key integration
  • Add support for Comet experiment tracking integration
  • Add Phi-3.5 model to Predibase
  • Invoke resume state setup activities in FinetuningWorkflow
  • Add tooltip on guaranteed deployment chip
  • Guaranteed deployment chip in UI
  • Adding Mistral Nemo models
  • Add latest gpt-4o model for augmentation
  • Reveal GPU utilization metrics to all, nowhere to hide
  • Enable turbo lora for solar-mini

Bug Fixes 🐛

  • Fix issue during synthesis with MoA
  • Set adapter weights path on config rather than config_obj
  • Update catch-all error message to be more general
  • Fix nil pointer and description usage in update deployment handler
  • Various fixes for continued training
  • Set base_model correctly in metadata when continuing training, also additional guardrails
  • Use fully resolved model name when creating a deployment
  • Handle 404s in the Prompt UI
  • Minor fixes for checkpoint resume
  • Parameter passing to new continued training migration activities
  • Remove unnecessary SDK-side default value for enable_early_stopping parameter
  • Fix several bugs found in the Engines and Model Versions pages
  • Pydantic model for Adapter versions to have correct tag and checkpoint types
  • Guard import of HF transformers from the critical SDK path
  • Don’t swallow error when loading specific checkpoint
  • Model Version page loads again
  • Fix API call to set default engines in UI
  • Restart fine-tuning engine pods when the nvidia driver is not loaded

2024-08-15
v2024.8.4

New Features 🎉

  • Add new simple augmentation strategy
  • Validate WandB key in the client and in fine-tuning job
  • Allow users to provide a name for generated datasets

Bug Fixes 🐛

  • Deployment health page labels: seconds not milliseconds
  • Surface WandB token validation error message
  • Point WandB token validation to validate_wandb_token
  • Fix training engines table crash (due to improper null handling in engine modals)
  • Update the UI with the correct key when creating an LLM
  • Multiple deployment health fixes

Documentation 📝

  • Add async create finetuning job
  • Enable turbo lora support for Zephyr
  • Add name parameter to augmentation docs

2024-08-08
v2024.8.1

New Features 🎉

  • Add new create deployment params to the client
  • Added delete adapter modal to the repo UI
  • Deployment health tab, reorg of deployments module
  • Update Auth Service to allow V2 Invite Token Validation
  • Add User v2 API endpoints
  • Terminal logs use monospace font
  • Update documentation for connect file and connect pandas DataFrame calling parameters
  • Update CORS to allow authorization header
  • Clear React Query cache on signout

Bug Fixes 🐛

  • Set num_samples_to_generate in the augment data payload
  • Multiple deployment health page fixes
  • Small fixes for auth operations
  • Fix Auth0 frontend configuration
  • Fix login spam, change Auth0 signup link to go to signup form directly
  • Pass OpenAI API key to MoA through the SDK
  • Register augment_dataset
  • Augment activity tests & fixes
  • Augment SDK and endpoint fixes
  • Fix audience claim in JWT and mock out validate token function
  • Include dataset ID in upload workflow ID
  • Disable pricing details in deployment view for all enterprise customers
  • Timestamp fields in identity record
  • Missing return in Authorize middleware
  • API call failures due to undefined apiServer (caused by incorrect axios typings)
  • Hide incorrect Min/Max Replica counts
  • Infinite loop during pagination of GetCreditBalances
  • Fix debouncing and error handling on valid filename check in file connector
  • Fix thumbnails API call and dataset previewer rendering for object values
  • Add Description to Credit Card authorization
  • Fix datasets for connection API call (malformed string)

Documentation 📝

  • Synthetic data, SDK augment dataset and download dataset
  • Add delete adapters page
  • Update e2e guide and replicas defaults info and code snippets
  • Fix reference
  • Add llama-3.1 8B versions to docs
  • Fix comma in e2e example and colab link styling for evaluations
  • Fix outdated “qwen2 coming soon” note in docs

2024-07-17
v2024.7.1

New Features 🎉

  • Implement connecting a Pandas DataFrame as the Predibase Dataset (sketch after this list)
  • Accept shared and private as deployment list filter types
  • Add new Turbo LoRA fine-tuning support to the Predibase App and SDK
  • Add dataset upload troubleshooting section
  • Update Auth Service on Staging
  • Add version list method to repo in SDK
  • Allow download of solar models even for non-enterprise tenants
  • Add wandb logging to fine-tuning engine
  • Add support for qwen2
  • Plumb WandB token into engine as env var
  • Use batch tensor ops to speed up medusa computation
  • Add Private Solar to Predibase
  • Enable archiving and deleting adapter versions
  • Train Medusa LoRA adapters in Predibase
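
A minimal sketch of the pandas DataFrame connection mentioned above; the from_pandas_dataframe method name and dataset name are assumptions based on the feature description.

```python
# Sketch: connect a pandas DataFrame as a Predibase dataset. The
# from_pandas_dataframe method name is assumed from the feature description.
import pandas as pd
from predibase import Predibase

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")
df = pd.DataFrame(
    {
        "prompt": ["Translate to French: Hello", "Translate to French: Goodbye"],
        "completion": ["Bonjour", "Au revoir"],
    }
)
dataset = pb.datasets.from_pandas_dataframe(df, name="translation_pairs")  # assumed name
```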

Bug Fixes 🐛

  • Remove install package
  • Improve prompt UI matching when “-dequantized”/“-quantized” or “-instruct”/“-it” could be in the model’s name
  • Python 3.8 compatibility with Annotated
  • Avoid linting error during SDK installation caused by pydantic field
  • Clear learning curves store when websocket reconnects
  • Disable train button in create adapter view for expired tenants
  • Temporarily comment out L40S in accelerators.yaml
  • Update prices for accelerators
  • Syntax error in predibase-agent.yaml
  • Typo in min_replicas for SDK DeploymentConfig
  • Set default cooldown to 12 hours in SDK
  • Resolve inconsistent case sensitivity for LLM config lookup in deployment path
  • Make error nil on empty wandb vault value
  • Fix missing learning rate in medusa loss logs
  • Fix URL routing logic for old models
  • Show prompt instructions in kebab menu whenever adapter path is available
  • Fix broken docs link in main menu

Documentation 📝

  • Add warning about capacity
  • Add archiving and unarchiving adapter docs
  • Add notebooks
  • Remove vpc trial language from docs
  • Update docs
  • Added an example tutorial and modified the Models page
  • Add wandb