Upcoming

New Features 🎉

  • New in-app homepage
  • Allow SaaS users to deploy on L4

Bug Fixes 🐛

  • Fix GRPO job logs exclusion regex

ETA: June 18, 2025

2025-05-19
v2025.5.1

New Features 🎉

  • Hide utilization graph, only show spec decoding graph if speculator enabled
  • Classification head training support
  • Bring back speculative decoding chart in UI
  • Add num_iterations support (see the SDK sketch after this list)
  • Add a warning when all scores are 0 for a set of completions
  • Add reward server logging about reward function score
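
The num_iterations option can be passed through the SDK's training config. A minimal sketch, assuming a dict-style config; the method, field, and model names here are illustrative, not confirmed API:

    # Illustrative sketch only: config fields and model names are assumptions.
    from predibase import Predibase

    pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

    adapter = pb.adapters.create(
        config={
            "base_model": "qwen2-5-7b-instruct",  # placeholder model name
            "num_iterations": 2,                  # new option from this release
        },
        dataset="my_dataset",    # placeholder dataset name
        repo="my-adapter-repo",  # placeholder repo name
    )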

Bug Fixes 🐛

  • Permissions checking for readonly user API tokens
  • Remove duplicate pass over the example dict
  • Send json data instead of url encoded data when creating connections from UI
  • Cast additional input data to Python literals before passing it to the reward function
  • Use CPU node for rewardserver when recreating
  • Recreate services when retrying after backoff for compute unavailability

2025-04-24
v2025.4.1

New Features 🎉

  • Initial deployment metrics frontend
  • Initial deployment metrics backend
  • Refresh GRPO logs tab upon new step or new epoch
  • Reward graph refinements
  • Add and delete reward functions in the UI
  • Show placeholders when there is no reward or completions data
  • Reward graphs cleanup and improvements
  • Add train_steps as parameter in SDK (sketch after this list)
  • Add step counter for GRPO jobs
  • Collapse timeline by default when GRPO job is training
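
A hedged sketch of the new train_steps SDK parameter, assuming a FinetuningConfig-style config object; the exact fields shown are assumptions:

    # Sketch: capping training at a fixed number of steps via train_steps.
    from predibase import FinetuningConfig, Predibase

    pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

    adapter = pb.adapters.create(
        config=FinetuningConfig(
            base_model="llama-3-1-8b-instruct",  # placeholder
            train_steps=500,                     # new SDK parameter (assumed name)
        ),
        dataset="my_dataset",
        repo="my-adapter-repo",
    )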

Bug Fixes 🐛

  • Fix deployment metric queries
  • Use debounce to prevent log refetching from DDoSing the server
  • Fix invite link URLs returned by API
  • Enable adapter download in UI
  • GRPO multipod bugs around lora loading and unloading
  • Only show edit/add/delete reward function buttons during training
  • Load medusa weights from checkpoint separately
  • Set max height on reward graph legends when legend takes up >30% of tile height
  • Use right query key for reward functions query
  • Stop Completions View from DDoSing the server for long-running GRPO jobs
  • Make GRPO completions viewer data collection robust to restarts
  • More reward graph refinements
  • Limit to 2 concurrent batch jobs per tenant
  • Log metrics once per step in GRPO and Turbo fine-tuning jobs
  • Use better colors for reward graphs
  • Show reward graphs even if GRPO job was cancelled

Documentation 📝

  • Fix mistral chat template
  • Add new turbo example to docs
  • Add QwQ back to docs
  • Clarify that turbo doesn’t use LoRA parameters
  • Add RFT pricing info in docs

2025-03-18
v2025.3.1

New Features 🎉

  • Expose EMA slider for reward graphs
  • Add synchronization to reward graphs
  • Remove y-axis label from reward graphs
  • Hide tabs when GRPO job is queued
  • SDK function for getting config via adapters API (sketch after this list)
  • Show total_reward and total_reward_std charts
  • Rewards Function View
  • Reward function graphs tab
  • Add timestamps to deployment/GRPO logs views
  • Add dropdown to deployment modal for configuring speculator options
  • GRPO pod logs UI
  • Automatic frontend proxy
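
A sketch of fetching a training config via the adapters API; the accessor names are assumptions for illustration:

    # Sketch: reading a fine-tuning config back through the adapters API.
    from predibase import Predibase

    pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

    adapter = pb.adapters.get("my-adapter-repo/1")  # "repo/version" reference (assumed)
    print(adapter.config)  # config used to train this version (assumed field)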

Bug Fixes 🐛

  • Treat reward function version history as ascending
  • Remove extraneous comma from completions comparison viewer
  • Properly reuse old reward function info if no code has changed
  • Add virtualenv to rewardserver deps
  • Remove extraneous parameter from metrics push
  • Do not push duplicate completions metadata
  • Show all versions of the same reward function on the same graph
  • Print GRPO metrics properly in SDK
  • Don’t require strong types for pushing payload to metrics stream
  • Remove inaccurate cost display for VPC inference
  • Skip normalizing the split column name if it is named “split”
  • Don’t set base model in config when in continued training flow

Documentation 📝

  • Wordsmith GRPO UI usage docs
  • Change order of fine-tuning pages, rename GRPO page to RFT and add info msg about tiers
  • GRPO feature docs and UI links
  • Add example instructions for updating reward functions via SDK (see the sketch after this list)
  • Add apply_chat_template to finetuning config docs
  • Improvements to GRPO training docs
  • Change int to float in reward function docs
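
A hedged sketch of updating a reward function via the SDK; update_reward_functions and its signature are assumed names for illustration:

    # Sketch only: update_reward_functions is an assumed method name.
    from predibase import Predibase

    def formatting_reward(prompt: str, completion: str, example: dict) -> float:
        # Reward functions return float scores (see the int-to-float docs fix above).
        return 1.0 if completion.strip().startswith("{") else 0.0

    pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")
    pb.adapters.update_reward_functions(  # assumed method name
        "my-adapter-repo/2",              # adapter version of the running GRPO job
        reward_fns={"formatting": formatting_reward},
    )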

2025-02-27
v2025.2.27

New Features 🎉

  • Add GRPO to Predibase

Bug Fixes 🐛

  • Allow uploaded adapters (which have no types) to pass through speculator validation
  • Ignore missing servingParams in speculator selection during updates for private models
  • Fix return_to URL routing
  • Return lowercase literals for auto/disabled speculator, perform case-insensitive checks for create/update
  • When preloading adapters, skip adapter status validation if falling back to medusa adapter as speculator
  • Get base model name from local cache when continuing function calling training
  • Refine the list of supported models for function calling
  • If using a medusa adapter as the speculator, skip type validation
  • Speculator param bug bash
  • Extend timeout for post-preprocessing multi-GPU barrier
  • VPC UI fixes
  • Show “disabled”, not empty string for an empty speculator field in API responses
  • Active batch jobs count against an independent limit from normal deployments

Documentation 📝

  • Update GRPO docs with new reward functions payload
  • Add docs for GRPO
  • Document new model@revision syntax for specifying base models (example after this list)
  • Add deepseek to models list and extra info msg
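
The model@revision syntax pins a base model to a specific revision. A minimal illustration; the field and values are placeholders:

    # Pin a base model to a specific revision with the model@revision syntax.
    config = {
        "base_model": "my-org/my-model@abc1234",  # placeholder repo and revision
    }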

2025-02-05
v2025.1.1

New Features 🎉

  • Add top-level speculator param to deployment config (sketch after this list)
  • Raise batch inference model size restriction to 32B
  • Add Preloaded Adapters and Guaranteed Capacity to Create Deployment UI
  • Add delete deployment button to config tab
  • Show tooltip for prompting keyboard shortcut in prompt UI
  • Add similar time ranges to Grafana for deployment monitoring
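
A sketch of the top-level speculator param; DeploymentConfig is named elsewhere in these notes, and "auto"/"disabled" follow the literals mentioned under v2025.2.27, but the exact fields shown are assumptions:

    # Sketch: speculator accepts "auto", "disabled", or an adapter reference (assumed).
    from predibase import DeploymentConfig, Predibase

    pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

    pb.deployments.create(
        name="my-deployment",
        config=DeploymentConfig(
            base_model="solar-1-mini-chat-240612",  # placeholder
            speculator="auto",                      # new top-level param
        ),
    )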

Bug Fixes 🐛

  • Don’t pass --adapter-source twice when using speculator param
  • If a speculator is in use, set the container arg --adapter-source to pbase
  • Return error if --adapter-id custom arg and speculator param are both active
  • Add checkpoint UUID for batch inference to avoid transaction ID collision
  • Pass both system and user API tokens to sidecar for batch inference
  • Batch inference dataset type check for file uploads
  • Distinguish activity names for fetching internal API token between modules
  • Make internal API tokens user-level and use activity to pass to llm deployer
  • Allow admin scope when validating user token
  • Make created_by_user field of auth tokens nullable
  • Move api token generation outside of batch deploy activity
  • Properly handle vault’s return values for internal API token
  • Automatically inject user API token into batch inference lorax container
  • Remove extra slash in batch inference checkpoint endpoint
  • Ensure default engines are assigned when moving user to new tenant
  • Set is_batch flag when creating batch inference deployments
  • Filter batch deployments in v2 API
  • Clean up user signup workflow
  • Increase size of work-volume for batch inference worker
  • Add redirect after deleting deployment
  • Adjust max_steps computation to account for multiple GPUs when fine-tuning with streaming data (see the sketch after this list)
  • Callback fixes for multi-GPU training
  • Clearer file and zip archive names for downloaded request logs
  • Manually set cooldown period for HPA when configuring KEDA scaling
  • Download adapters in the same tab (so correct headers are sent) instead of in new tab
  • Restrict post-training logic to rank 0
  • Leading zeros in S3 prefix formatting when searching for log files
  • Update Auth0 Org and User connection validation
  • Handle null reservations response in UI
  • Tighten auth boundaries at the root of the app
  • Separate Okta and Email Signin Buttons
  • Use mutually exclusive conditionals around useNavigate hook
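
The max_steps adjustment above amounts to accounting for the global batch growing with GPU count. One plausible form of the computation; the variable names are assumptions:

    import math

    def compute_max_steps(num_examples: int, epochs: int,
                          batch_size: int, num_gpus: int) -> int:
        # With streaming data sharded across GPUs, each optimizer step consumes
        # batch_size * num_gpus examples, so the step budget shrinks accordingly.
        return math.ceil(num_examples * epochs / (batch_size * num_gpus))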

Documentation 📝

  • Update config docs with new parameters
  • Fix update command usage on private_deployments page
  • Various docs fixes around getting rid of custom args
  • Batch inference user guide
  • Add timeouts and retries section to inference overview
  • Add quick motivating example to quickstart
  • Fix tool example
  • Add redirect from old end-to-end URL to new
  • Some improvements to end-to-end example documentation
  • Formatting improvements for function calling docs page

2024-12-17
v2024.12.1

New Features 🎉

  • Make L4s available and fix missing L40s
  • Reservations Tab V0, small reorganization of settings module
  • Default to lora adapter type when continuing training

Bug Fixes 🐛

  • Training timeline fixes
  • Replace word “instances” with “replicas” in reservations UI
  • Show L4 only for VPC
  • Navigate calls should be inside useEffects
  • Wait until environment is defined before deciding whether to redirect to VPC setup view
  • Fix Zapier Workflow issue
  • Avoid swallowing errors when enqueuing metrics to posthog client
  • Replace all react router Navigate component usage with useNavigate hook
  • Ensure LLM auth policies are re-setup when tenant upgrades
  • Clear invite token when user proceeds to existing tenant
  • Metronome transaction ID conflicts for LLM deployments
  • Allow readonly users to use the users/move endpoint
  • Fetch accelerator properties from config when returning deployment metadata
  • Hide new adapter version (normal training flow) option for turbo adapters
  • Auto-select file uploads connection in new adapter view
  • Fix and refine connection selector in Create Adapter View
  • New adapter version from latest should use latest non-imported adapter
  • Cast steps_per_epoch to int
  • Delete parent model distributed logs when continuing training on the same dataset

Documentation 📝

  • Add docs for function calling
  • Add L40S to accelerator docs
  • Update colab positioning in docs
  • Clarify vision docs

2024-11-20
v2024.11.1

New Features 🎉

  • Track and return adapter output type for finetuning jobs
  • Display adapter type on individual and list adapter page
  • Turbo and Turbo LoRA batch size tuning support
  • Add adapter filter to SDK function for request logs download (sketch after this list)
  • Show rank, target modules, and base model for imported adapters
  • Automatically infer base model when uploading adapters
  • Track and return base version for continued finetuning jobs
  • API to opt into/out of inference request logging
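
download_request_logs is referenced in this release's docs notes; a hedged sketch of the new adapter filter (the call site and parameter name are assumptions):

    # Sketch: filtering downloaded request logs to a single adapter.
    from predibase import Predibase

    pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")
    pb.deployments.download_request_logs(
        "my-deployment",
        adapter_id="my-adapter-repo/3",  # assumed filter parameter from this release
    )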

Bug Fixes 🐛

  • Default value for REQUEST_LOGGING_ENABLED env var
  • Turbo UI - hide parameters form in continue mode until ready
  • Bump unsloth version to enable loading local adapter weights
  • Fix V2 account deletion endpoint
  • Fix continued training parameter logic
  • For continued adapters, only show learning rate if the parent and child datasets are different
  • Pass invite token to anonymous auth boundary
  • Don’t set request_logger_url on lorax container unless logging is enabled
  • Account for repos where adapters are not yet written
  • Manually set missing values in the adapter type column’s up migration
  • Multiple Turbo UI fixes
  • Disable timeline polling for imported adapters
  • Fix missing hoisted properties
  • Set batch size to 1 when continuing training on a new dataset
  • Display of imported adapter properties on version page
  • Parsing of adapter config file for displaying imported adapter properties
  • Time bounds passing for request log download
  • Log key parsing for request log download
  • Fetching base model name and conversion to shortname
  • Load pretokenized batches on each worker instead of dispatching from rank 0
  • Missing logic to pipe requestLoggingEnabled parameter through LLM deployment path

Documentation 📝

  • Update turbo lora adapter id load docs
  • Add MLlama user guide for VLM training
  • Open up all base model support for turbo lora and turbo
  • Fix header for download_request_logs
  • Add docs for request logs opt-in and download
  • Remove specifying base model when uploading an adapter

2024-10-31
v2024.10.2

New Features 🎉

  • Organization/Okta/SSO support in Auth0
  • Allow users to click on archived adapters
  • Enable continued training from imported adapters and with modified adapter type
  • Expose usesGuaranteedCapacity in deployment config and on deployment object
  • Add readonly user role and disable access to appropriate operations
  • Move prompt UI’s tooltips to hover over labels/titles instead of over question mark icons
  • Add support for Turbo adapters
  • Enable DDP in Predibase

Bug Fixes 🐛

  • Fix operation blockers for readonly users and preload tenant information in the user identity DB call
  • Fix Org ID nil pointers
  • Fix V2 invite link bugs and Auth0 casing inconsistencies
  • Refresh page when merging accounts to get new JWT
  • Update Current User to handle empty roles
  • Store invite token so users can validate email
  • Use correct parameter name for adapter type
  • Disable delete user UI and fix wording on Sign In page
  • Exclusively update config_obj during fine-tuning setup
  • Move llm policy to create new tenant flow
  • Dangling readonly role UI fixes
  • Monospaced font
  • Disable turbo / turbo-lora adapter type validation in the gateway for HF models

Documentation 📝

  • Update pydantic model_json_schema()
  • Add Llama-3.2 and Qwen-2.5 models to fine-tuning models page
  • Consolidate two deployment models tables into one, add new models, and more

2024-10-03
v2024.10.1

New Features 🎉

  • Implement chat format in Predibase
  • Enable showing tensorboard with finetuning jobs (sketch after this list)
  • Save lorax image tag as a new column in the deployments DB
  • Disable update deployment button when deployment is not updatable
  • Add TRAIN/EVALUATION split check
  • Add support for fine-tuning VLMs in Predibase
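
A sketch of opting into TensorBoard for a fine-tuning job; the show_tensorboard flag name is an assumption based on the "expose param for showing tensorboard" docs note below:

    # Sketch: assumed show_tensorboard flag when creating an adapter.
    from predibase import FinetuningConfig, Predibase

    pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")
    adapter = pb.adapters.create(
        config=FinetuningConfig(base_model="llama-3-1-8b-instruct"),  # placeholder
        dataset="my_dataset",
        repo="my-adapter-repo",
        show_tensorboard=True,  # assumed parameter name
    )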

Bug Fixes 🐛

  • Add adaptive tag to h100 engine
  • Remove possibly spurious error raise param in copy_within_bucket
  • Default value for checkpoint_step param in update_trainer_settings activity
  • Disable update deployment button popup when updatable
  • Replace usage of overloaded deployment model name param with new model path param
  • Avoid resetting engine during failure cleanup if it was never started
  • Another incorrect reference to target instead of base version
  • Avoid using protocol prefix when doing itemized model file copy
  • Additional migration logic fixes for continued training
  • Render turbo lora message in adapter builder correctly
  • Correctly read checkpoint from base version
  • Runtime issues in finetuning flow related to new checkpoint code
  • Fix reading image as bytes from parquet file
  • Fix dayjs plugin errors with vite
  • Fix vite production build
  • Improve model matching logic (replace periods with dashes in mod…)

Documentation 📝

  • Small R17 docs updates
  • Update docs
  • Messages docs
  • Add docs and expose param for showing tensorboard
  • Add docs for apply chat template
  • Update docs for continuing to train from arbitrary checkpoint
  • Update Comet integration page and add force_bare_client=True
  • Move headers into REST API section

2024-09-12
v2024.9.3

New Features 🎉

  • Add solar pro instruct model
  • Add convenience method for getting openai-compatible endpoint URL for deployments (sketch after this list)
  • Add deployment update and get_recommended_config functions, expose quantization on deployment object
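
The endpoint-URL convenience method pairs naturally with the OpenAI client. A sketch assuming the method returns a base URL; the accessor name is illustrative:

    # Sketch: prompting a deployment through its OpenAI-compatible endpoint.
    from openai import OpenAI
    from predibase import Predibase

    pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")
    base_url = pb.deployments.get("my-deployment").openai_compatible_endpoint  # assumed

    client = OpenAI(api_key="<PREDIBASE_API_TOKEN>", base_url=base_url)
    resp = client.completions.create(model="my-deployment", prompt="Hello!", max_tokens=16)
    print(resp.choices[0].text)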

Bug Fixes 🐛

  • Mark quantization optional on deployment object

Documentation 📝

  • Improve SDK docs
  • Don’t use snake case for lorax args

2024-09-05
v2024.9.1

New Features 🎉

  • Add guardrails for continued training with new dataset case
  • Load adapter weights from checkpoint without additional state
  • Add custom_args, Quantization and Update Deployment to UI
  • Ingest arrow datasets
  • Add support for dataset streaming for large, pretokenized datasets
  • Add frontend for Comet API key integration
  • Add support for Comet experiment tracking integration
  • Add Phi-3.5 model to Predibase
  • Invoke resume state setup activities in FinetuningWorkflow
  • Add tooltip on guaranteed deployment chip
  • Guaranteed deployment chip in UI
  • Add Mistral Nemo models
  • Add latest gpt-4o model for augmentation
  • Reveal GPU utilization metrics to all, nowhere to hide
  • Enable turbo lora for solar-mini

Bug Fixes 🐛

  • Fix issue during synthesis with MoA
  • Set adapter weights path on config rather than config_obj
  • Update catch-all error message to be more general
  • Fix nil pointer and description usage in update deployment handler
  • Various fixes for continued training
  • Set base_model correctly in metadata when continuing training, also additional guardrails
  • Use fully resolved model name when creating a deployment
  • Handle 404s in the Prompt UI
  • Minor fixes for checkpoint resume
  • Parameter passing to new continued training migration activities
  • Remove unnecessary SDK-side default value for enable_early_stopping parameter
  • Fix several bugs found in Engines and Model Versions pages
  • Pydantic model for Adapter versions to have correct tag and checkpoint types
  • Guard import of HF transformers from the critical SDK path
  • Don’t swallow error when loading specific checkpoint
  • Model Version page loads again
  • Fix API call to set default engines in UI
  • Restart fine-tuning engine pods when the nvidia driver is not loaded

2024-08-15
v2024.8.4

New Features 🎉

  • Add new simple augmentation strategy
  • Validate WandB key in the client and in fine-tuning job
  • Allow users to provide a name for generated datasets

Bug Fixes 🐛

  • Deployment health page labels: seconds not milliseconds
  • Surface WandB token validation error message
  • Point WandB token validation to validate_wandb_token
  • Fix training engines table crash (due to improper null handling in engine modals)
  • Update the UI with the correct key when creating an LLM
  • Multiple deployment health fixes

Documentation 📝

  • Add async create finetuning job
  • Enable turbo lora support for Zephyr
  • Add name parameter to augmentation docs

2024-08-08
v2024.8.1

New Features 🎉

  • Add new create deployment params to the client
  • Add delete adapter modal to the repo UI
  • Deployment health tab, reorg of deployments module
  • Update Auth Service to allow V2 Invite Token Validation
  • Add User v2 API endpoints
  • Terminal logs use monospace font
  • Update documentation for connect file and connect pandas DataFrame calling parameters
  • Update CORS to allow authorization header
  • Clear React Query cache on signout

Bug Fixes 🐛

  • Set num_samples_to_generate in the augment data payload
  • Multiple deployment health page fixes
  • Small fixes for auth operations
  • Fix Auth0 frontend configuration
  • Fix login spam, change Auth0 signup link to go to signup form directly
  • Pass OpenAI API key to MoA through the SDK
  • Register augment_dataset
  • Augment activity tests & fixes
  • Augment SDK and endpoint fixes
  • Fix audience claim in JWT and mock out validate token function
  • Include dataset ID in upload workflow ID
  • Disable pricing details in deployment view for all enterprise customers
  • Timestamp fields in identity record
  • Missing return in Authorize middleware
  • API call failures due to undefined apiServer (caused by incorrect axios typings)
  • Hide incorrect Min/Max Replica counts
  • Infinite loop during pagination of GetCreditBalances
  • Fix debouncing and error handling on valid filename check in file connector
  • Fix thumbnails API call and dataset previewer rendering for object values
  • Add Description to Credit Card authorization
  • Fix datasets for connection API call (malformed string)

Documentation 📝

  • Synthetic data, SDK augment dataset and download dataset
  • Add delete adapters page
  • Update e2e guide and replicas defaults info and code snippets
  • Fix reference
  • Add llama-3.1 8B versions to docs
  • Fix comma in e2e example and colab link styling for evaluations
  • Fix qwen2 “coming soon” note

2024-07-17
v2024.7.1

New Features 🎉

  • Implement connecting a Pandas DataFrame as the Predibase Dataset (see the sketch after this list)
  • Accept shared and private as deployment list filter types
  • Add new Turbo LoRA fine-tuning support to the Predibase App and SDK
  • Add dataset upload troubleshooting section
  • Update Auth Service on Staging
  • Add version list method to repo in SDK
  • Allow download of solar models even for non-enterprise tenants
  • Add wandb logging to fine-tuning engine
  • Add support for qwen2
  • Plumb WandB token into engine as env var
  • Use batch tensor ops to speed up medusa computation
  • Add Private Solar to Predibase
  • Enable archiving and deleting adapter versions
  • Train Medusa LoRA adapters in Predibase
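
A sketch of connecting a pandas DataFrame as a Predibase dataset; from_pandas_dataframe is an assumed method name for illustration:

    # Sketch: registering an in-memory DataFrame as a Predibase dataset.
    import pandas as pd
    from predibase import Predibase

    pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")
    df = pd.DataFrame({"prompt": ["What is 2 + 2?"], "completion": ["4"]})
    dataset = pb.datasets.from_pandas_dataframe(df, name="toy_math")  # assumed name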

Bug Fixes 🐛

  • Remove install package
  • Improve prompt UI matching when “-dequantized”/“-quantized” or “-instruct”/“-it” could be in the model’s name
  • Python 3.8 compatibility with Annotated
  • Avoid linting error during SDK installation caused by pydantic field
  • Clear learning curves store when websocket reconnects
  • Disable train button in create adapter view for expired tenants
  • Temporarily comment out L40S in accelerators.yaml
  • Update prices for accelerators
  • Syntax error in predibase-agent.yaml
  • Typo in min_replicas for SDK DeploymentConfig
  • Set default cooldown to 12 hours in SDK
  • Resolve inconsistent case sensitivity for LLM config lookup in deployment path
  • Make error nil on empty wandb vault value
  • Fix missing learning rate in medusa loss logs
  • Fix URL routing logic for old models
  • Show prompt instructions in kebab menu whenever adapter path is available
  • Docs link broken in main menu

Documentation 📝

  • Add warning about capacity
  • Add archiving and unarchiving adapter docs
  • Add notebooks
  • Remove vpc trial language from docs
  • Update docs
  • Add an example tutorial and modify the Models page
  • Add wandb