Changelog
Upcoming
New Features 🎉
- Hide utilization graph and only show speculative decoding graph if speculator is enabled
- Classification head training support
- Bring back speculative decoding chart in UI
- Add num_iterations support
- Add a warning when all scores are 0 for a set of completions
- Add reward server logging about reward function score
Bug Fixes 🐛
- Permissions checking for readonly user API tokens
- Remove duplicate pass over the example dict
- Send json data instead of url encoded data when creating connections from UI
- Cast additional input data to Python literals before passing it to the reward function
- Use CPU node for rewardserver when recreating
- Recreate services when retrying after backoff for compute unavailability
New Features 🎉
- Initial deployment metrics frontend
- Initial deployment metrics backend
- Refresh GRPO logs tab upon new step or new epoch
- Reward graph refinements
- Add and delete reward functions in the UI
- Show placeholders when there is no reward or completions data
- Reward graphs cleanup and improvements
- Add train_steps as parameter in SDK (see the sketch after this list)
- Add step counter for GRPO jobs
- Collapse timeline by default when GRPO job is training
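A minimal sketch of the new train_steps parameter, assuming the usual adapter-creation flow; the config is shown as a plain dict, and the model, dataset, and repo names are illustrative, so check the SDK docs for exact signatures:

```python
# Hypothetical sketch: cap training length with the new train_steps parameter.
from predibase import Predibase

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")  # your API token

adapter = pb.adapters.create(
    config={
        "base_model": "llama-3-1-8b-instruct",  # illustrative base model id
        "train_steps": 500,                     # new: stop after 500 steps
    },
    dataset="my_dataset",    # assumed dataset name
    repo="my-adapter-repo",  # assumed adapter repo
)
```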
Bug Fixes 🐛
- Fix deployment metric queries
- Use debounce to prevent log refetches from DDoSing the server
- Fix invite link URLs returned by API
- Enable adapter download in UI
- Fix GRPO multi-pod bugs around LoRA loading and unloading
- Only show edit/add/delete reward function buttons during training
- Load medusa weights from checkpoint separately
- Set max height on reward graph legends when legend takes up >30% of tile height
- Use right query key for reward functions query
- Stop Completions View from DDoSing the server for long-running GRPO jobs
- Make GRPO completions viewer data collection robust to restarts
- More reward graph refinements
- Limit to 2 concurrent batch jobs per tenant
- Log metrics once per step in GRPO and Turbo fine-tuning jobs
- Use better colors for reward graphs
- Show reward graphs even if GRPO job was cancelled
Documentation 📝
- Fix mistral chat template
- Add new turbo example to docs
- Add QwQ back to docs
- Clarify that turbo doesn’t use LoRA parameters
- Add RFT pricing info in docs
New Features 🎉
- Expose EMA slider for reward graphs
- Add synchronization to reward graphs
- Remove y-axis label from reward graphs
- Hide tabs when GRPO job is queued
- SDK function for getting config via adapters API (see the sketch after this list)
- Show total_reward and total_reward_std charts
- Reward Functions View
- Reward function graphs tab
- Add timestamps to deployment/GRPO logs views
- Add dropdown to deployment modal for configuring speculator options
- GRPO pod logs UI
- Automatic frontend proxy
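A hedged sketch of the new SDK accessor for fetching a config via the adapters API; pb.adapters.get is a known call pattern, but the .config attribute used here is an assumption:

```python
from predibase import Predibase

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

# Fetch an adapter version ("repo/version") and read back its training config.
adapter = pb.adapters.get("my-adapter-repo/1")
print(adapter.config)  # assumed attribute exposing the fine-tuning config
```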
Bug Fixes 🐛
- Treat reward function version history as ascending
- Remove extraneous comma from completions comparison viewer
- Properly reuse old reward function info if no code has changed
- Add virtualenv to rewardserver deps
- Remove extraneous parameter from metrics push
- Do not push duplicate completions metadata
- Show all versions of the same reward function on the same graph
- Print GRPO metrics properly in SDK
- Don’t require strong types for pushing payload to metrics stream
- Remove inaccurate cost display for VPC inference
- Skip normalizing the split column name if it is named “split”
- Don’t set base model in config when in continued training flow
Documentation 📝
- Wordsmith GRPO UI usage docs
- Change order of fine-tuning pages, rename GRPO page to RFT and add info msg about tiers
- GRPO feature docs and UI links
- Add example instructions for updating reward functions via SDK
- Add apply_chat_template to finetuning config docs
- Improvements to GRPO training docs
- Change int to float in reward function docs (sketch below)
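To illustrate the int-to-float change, a minimal reward function sketch; the (prompt, completion, example) signature is an assumption based on the reward-function entries in these notes:

```python
def length_reward(prompt: str, completion: str, example: dict) -> float:
    """Reward shorter completions; note the return type is float, not int."""
    budget = 512
    # Decay smoothly toward 0.0 as the completion exceeds the budget, so a
    # batch of completions is unlikely to score all zeros (which now logs a
    # warning, per an entry above).
    return max(0.0, 1.0 - len(completion) / budget)
```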
New Features 🎉
- Add GRPO to Predibase
Bug Fixes 🐛
- Allow uploaded adapters (which have no types) to pass through speculator validation
- Ignore missing servingParams in speculator selection during updates for private models
- Fix return_to URL routing
- Return lowercase literals for auto/disabled speculator, perform case-insensitive checks for create/update
- When preloading adapters, skip adapter status validation if falling back to medusa adapter as speculator
- Get base model name from local cache when continuing function calling training
- Refine the list of supported models for function calling
- If using a medusa adapter as the speculator, skip type validation
- Speculator param bug bash
- Extend timeout for post-preprocessing multi-GPU barrier
- VPC UI fixes
- Show “disabled”, not empty string for an empty speculator field in API responses
- Active batch jobs count against an independent limit from normal deployments
Documentation 📝
- Update GRPO docs with new reward functions payload
- Add docs for GRPO
- Document new model@revision syntax for specifying base models (see the sketch after this list)
- Add deepseek to models list and extra info msg
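The model@revision syntax pins a base model to a specific revision; a hedged example where the surrounding call and model id are illustrative (config shown as a plain dict for brevity):

```python
from predibase import Predibase

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

# "model@revision" fixes the base model to one revision instead of tracking
# the latest; the model id below is illustrative.
pb.deployments.create(
    name="pinned-deepseek",
    config={"base_model": "deepseek-r1-distill-qwen-7b@main"},
)
```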
New Features 🎉
- Add top-level speculator param to deployment config (see the sketch after this list)
- Raise batch inference model size restriction to 32B
- Add Preloaded Adapters and Guaranteed Capacity to Create Deployment UI
- Add delete deployment button to config tab
- Show tooltip for prompting keyboard shortcut in prompt UI
- Add similar time ranges to Grafana for deployment monitoring
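A sketch of the new top-level speculator parameter; per these notes it accepts the lowercase literals "auto" and "disabled" or an adapter reference, and supersedes the old --adapter-id/--adapter-source custom args. Field names are assumed:

```python
from predibase import Predibase, DeploymentConfig

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

pb.deployments.create(
    name="llama-with-speculator",
    config=DeploymentConfig(
        base_model="llama-3-1-8b-instruct",  # illustrative model id
        speculator="my-adapter-repo/3",      # adapter to speculate with,
                                             # or "auto" / "disabled"
    ),
)
```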
Bug Fixes 🐛
- Don’t pass --adapter-source twice when using speculator param
- If a speculator is in use, set the container arg --adapter-source to pbase
- Return error if --adapter-id custom arg and speculator param are both active
- Add checkpoint UUID for batch inference to avoid transaction ID collision
- Pass both system and user API tokens to sidecar for batch inference
- Batch inference dataset type check for file uploads
- Distinguish activity names for fetching internal API token between modules
- Make internal API tokens user-level and pass them to the LLM deployer via an activity
- Allow admin scope when validating user token
- Make created_by_user field of auth tokens nullable
- Move api token generation outside of batch deploy activity
- Properly handle vault’s return values for internal API token
- Automatically inject user API token into batch inference lorax container
- Remove extra slash in batch inference checkpoint endpoint
- Ensure default engines are assigned when moving user to new tenant
- Set is_batch flag when creating batch inference deployments
- Filter batch deployments in v2 API
- Clean up user signup workflow
- Increase size of work-volume for batch inference worker
- Add redirect after deleting deployment
- Adjust max_steps computation to account for multiple GPUs when fine-tuning with streaming data
- Callback fixes for multi-GPU training
- Clearer file and zip archive names for downloaded request logs
- Manually set cooldown period for HPA when configuring KEDA scaling
- Download adapters in the same tab (so correct headers are sent) instead of in new tab
- Restrict post-training logic to rank 0
- Fix leading zeros in S3 prefix formatting when searching for log files
- Update Auth0 Org and User connection validation
- Handle null reservations response in UI
- Tighten auth boundaries at the root of the app
- Separate Okta and Email Signin Buttons
- Use mutually exclusive conditionals around useNavigate hook
Documentation 📝
- Update config docs with new parameters
- Fix update command usage on private_deployments page
- Various docs fixes around getting rid of custom args
- Batch inference user guide
- Add timeouts and retries section to inference overview
- Add quick motivating example to quickstart
- Fix tool example
- Add redirect from old end-to-end URL to new
- Some improvements to end-to-end example documentation
- Formatting improvements for function calling docs page
New Features 🎉
- Make L4s available and fix missing L40s
- Reservations Tab V0, small reorganization of settings module
- Default to lora adapter type when continuing training
Bug Fixes 🐛
- Training timeline fixes
- Replace word “instances” with “replicas” in reservations UI
- Show L4 only for VPC
- Navigate calls should be inside useEffects
- Wait until environment is defined before deciding whether to redirect to VPC setup view
- Fix Zapier Workflow issue
- Avoid swallowing errors when enqueuing metrics to posthog client
- Replace all react router Navigate component usage with useNavigate hook
- Ensure LLM auth policies are re-setup when tenant upgrades
- Clear invite token when user proceeds to existing tenant
- Fix Metronome transaction ID conflicts for LLM deployments
- Allow readonly users to use the users/move endpoint
- Fetch accelerator properties from config when returning deployment metadata
- Hide new adapter version (normal training flow) option for turbo adapters
- Auto-select file uploads connection in new adapter view
- Fix and refine connection selector in Create Adapter View
- New adapter version from latest should use latest non-imported adapter
- Cast steps_per_epoch to int
- Delete parent model distributed logs when continuing training on the same dataset
Documentation 📝
- Add docs for function calling
- Add l40s to accelerator
- Update colab positioning in docs
- Clarify vision docs
New Features 🎉
- Track and return adapter output type for finetuning jobs
- Display adapter type on individual and list adapter page
- Turbo and Turbo LoRA batch size tuning support
- Add adapter filter to SDK function for request logs download (see the sketch after this list)
- Show rank, target modules, and base model for imported adapters
- Automatically infer base model when uploading adapters
- Track and return base version for continued finetuning jobs
- API to opt into/out of inference request logging
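A hedged sketch of the adapter-filtered request log download; download_request_logs is named in the docs entries below, while its location on pb.deployments and the adapter parameter are assumptions:

```python
from predibase import Predibase

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

# Download request logs for one deployment, restricted to a single adapter.
pb.deployments.download_request_logs(
    "my-deployment",
    adapter="my-adapter-repo/2",  # assumed name for the new filter parameter
)
```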
Bug Fixes 🐛
- Default value for REQUEST_LOGGING_ENABLED env var
- Turbo UI - hide parameters form in continue mode until ready
- Bump unsloth version to enable loading local adapter weights
- Fix V2 account deletion endpoint
- Fix continued training parameter logic
- For continued adapters, only show learning rate if the parent and child datasets are different
- Pass invite token to anonymous auth boundary
- Don’t set request_logger_url on lorax container unless logging is enabled
- Account for repos where adapters are not yet written
- Manually set missing values in adapter type column up migration
- Multiple Turbo UI fixes
- Disable timeline polling for imported adapters
- Fix missing hoisted properties
- Set batch size to 1 when continuing training on a new dataset
- Fix display of imported adapter properties on version page
- Fix parsing of adapter config file for displaying imported adapter properties
- Fix time bounds passing for request log download
- Fix log key parsing for request log download
- Fix fetching of base model name and conversion to shortname
- Load pretokenized batches on each worker instead of dispatching from rank 0
- Missing logic to pipe requestLoggingEnabled parameter through LLM deployment path
Documentation 📝
- Update docs for loading Turbo LoRA adapters by adapter ID
- Add MLlama user guide for VLM training
- Open up all base model support for turbo lora and turbo
- Fix header for download_request_logs
- Add docs for request logs opt-in and download
- Remove specifying base model when uploading an adapter
New Features 🎉
- Organization/Okta/SSO support in Auth0
- Allow users to click on archived adapters
- Enable continued training from imported adapters and with modified adapter type
- Expose usesGuaranteedCapacity in deployment config and on deployment object
- Add readonly user role and disable access to appropriate operations
- Move prompt UI’s tooltips to hover over labels/titles instead of over question mark icons
- Add support for Turbo adapters
- Enable DDP in Predibase
Bug Fixes 🐛
- Fix operation blockers for readonly users and preload for tenant information in user identity DB call
- Fix Org ID nil pointers
- Fix V2 invite link bugs and Auth0 casing inconsistencies
- Refresh page when merging accounts to get new JWT
- Update Current User to handle empty roles
- Store invite token so users can validate email
- Use correct parameter name for adapter type
- Disable delete user UI and fix wording on Sign In page
- Exclusively update config_obj during fine-tuning setup
- Move llm policy to create new tenant flow
- Dangling readonly role UI fixes
- Monospaced font
- Disable turbo / turbo-lora adapter type validation in the gateway for HF models
Documentation 📝
- Update pydantic model_json_schema() (see the sketch after this list)
- Add Llama-3.2 and Qwen-2.5 models to fine-tuning models page
- Consolidate two deployment models tables into one, add new models, and more
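The pydantic entry refers to the v2 API, where model_json_schema() replaces the v1 .schema() call, e.g. when building a JSON schema to pass for structured generation:

```python
from pydantic import BaseModel

class Ticket(BaseModel):
    title: str
    priority: int

# Pydantic v2: model_json_schema() replaces the deprecated v1 .schema()
schema = Ticket.model_json_schema()
print(schema["properties"]["priority"]["type"])  # -> "integer"
```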
New Features 🎉
- Implement chat format in Predibase
- Enable showing tensorboard with finetuning jobs
- Save lorax image tag as a new column in the deployments DB
- Disable update deployment button when deployment is not updatable
- Add TRAIN/EVALUATION split check
- Add support for fine-tuning VLMs in Predibase
Bug Fixes 🐛
- Add adaptive tag to h100 engine
- Remove possibly spurious error raise param in copy_within_bucket
- Default value for checkpoint_step param in update_trainer_settings activity
- Disable update deployment button popup when updatable
- Replace usage of overloaded deployment model name param with new model path param
- Avoid resetting engine during failure cleanup if it was never started
- Fix another incorrect reference to target instead of base version
- Avoid using protocol prefix when doing itemized model file copy
- Additional migration logic fixes for continued training
- Render turbo lora message in adapter builder correctly
- Correctly read checkpoint from base version
- Fix runtime issues in finetuning flow related to new checkpoint code
- Fix reading images as bytes from parquet files
- Fix dayjs plugin errors with vite
- Fix vite production build
- Improve model matching logic (replace periods with dashes in mod…)
Documentation 📝
- Small R17 docs updates
- Update docs
- Messages docs
- Add docs and expose param for showing tensorboard
- Add docs for apply chat template
- Update docs for continuing to train from arbitrary checkpoint
- Update Comet integration page and add force_bare_client=True
- Move headers into REST API section
New Features 🎉
- Add solar pro instruct model
- Add convenience method for getting openai-compatible endpoint URL for deployments
- Add deployment update and get_recommended_config functions, expose quantization on deployment object (sketch below)
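A hedged sketch combining the SDK additions above; attribute and method names mirror the entries but are not verified signatures:

```python
from predibase import Predibase

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

deployment = pb.deployments.get("my-deployment")
print(deployment.quantization)  # quantization now exposed (may be None)

# Assumed names for the new convenience methods:
# url = pb.deployments.openai_compatible_endpoint("my-deployment")
# recommended = pb.deployments.get_recommended_config("solar-pro-instruct")
```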
Bug Fixes 🐛
- Mark quantization optional on deployment object
Documentation 📝
- Improve SDK docs
- Don’t use snake case for lorax args
New Features 🎉
- Add guardrails for continued training with new dataset case
- Load adapter weights from checkpoint without additional state
- Add custom_args, Quantization and Update Deployment to UI
- Ingest arrow datasets (see the sketch after this list)
- Add support for dataset streaming for large, pretokenized datasets
- Add frontend for Comet API key integration
- Add support for Comet experiment tracking integration
- Add Phi-3.5 model to Predibase
- Invoke resume state setup activities in FinetuningWorkflow
- Add tooltip on guaranteed deployment chip
- Guaranteed deployment chip in UI
- Add Mistral Nemo models
- Add latest gpt-4o model for augmentation
- Reveal GPU utilization metrics to all, nowhere to hide
- Enable turbo lora for solar-mini
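For the Arrow ingestion entry, a sketch assuming the existing file uploader accepts .arrow files; the from_file method name and arguments follow the common SDK pattern but are not verified for this case:

```python
from predibase import Predibase

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

# Upload a pretokenized Arrow dataset; pairs with the new streaming support
# for large, pretokenized datasets noted above.
dataset = pb.datasets.from_file("train.arrow", name="pretokenized_train")
```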
Bug Fixes 🐛
- Fix issue during synthesis with MoA
- Set adapter weights path on config rather than config_obj
- Update catch-all error message to be more general
- Fix nil pointer and description usage in update deployment handler
- Various fixes for continued training
- Set base_model correctly in metadata when continuing training, also additional guardrails
- Use fully resolved model name when creating a deployment
- Handle 404s in the Prompt UI
- Minor fixes for checkpoint resume
- Parameter passing to new continued training migration activities
- Remove unnecessary SDK-side default value for enable_early_stopping parameter
- Fix several bugs found in Engines and Model Versions pages
- Fix Pydantic model for Adapter versions to have correct tag and checkpoint types
- Guard import of HF transformers from the critical SDK path
- Don’t swallow error when loading specific checkpoint
- Model Version page loads again
- Fix API call to set default engines in UI
- Restart fine-tuning engine pods when the nvidia driver is not loaded
New Features 🎉
- Add new simple augmentation strategy
- Validate WandB key in the client and in fine-tuning job
- Allow users to provide a name for generated datasets
Bug Fixes 🐛
- Deployment health page labels: seconds not milliseconds
- Surface WandB token validation error message
- Point WandB token validation to validate_wandb_token
- Fix training engines table crash (due to improper null handling in engine modals)
- Update the UI with the correct key when creating an LLM
- Multiple deployment health fixes
Documentation 📝
- Add async create finetuning job (see the sketch after this list)
- Enable turbo lora support for Zephyr
- Add name parameter to augmentation docs
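The async create finetuning job entry suggests starting a job without blocking on completion; one plausible shape, where every name below is an assumption:

```python
from predibase import Predibase

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

job = pb.finetuning.jobs.create(
    config={"base_model": "llama-3-1-8b-instruct"},  # illustrative config
    dataset="my_dataset",
    repo="my-adapter-repo",
    watch=False,  # assumed flag: return immediately instead of tailing logs
)
```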
New Features 🎉
- Add new create deployment params to the client
- Add delete adapter modal to the repo UI
- Deployment health tab, reorg of deployments module
- Update Auth Service to allow V2 Invite Token Validation
- Add User v2 API endpoints
- Terminal logs use monospace font
- Update documentation for connect file and connect pandas dataframe calling parameters
- Update CORS to allow authorization header
- Clear React Query cache on signout
Bug Fixes 🐛
- Set num_samples_to_generate in the augment data payload
- Multiple deployment health page fixes
- Small fixes for auth operations
- Fix Auth0 frontend configuration
- Fix login spam, change Auth0 signup link to go to signup form directly
- Pass OpenAI API key to MoA through the SDK
- Register augment_dataset
- Augment activity tests & fixes
- Augment SDK and endpoint fixes
- Fix audience claim in JWT and mock out validate token function
- Include dataset ID in upload workflow ID
- Disable pricing details in deployment view for all enterprise customers
- Fix timestamp fields in identity record
- Fix missing return in Authorize middleware
- Fix API call failures due to undefined apiServer (caused by incorrect axios typings)
- Hide incorrect Min/Max Replica counts
- Fix infinite loop during pagination of GetCreditBalances
- Fix debouncing and error handling on valid filename check in file connector
- Fix thumbnails API call and dataset previewer rendering for object values
- Add Description to Credit Card authorization
- Fix datasets for connection API call (malformed string)
Documentation 📝
- Add docs for synthetic data, SDK augment dataset, and download dataset
- Add delete adapters page
- Update e2e guide, replicas defaults info, and code snippets
- Fix reference
- Add llama-3.1 8B versions to docs
- Fix comma in e2e example and colab link styling for evaluations
- Fix “qwen2 coming soon” note in docs
New Features 🎉
- Implement connecting a Pandas DataFrame as the Predibase Dataset (see the sketch after this list)
- Accept shared and private as deployment list filter types
- Add new Turbo LoRA fine-tuning support to the Predibase App and SDK
- Add dataset upload troubleshooting section
- Update Auth Service on Staging
- Add version list method to repo in SDK
- Allow download of solar models even for non-enterprise tenants
- Add wandb logging to fine-tuning engine
- Add support for qwen2
- Plumb WandB token into engine as env var
- Use batch tensor ops to speed up medusa computation
- Add Private Solar to Predibase
- Enable archiving and deleting adapter versions
- Train Medusa LoRA adapters in Predibase
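A sketch of the new pandas DataFrame connector; from_pandas_dataframe is an assumed method name based on the entry above:

```python
import pandas as pd
from predibase import Predibase

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

df = pd.DataFrame({"prompt": ["What is 2+2?"], "completion": ["4"]})
dataset = pb.datasets.from_pandas_dataframe(df, name="toy_math")  # assumed name
```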
Bug Fixes 🐛
- Remove install package
- Improve prompt UI matching when “-dequantized”/“-quantized” or “-instruct”/“-it” could be in the model’s name
- Fix Python 3.8 compatibility with Annotated
- Avoid linting error during SDK installation caused by pydantic field
- Clear learning curves store when websocket reconnects
- Disable train button in create adapter view for expired tenants
- Temporarily comment out L40S in accelerators.yaml
- Update prices for accelerators
- Fix syntax error in predibase-agent.yaml
- Fix typo in min_replicas for SDK DeploymentConfig
- Set default cooldown to 12 hours in SDK
- Resolve inconsistent case sensitivity for LLM config lookup in deployment path
- Make error nil on empty wandb vault value
- Fix missing learning rate in medusa loss logs
- Fix URL routing logic for old models
- Show prompt instructions in kebab menu whenever adapter path is available
- Fix broken docs link in main menu
Documentation 📝
- Add warning about capacity
- Add archiving and unarchiving adapter docs
- Add notebooks
- Remove vpc trial language from docs
- Update docs
- Add an example tutorial and modify the Models page
- Add wandb