Model Repositories
Introduction
The Models page shows a table of your team's model repositories. Each model repository lists its name, associated datasets, contributors, latest version, and last-modified timestamp.
Let's click into the Twitter Bots Model Repository as an example.
Model Versions
The first page we land on after diving into a model repository is the model versions page. Here we see all associated model versions and metadata.
- #: The model version number, which is the unique identifier for the model version within this model repository. The model version number is based on the order in which the model was trained.
- Description: The description for a given model, which can be specified in the Model Builder and modified from the Model Repo and Model Version Page. Add any notes that will help you differentiate this model version from others, for example, parameter changes or notes about the model performance.
- Datasets / Targets: A concatenation of the dataset and the target (output feature) you're predicting. When this value is shown in blue, that version is the default used for PQL PREDICT queries.
- Engine: The engine used to train the model version. For best performance, we recommend using training engines.
- Metric: The default metric based on the output type of the latest model in the repo. For example, category -> ROC AUC and binary -> Accuracy.
- Duration: The total duration of model training, including queuing, preprocessing, training, and evaluation.
- Created: The timestamp of when the model was first created.
- Status: The model status. Models can be in the following states:
  - Queued: Waiting for an engine to become available and initialize. Engine statuses can be monitored on the Engines page.
  - Preprocessing: Dataset preprocessing (for example, dataset splitting, balancing, or applying normalization).
  - Training: Model training. You can view learning curves live as model training progresses to completion.
  - Evaluating: Calculating final performance metrics at the best checkpoint, which is the checkpoint where the model achieved the best validation metric. For a binary output, for example, the validation metric is accuracy.
  - Visualizing: Generating additional visualizations.
  - Explaining: Generating explainability visualizations (such as feature importance and token-level explanations).
  - Ready: Ready for querying and inference.
  - Canceling: In the process of being canceled. This action is user-initiated.
  - Canceled: No further steps (training, visualizing, etc.) will continue.
  - Deploying: Currently being deployed to the serving engine specified.
  - Deployed: Successfully deployed. It will now appear under the Deployments tab on the Model Repo page.
  - Undeploying: Taking down the model deployment. After undeploying, the model will be back in the Ready state.
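The status list above can be read as a small lifecycle. The following is a minimal sketch of that reading, not Predibase's implementation: the `ModelStatus` enum, the `phase` helper, and the `SETTLES_INTO` table are hypothetical names introduced for illustration, and the grouping assumes that the training-related statuses occur in the order listed.

```python
from enum import Enum


class ModelStatus(Enum):
    """Status names copied from the list above (hypothetical enum)."""
    QUEUED = "Queued"
    PREPROCESSING = "Preprocessing"
    TRAINING = "Training"
    EVALUATING = "Evaluating"
    VISUALIZING = "Visualizing"
    EXPLAINING = "Explaining"
    READY = "Ready"
    CANCELING = "Canceling"
    CANCELED = "Canceled"
    DEPLOYING = "Deploying"
    DEPLOYED = "Deployed"
    UNDEPLOYING = "Undeploying"


# Assumption: these steps run in the listed order before a model is Ready.
TRAINING_PIPELINE = (
    ModelStatus.QUEUED,
    ModelStatus.PREPROCESSING,
    ModelStatus.TRAINING,
    ModelStatus.EVALUATING,
    ModelStatus.VISUALIZING,
    ModelStatus.EXPLAINING,
)

# Transitional statuses and the state each one settles into,
# as described in the status list.
SETTLES_INTO = {
    ModelStatus.CANCELING: ModelStatus.CANCELED,
    ModelStatus.DEPLOYING: ModelStatus.DEPLOYED,
    ModelStatus.UNDEPLOYING: ModelStatus.READY,
}


def phase(status: ModelStatus) -> str:
    """Group a status into a coarse phase for display or filtering."""
    if status in TRAINING_PIPELINE:
        return "in progress"
    if status is ModelStatus.READY:
        return "ready"
    if status in (ModelStatus.CANCELING, ModelStatus.CANCELED):
        return "cancellation"
    return "deployment"
```

For example, `phase(ModelStatus.TRAINING)` falls in the "in progress" group, while `SETTLES_INTO[ModelStatus.UNDEPLOYING]` is `ModelStatus.READY`, matching the note that an undeployed model returns to the Ready state.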
There are also a few relevant actions you can take within a model repository. At the top level, you can click:
- New Model Version from Latest (left): This will create a new version, starting from the configuration of the latest trained model in the repository.
- Upload Ludwig Model (left): This will allow you to upload a local Ludwig model to Predibase.
- Compare (right): This will allow you to compare different model versions.
For each individual model version, click the kebab icon on the far right of its row to access the following actions:
- New Model from #: Create a new version, starting from the configuration of the chosen model.
- Data Science Co-Pilot: Generate recommendations for the next set of experiments for you to try, based on your dataset and past model versions.
- Retrain: Retrain the model, with the same configuration.
- Star: Tag a model version of your choosing with a star.
- Archive: Archive, or soft-delete, model versions. These model versions will now be hidden by default but can always be brought back by toggling Hide Archived Models or unarchiving the model version.
Model Lineage
The Model Lineage tab shows both the version lineage and the version diff.
- Version Lineage shows a bird's-eye view of the models that have been trained in a repo, along with metadata such as the author and status.
- Version Diff enables you to quickly compare the underlying configuration for each model so you can understand what changed across versions and how that affects model performance.
Version Lineage
Version Diff
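To make the idea of a version diff concrete, here is a minimal sketch of comparing two nested configuration dictionaries. It is an illustration of the concept only, not how Predibase computes its diff; `diff_configs`, the `MISSING` marker, and the example configuration values are all hypothetical.

```python
from typing import Any, Dict, Tuple

# Marker used when a key exists in only one of the two configs.
MISSING = "<missing>"


def diff_configs(
    old: Dict[str, Any], new: Dict[str, Any], prefix: str = ""
) -> Dict[str, Tuple[Any, Any]]:
    """Return {dotted.path: (old_value, new_value)} for every difference."""
    changes: Dict[str, Tuple[Any, Any]] = {}
    for key in sorted(set(old) | set(new)):
        path = f"{prefix}.{key}" if prefix else str(key)
        a = old.get(key, MISSING)
        b = new.get(key, MISSING)
        if isinstance(a, dict) and isinstance(b, dict):
            # Recurse so only the leaf values that changed are reported.
            changes.update(diff_configs(a, b, path))
        elif a != b:
            changes[path] = (a, b)
    return changes


# Illustrative configs for two model versions (values are made up).
version_1 = {"trainer": {"learning_rate": 0.001, "epochs": 10}}
version_2 = {
    "trainer": {"learning_rate": 0.0005, "epochs": 10},
    "preprocessing": {"split": {"type": "random"}},
}

changes = diff_configs(version_1, version_2)
for path, (before, after) in changes.items():
    print(f"{path}: {before} -> {after}")
```

Reporting changes by dotted path keeps the output short even for large configurations, which is the property that makes a diff useful when relating a configuration change to a change in model performance.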
Deployments
The Deployments tab displays the real-time deployments for a given model repository. Each row represents a deployment and shows its name, associated model version, host serving engine, status, and URL for querying.
Settings
The Settings tab allows you to view and configure metadata for your model repository. You can also delete the model repository from this page. Admins can delete all repositories. Users may only delete the repositories they created. Deleting a repository removes it for all users, deletes all contained model versions, and cannot be undone.
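The deletion rule described above can be sketched as a single check. This is a simplified illustration under stated assumptions, not Predibase's access-control code: the `User` and `Repo` types, their fields, and `can_delete` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    is_admin: bool = False  # hypothetical flag standing in for the admin role


@dataclass(frozen=True)
class Repo:
    name: str
    created_by: str  # name of the user who created the repository


def can_delete(user: User, repo: Repo) -> bool:
    """Admins can delete any repository; other users only their own."""
    return user.is_admin or repo.created_by == user.name


# Illustrative users and repository.
admin = User("alice", is_admin=True)
owner = User("bob")
other = User("carol")
repo = Repo("Twitter Bots", created_by="bob")
```

Here `can_delete(admin, repo)` and `can_delete(owner, repo)` both hold, while `can_delete(other, repo)` does not. Because deletion is irreversible and affects all users, a check like this would typically run before any confirmation prompt is shown.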