pb.deployments.update
Update an existing serverless deployment
Parameters:
deployment_ref: str
Name or UUID of the private serverless deployment
config: UpdateDeploymentConfig
Returns:
Deployment
Usage:
To update a deployment, call pb.deployments.update and, for the config parameter, provide an instance of UpdateDeploymentConfig with only the fields you want to change set. Any fields not set will remain unchanged.
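The "only set fields change" behavior can be pictured with a small local model. This is a sketch of the semantics only, not the SDK's implementation: ConfigSketch and apply_update are hypothetical names, and it assumes that an unset field is represented as None.

```python
from dataclasses import dataclass, fields, replace
from typing import List, Optional


@dataclass(frozen=True)
class ConfigSketch:
    """Hypothetical stand-in for a deployment config; not the Predibase class."""
    cooldown_time: Optional[int] = None
    min_replicas: Optional[int] = None
    max_replicas: Optional[int] = None
    custom_args: Optional[List[str]] = None


def apply_update(current: ConfigSketch, update: ConfigSketch) -> ConfigSketch:
    """Return a copy of `current` with every non-None field of `update` applied."""
    changes = {
        f.name: getattr(update, f.name)
        for f in fields(update)
        if getattr(update, f.name) is not None  # None is assumed to mean "not set"
    }
    return replace(current, **changes)


current = ConfigSketch(
    cooldown_time=600,
    min_replicas=1,
    max_replicas=2,
    custom_args=["--preloaded-adapter-ids", "my_adapter/1"],
)
# Only cooldown_time and min_replicas are set on the update object.
updated = apply_update(current, ConfigSketch(cooldown_time=1200, min_replicas=0))
print(updated)
```

In this model, max_replicas and custom_args keep their previous values because the update object leaves them unset, while min_replicas=0 is applied because 0 is a set value rather than None.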
Example:
Assume we have an existing deployment named my-mistral-7b. Calling
pb.deployments.get("my-mistral-7b")
might return the output below. The hf_token field is not shown in the output of pb.deployments.get, even if set, but it is persisted on the backend.
Deployment(
    name="my-mistral-7b",
    # <...>
    config=UpdateDeploymentConfig(
        custom_args=["--preloaded-adapter-ids", "my_adapter/1"], cooldown_time=600, hf_token=None, min_replicas=1, max_replicas=2, scale_up_threshold=1
    ),
)
Update the deployment configuration and provide it to pb.deployments.update.
pb.deployments.update(
    deployment_ref="my-mistral-7b",
    config=UpdateDeploymentConfig(
        cooldown_time=1200,  # Changed from 600
        custom_args=["--preloaded-adapter-ids", "my_adapter/1", "--preloaded-adapter-ids", "my_other_adapter/1"],  # Added a second adapter
        min_replicas=0,  # Changed from 1
    ),
)
Now pb.deployments.get("my-mistral-7b") will return:
Deployment(
    name="my-mistral-7b",
    # <...>
    config=UpdateDeploymentConfig(
        custom_args=["--preloaded-adapter-ids", "my_adapter/1", "--preloaded-adapter-ids", "my_other_adapter/1"], cooldown_time=1200, hf_token=None, min_replicas=0, max_replicas=2, scale_up_threshold=1
    ),
)
Note that max_replicas remains at the non-default value of 2, even though it was not explicitly set in the call to pb.deployments.update.
- Updating a deployment will not cause any downtime. The existing deployment will continue to serve requests while the new configuration is applied.
- Many advanced configuration parameters are set through the custom_args field. See the custom_args section of DeploymentConfig for more information.
- To update LoRAX to the latest supported version, specify lorax_image_tag="<current>" in the config parameter.
- Not all LoRAX CLI arguments are supported. Passing an unsupported argument will result in an error.
- The SDK and backend do not validate that the values of custom_args are valid LoRAX parameters. Passing an invalid value will result in LoRAX failing to start the deployment. (However, the existing deployment will continue to serve.) custom_args is intended as a break-glass feature for advanced users who need to pass additional parameters to LoRAX.
- If you provide custom_args in the config parameter, it will replace the existing custom_args list. If you want to add to the existing list, you must include all the existing values in the new list.
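Because custom_args is replaced as a whole, one way to add an argument is to build the full list locally before calling pb.deployments.update. The sketch below uses a hypothetical helper, extend_custom_args, which is not part of the SDK. Reading the current list from the config of the object returned by pb.deployments.get is an assumption, based on the example output shown earlier.

```python
from typing import List, Sequence


def extend_custom_args(existing: Sequence[str], extra: Sequence[str]) -> List[str]:
    """Return a new list holding all existing args followed by the extra ones."""
    return [*existing, *extra]


# Assumption: in real use this list would come from the current deployment,
# e.g. the custom_args shown in the output of pb.deployments.get(...).
existing_args = ["--preloaded-adapter-ids", "my_adapter/1"]

new_args = extend_custom_args(
    existing_args,
    ["--preloaded-adapter-ids", "my_other_adapter/1"],
)
print(new_args)

# new_args would then be passed as custom_args in UpdateDeploymentConfig.
```

The helper returns a new list rather than mutating its input, so the original values remain available if the update needs to be reverted.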