UpdateDeploymentConfig

Below is the class definition for UpdateDeploymentConfig. It inherits from Pydantic's BaseModel.

from pydantic import BaseModel, Field, NonNegativeInt, PositiveInt


class UpdateDeploymentConfig(BaseModel):
    base_model: str | None = Field(default=None)
    cooldown_time: PositiveInt | None = Field(default=None)
    custom_args: list[str] | None = Field(default=None)
    hf_token: str | None = Field(default=None)
    min_replicas: NonNegativeInt | None = Field(default=None)
    max_replicas: PositiveInt | None = Field(default=None)
    scale_up_threshold: PositiveInt | None = Field(default=None)
    max_total_tokens: int | None = Field(default=None)
    lorax_image_tag: str | None = Field(default=None)
    request_logging_enabled: bool | None = Field(default=None)
    direct_ingress: bool | None = Field(default=None)
    preloaded_adapters: list[str] | None = Field(default=None)
    speculator: str | None = Field(default=None)
    prefix_caching: bool | None = Field(default=None)
    disable_adapters: bool | None = Field(default=None)

These fields have the same meaning as the corresponding fields in DeploymentConfig.

Many advanced configuration parameters are set through the custom_args field. See the custom_args section of DeploymentConfig for more information.

The default value of all of these fields is None. When updating a deployment, any fields set to None will be left at their currently-deployed value. Non-None fields will be updated to the value specified in the field. See Update Deployment for more information.
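The "None means unchanged" rule can be illustrated with a small sketch. This is not the SDK's implementation, only a model of the merge behavior described above; plain dicts stand in for the config objects, and the `apply_update` helper and the example values are hypothetical.

```python
from typing import Any


def apply_update(current: dict[str, Any], update: dict[str, Any]) -> dict[str, Any]:
    """Return a new config in which only the non-None fields of `update` override `current`."""
    merged = dict(current)  # copy so the deployed values are not mutated
    for name, value in update.items():
        if value is not None:  # None means "leave the deployed value alone"
            merged[name] = value
    return merged


# Hypothetical currently-deployed values.
deployed = {"min_replicas": 0, "max_replicas": 1, "cooldown_time": 3600}

# Only max_replicas is set; the other fields keep their default of None.
update = {"min_replicas": None, "max_replicas": 2, "cooldown_time": None}

result = apply_update(deployed, update)
print(result)
```

The sketch tests `is not None` rather than truthiness, so under the stated rule values such as 0 or False still count as updates.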

NOTES
  • To update LoRAX to the latest supported version, set lorax_image_tag to <current>.
  • When updating to a new custom_args value, the full value of the new custom_args field must be provided. E.g. if the existing value is ['foo'] and you want it to be ['foo', 'bar'], you must provide ['foo', 'bar'] as the new value.
  • A deployment's base model can be updated via the SDK. Note that a base model with significantly different properties (e.g. model size) may not be servable with the same deployment configuration, so only update with a base model that is similar to the existing one. (If the new model is not servable with the existing configuration, the deployment will continue to serve with the previous version.)
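Because custom_args is replaced wholesale rather than appended to, one way to avoid dropping existing arguments is to build the full list before sending the update. The following is a minimal sketch of that idea; the `extended_custom_args` helper is hypothetical and not part of the SDK, and 'foo' and 'bar' are the placeholder values from the note above.

```python
def extended_custom_args(existing: list[str], extra: list[str]) -> list[str]:
    """Build the full list to send: existing args first, then any new ones not already present."""
    return existing + [arg for arg in extra if arg not in existing]


existing = ["foo"]                      # value currently on the deployment
full_value = extended_custom_args(existing, ["bar"])
print(full_value)                       # ['foo', 'bar']
```

The resulting list is what would be passed as the new custom_args value. Passing only ['bar'] would, per the note above, replace the existing value rather than extend it.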