pb.deployments.client

Use a LoRAX Client to prompt your LLM

Parameters:

   deployment_ref: str
Name of the deployment to prompt

Returns:

   LoRAX Client

Using LoRAX for inference

The LoRAX client provides several functions for inference, including streaming (see Example 3 below). See the LoRAX docs for all parameters you can configure.

Examples:

Example 1: Prompt base model

from predibase import Predibase

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")  # initialize the Predibase client
lorax_client = pb.deployments.client("mistral-7b-instruct")
print(lorax_client.generate("What is your name?").generated_text)

Example 2: Prompt a fine-tuned adapter on a serverless endpoint, limiting output with max_new_tokens

lorax_client = pb.deployments.client("mistral-7b-instruct")
print(lorax_client.generate("hello", adapter_id="news-summarizer-model/1", max_new_tokens=100).generated_text)
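
Example 3: Stream responses

A minimal streaming sketch, assuming the generate_stream method from the LoRAX Python client, which yields one response per generated token (see the LoRAX docs for the exact response schema):

lorax_client = pb.deployments.client("mistral-7b-instruct")

# Accumulate token text as it streams back, skipping special tokens
text = ""
for response in lorax_client.generate_stream("Tell me a story about a llama"):
    if not response.token.special:
        text += response.token.text
print(text)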