pb.deployments.client
Use a LoRAX Client to prompt your LLM
Parameters:
deployment_ref: str
Name of the deployment to prompt
Returns:
A LoRAX Client for prompting the specified deployment
Using LoRAX for inference
The LoRAX client provides several functions for inference, including streaming. See the LoRAX docs for the full set of parameters you can configure.
Examples:
Example 1: Prompt the base model
lorax_client = pb.deployments.client("mistral-7b-instruct")
print(lorax_client.generate("What is your name?").generated_text)
Example 2: Prompt a fine-tuned adapter with max_new_tokens
lorax_client = pb.deployments.client("mistral-7b-instruct")
print(lorax_client.generate("hello", adapter_id="news-summarizer-model/1", max_new_tokens=100).generated_text)
Example 3: Prompt a specific checkpoint from an adapter version
lorax_client = pb.deployments.client("mistral-7b-instruct")
# Prompts using the 7th checkpoint of adapter version `news-summarizer-model/1`.
print(lorax_client.generate("hello", adapter_id="news-summarizer-model/1@7", max_new_tokens=100).generated_text)
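The streaming support mentioned above can be sketched as follows. This assumes the same `mistral-7b-instruct` deployment used in the examples, and uses the LoRAX client's `generate_stream` method, which yields one response per generated token:

```python
# Sketch: stream tokens from the base model as they are generated.
# Assumes a live deployment named "mistral-7b-instruct".
lorax_client = pb.deployments.client("mistral-7b-instruct")

# generate_stream yields a response per token; skip special tokens
# (e.g. end-of-sequence) and print text incrementally.
for response in lorax_client.generate_stream(
    "What is your name?", max_new_tokens=50
):
    if not response.token.special:
        print(response.token.text, end="", flush=True)
```

As with `generate`, you can pass `adapter_id` to stream from a fine-tuned adapter instead of the base model.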