llm_deployment.prompt
llm_deployment.prompt(templates, options=None)
This method queries the specified deployment in your Predibase environment. You can optionally provide a Predibase Dataset to add context to the query via Retrieval Augmented Generation (RAG). You can also specify a Predibase Dataset to run the query against, in which case the LLM will generate a response for every row of the dataset (up to the specified limit).
Parameters:
templates: Union[str, List[str]]
The prompt to be passed to the specified LLM. This can be a raw string, or a list of raw strings that will be combined into a single prompt before being sent to the LLM. Passing a list is helpful for structuring few-shot examples (see the sketch after this parameter list).
options: Optional[Dict[str, float]]
Additional options to pass along with the query, including:
max_new_tokens (default: 32): The maximum number of new tokens to generate, ignoring the number of tokens in the input prompt.
temperature (default: 0.1): Temperature is used to control the randomness of predictions. A high temperature value (closer to 1) makes the output more diverse and random, while a lower temperature (closer to 0) makes the model's responses more deterministic and focused on the most likely outcome. In other words, temperature adjusts the probability distribution from which the model picks the next token.
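As a minimal sketch of how these parameters fit together, the call below passes a list of strings as templates to structure few-shot examples along with an options dict. The example strings are hypothetical, and llm_deployment is assumed to be the deployment object retrieved on the previous page:

# A sketch only: few-shot examples passed as a list of strings, plus query options.
# The example strings are made up for illustration; `llm_deployment` is assumed
# to be the deployment object obtained on the previous page.
few_shot_templates = [
    "Translate English to French.",
    "English: Hello -> French: Bonjour",
    "English: Thank you -> French: Merci",
    "English: Good night -> French:",
]
responses = llm_deployment.prompt(
    few_shot_templates,
    options={
        "max_new_tokens": 16,  # cap the length of the completion
        "temperature": 0.1,    # keep output near-deterministic
    },
)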
Returns:
A list of Predibase GeneratedResponse objects
Examples:
Simple LLM query (e.g. using llm_deployment from the previous page):
llm_deployment.prompt("What is the capital of Italy")
# [
# GeneratedResponse(
# prompt='What is the capital of Italy?',
# response='\nA chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the users questions. USER: What is the capital of Italy? ASSISTANT:\n\nA chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the users questions. USER: What is the capital of Italy? ASSISTANT:\n\nA chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the users questions. USER: What is the capital of Italy? ASSISTANT:',
# raw_prompt=None,
# sample_index=0,
# sample_data=None,
# context=None,
# model_name='llama-2-13b',
# finish_reason=None,
# generated_tokens=None
# )
# ]
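Each element of the returned list is a GeneratedResponse, so the generated text can be read from its response attribute. A small sketch, assuming responses holds the return value of the call above:

responses = llm_deployment.prompt("What is the capital of Italy?")
print(responses[0].response)  # the model's completion for the single prompt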