
Shared Serverless Endpoints

info

Shared serverless deployments offer free inference and are intended for experimentation and fast iteration. They run on shared infrastructure and are subject to capacity constraints.

For production-ready inference, use private serverless deployments.

Prompt Base Models

Predibase supports a variety of base models as shared serverless deployments.

When prompting through the SDK or REST API, include the model-specific instruction template; without it, output quality is often noticeably worse. Prompts sent from the UI include these templates by default.

from predibase import Predibase
pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")  # replace with your API token

# Specify the shared serverless deployment by name
lorax_client = pb.deployments.client("mistral-7b-instruct-v0-2")
# Mistral Instruct expects the whole instruction, including any system-style guidance, inside [INST] ... [/INST]
prompt = """[INST] You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.

What is the best pizza restaurant in New York? [/INST]"""
print(lorax_client.generate(prompt, max_new_tokens=100).generated_text)
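
The instruction template can also be applied by a small helper so that every SDK or REST call wraps prompts the same way. The sketch below assumes the commonly documented Mistral Instruct format, `[INST] ... [/INST]`, with optional system-style guidance folded into the instruction block. `format_mistral_prompt` is a hypothetical helper name, not part of the Predibase SDK, and other base models use different templates, so check the model card before reusing it.

```python
from typing import Optional


def format_mistral_prompt(user_message: str, system_message: Optional[str] = None) -> str:
    """Wrap a message in the Mistral Instruct [INST] ... [/INST] template.

    Hypothetical helper for illustration; not part of the Predibase SDK.
    """
    user_message = user_message.strip()
    if not user_message:
        raise ValueError("user_message must not be empty")
    if system_message and system_message.strip():
        # Assumption: this template has no separate system slot, so the
        # guidance is prepended to the instruction itself.
        body = f"{system_message.strip()}\n\n{user_message}"
    else:
        body = user_message
    return f"[INST] {body} [/INST]"


prompt = format_mistral_prompt("What is the best pizza restaurant in New York?")
print(prompt)  # [INST] What is the best pizza restaurant in New York? [/INST]
```

The resulting string can be passed as the first argument to `lorax_client.generate(...)`.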

Prompt Fine-tuned Models (with LoRAX)

If the base model is one of the available shared serverless endpoints, you can prompt your fine-tuned model immediately after training. If your base model is a custom model, you will need to use a private serverless deployment to prompt your fine-tuned model.

# Specify the shared serverless deployment of the base model that was fine-tuned
# (`pb` is an authenticated Predibase client, e.g. pb = Predibase(api_token="<PREDIBASE_API_TOKEN>"))
lorax_client = pb.deployments.client("mistral-7b-instruct-v0-2")

# Mistral Instruct expects the whole instruction, including any system-style guidance, inside [INST] ... [/INST]
prompt = """[INST] You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.

What is the best pizza restaurant in New York? [/INST]"""

# Specify your adapter_id as "adapter-repo-name/adapter-version-number"
print(lorax_client.generate(prompt, adapter_id="adapter-repo-name/1", max_new_tokens=100).generated_text)
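
Because the `adapter_id` is a plain string in the form "adapter-repo-name/adapter-version-number", it can be convenient to build and validate it in one place. The following is a minimal sketch: `build_adapter_id` is a hypothetical helper, not part of the Predibase SDK, and it assumes that version numbers are positive integers and that repository names contain no `/`.

```python
def build_adapter_id(repo_name: str, version: int) -> str:
    """Return an adapter ID in the "adapter-repo-name/adapter-version-number" form.

    Hypothetical helper for illustration; not part of the Predibase SDK.
    """
    repo_name = repo_name.strip()
    # Assumption: repository names are non-empty and contain no "/".
    if not repo_name or "/" in repo_name:
        raise ValueError("repo_name must be a non-empty name without '/'")
    # Assumption: adapter versions are positive integers (bool is rejected explicitly).
    if isinstance(version, bool) or not isinstance(version, int) or version < 1:
        raise ValueError("version must be a positive integer")
    return f"{repo_name}/{version}"


adapter_id = build_adapter_id("adapter-repo-name", 1)
print(adapter_id)  # adapter-repo-name/1
```

The returned value can be passed as the `adapter_id` argument of `lorax_client.generate(...)`.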