Fine-Tuning
Fine-tuning a large language model refers to further training the pre-trained model on a specific task or domain using a smaller dataset. The initial pre-training phase trains the model on a massive corpus of text data to learn general language patterns and representations. Fine-tuning, on the other hand, customizes the model to a specific task or domain by exposing it to task-specific data. By fine-tuning a large language model on a specific task, you leverage the model's pre-trained knowledge while tailoring it to the nuances and requirements of your target task. This typically allows the model to perform better and achieve higher accuracy on that task than the pre-trained model would on its own.
The fine-tuning process typically involves the following steps:
- Task Definition: Specify the task or downstream application you want the model to perform, such as text classification, language generation, or question answering. This is typically done by curating your dataset so that the output feature is representative of the task type.
- Data Preparation: Gather or create a labeled dataset specific to your task. This dataset should be representative of the target domain or task you want the model to excel at.
- Select LLM: Pick the base LLM that you would like to fine-tune on your dataset. Hugging Face lists a large variety of large language models you can choose from.
- Train your model: Fine-tune the model on your task-specific dataset by exposing it to the labeled examples. During this process, the model's parameters are updated using gradient-based optimization techniques (e.g., backpropagation) to minimize the task-specific loss function. There are two ways to fine-tune: adjusting all of the model's weights, or adding new task-specific layers that are learned during training (see the sketch after this list).
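To make the training step concrete, here is a minimal sketch of a full-weight fine-tuning loop using Hugging Face transformers and PyTorch. The base model (`gpt2`) and the labeled examples are illustrative placeholders, not anything Predibase-specific:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder base LLM; substitute the model you selected
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical task-specific examples (a sentiment task phrased as text).
examples = [
    "Review: I love this movie! Sentiment: positive",
    "Review: The customer service was terrible. Sentiment: negative",
]
batch = tokenizer(examples, return_tensors="pt", padding=True)
labels = batch["input_ids"].clone()
labels[batch["attention_mask"] == 0] = -100  # exclude padding from the loss

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for step in range(3):  # a few illustrative gradient steps
    loss = model(**batch, labels=labels).loss  # cross-entropy next-token loss
    loss.backward()                            # backpropagation
    optimizer.step()
    optimizer.zero_grad()
    print(f"step {step}: loss={loss.item():.3f}")
```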
Zero/Few Shot vs Fine-tuning
When you're deciding between zero/few-shot inference and fine-tuning your model, you can use the flow chart below to determine whether fine-tuning makes sense for you.
Here's when to use each approach:
- Zero-shot evaluation: This approach is useful when you have a pre-trained model that has been trained on a wide range of tasks and you want to make predictions on a new task for which you don't have any labeled data. Zero-shot evaluation lets you leverage the general knowledge and understanding of the pre-trained model without any specific fine-tuning: you provide the model with a description or a prompt of the task, and the model generates responses based on its pre-existing knowledge.
- Few-shot evaluation: Few-shot learning is used when you have a limited amount of labeled data for a new task. You provide the model with a small number of examples (the "few shots") as part of the prompt, giving the model additional context it can use to generate a response.
- Fine-tuning: Fine-tuning is used when you have a large amount of labeled data for a specific task or domain. In this case, you can take a pre-trained model and further train it on your task-specific data to optimize its performance. Fine-tuning allows the model to learn task-specific features and adapt to the specific data distribution of the task at hand. It is typically more resource-intensive and time-consuming compared to zero-shot or few-shot evaluation, but it can lead to better performance when sufficient labeled data is available.
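For concreteness, the sketch below contrasts a zero-shot prompt with a few-shot prompt for a hypothetical sentiment task; the wording of the prompts is illustrative:

```python
# Zero-shot: the task description alone, relying on pre-trained knowledge.
zero_shot_prompt = (
    "Classify the sentiment of the review as positive or negative.\n"
    "Review: The food was cold and the staff were rude.\n"
    "Sentiment:"
)

# Few-shot: a handful of labeled examples prepended as context.
few_shot_prompt = (
    "Classify the sentiment of the review as positive or negative.\n"
    "Review: I love this movie!\nSentiment: positive\n"
    "Review: The customer service was terrible.\nSentiment: negative\n"
    "Review: The food was cold and the staff were rude.\n"
    "Sentiment:"
)
```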
Fine-tuning Readiness: What should your dataset look like?
For fine-tuning a large language model, your data should ideally be representative of the specific task or domain you want to improve the model's performance on. Here are some considerations for the data:
- Task-specific data: The data should include examples that are relevant to the task you want the model to excel in. For example, if you want to fine-tune the model for sentiment analysis, you would need labeled data consisting of text samples along with their corresponding sentiment labels (positive, negative, neutral).
- Sufficient data volume: Fine-tuning generally requires a substantial amount of labeled data to achieve good performance. The more diverse and comprehensive your data is, the better the model can learn and generalize from it. Ideally, you should aim to have at least a few hundred labeled examples, although the exact amount may depend on the complexity of the task and the specific model architecture.
- Data format: In Predibase, we support text input features and either text or category (which can also be used for binary classification tasks) as output feature types.
Text Input, Text Output Example Data
| text_input | text_output |
|---|---|
| "I want a recipe for spaghetti Bolognese." | { "pasta": "200g", "ground beef": "250g", "onion": "1", "garlic cloves": "2", "canned tomatoes": "400g", "tomato paste": "2 tablespoons", "olive oil": "2 tablespoons", "dried oregano": "1 teaspoon", "dried basil": "1 teaspoon", "salt": "to taste", "black pepper": "to taste", "Parmesan cheese": "grated, for garnish" } |
| "Can you give me the ingredients for a chocolate cake?" | { "flour": "250g", "sugar": "200g", "cocoa powder": "50g", "baking powder": "2 teaspoons", "baking soda": "1 teaspoon", "salt": "1/2 teaspoon", "eggs": "2", "milk": "250ml", "vegetable oil": "125ml", "vanilla extract": "2 teaspoons", "boiling water": "250ml", "powdered sugar": "for dusting" } |
| "What are the ingredients for a chicken stir-fry?" | { "chicken breast": "400g", "vegetable oil": "2 tablespoons", "garlic cloves": "2", "ginger": "1 tablespoon", "onion": "1", "bell pepper": "1", "carrot": "1", "broccoli florets": "1 cup", "soy sauce": "2 tablespoons", "oyster sauce": "1 tablespoon", "sesame oil": "1 teaspoon", "cornstarch": "1 tablespoon", "salt": "to taste", "black pepper": "to taste" } |
| "I need the ingredients for a Greek salad." | { "lettuce": "1 head", "cucumber": "1", "tomatoes": "2", "red onion": "1/4", "kalamata olives": "1/2 cup", "feta cheese": "100g", "extra-virgin olive oil": "2 tablespoons", "lemon juice": "1 tablespoon", "dried oregano": "1 teaspoon", "salt": "to taste", "black pepper": "to taste" } |
| "Give me the ingredients for a vegetarian curry." | { "potatoes": "2", "carrots": "2", "cauliflower": "1/2 head", "green beans": "100g", "onion": "1", "garlic cloves": "2", "ginger": "1 tablespoon", "canned tomatoes": "400g", "coconut milk": "400ml", "curry powder": "2 tablespoons", "turmeric": "1 teaspoon", "cumin": "1 teaspoon", "coriander": "1 teaspoon", "chili powder": "1/2 teaspoon", "salt": "to taste", "vegetable oil": "2 tablespoons" } |
Text Input, Category Output Example
| text_input | category_label |
|---|---|
| "I love this movie!" | "positive" |
| "This restaurant has amazing food." | "positive" |
| "The customer service was terrible." | "negative" |
| "I'm not sure if I like it." | "neutral" |
| "The weather is perfect today." | "positive" |
Text Input, Binary Output Example
| text_input | binary_label |
|---|---|
| "I love this movie!" | 1 |
| "This restaurant has amazing food." | 1 |
| "The customer service was terrible." | 0 |
| "I'm not sure if I like it." | 0 |
| "The weather is perfect today." | 1 |
How to fine-tune your LLMs in Predibase
To train a new model through the UI, first click `+ New Model Repository`. You'll be prompted to specify a name for the model repository and optionally provide a description. We recommend using descriptive names that reflect the ML use case you're trying to solve.
Once you've entered this information, you'll be taken directly to the Model Builder to train your first model.
Model Builder Page 1
Train a single model version. You can start with our defaults to establish a baseline, or customize any parameter in the model pipeline, from preprocessing to training parameters.
Next you'll be asked to provide a model version description, connection (the system you want to pull data in from), dataset, and target (the output feature or column you want to predict). At the moment, Predibase only supports one text input feature and one output feature (either text or category) when fine-tuning large language models. Support for multiple input and output features for large language models is on our roadmap and will be in the platform soon.
Next, select the Large Language Model model type to fine-tune your LLM.
At the bottom of the model builder page, you will see a summary view of the features in your dataset. If your dataset has more than one text feature, disable the additional features using the `Active` toggle on the right side.
Model Builder Page 2
Now we'll cover the configuration options available after you click `Next` and land on the second page of the Model Builder.
Model Graph
Also known as the butterfly diagram, the model graph shows a visual diagram of the training/evaluation pipeline. Each individual component is viewable and modifiable, providing a truly "glass-box" approach to ML development.
Parameters
This section contains parameters that users may consider modifying while tweaking models.
Model Name:
To fine-tune an LLM for your task, you first need to choose which LLM you want to use. This can be configured using the `Model Name` parameter within the Large Language Model parameter sub-section. Hugging Face lists a large variety of large language models you can choose from. Make sure you provide the full model name by copying the value from the model card.
Typically, good choices here are base large language models, though instruction-tuned models can also be used. Instruction-tuned models are language models that have been fine-tuned with explicit instructions or prompts during the training process. These instructions give the model specific guidance on how to approach a task or generate desired outputs, so it can respond to instructions when prompted. Starting from an instruction-tuned model is typically useful when you have a small dataset for fine-tuning the large language model on your data.
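If you're unsure of a model's exact identifier, one way to look up full model names programmatically is the `huggingface_hub` client, sketched below; the search term is an example, and copying the name from the model card in the browser works just as well:

```python
from huggingface_hub import list_models

# Search the Hub for candidate base models, sorted by download count.
for model in list_models(search="llama", sort="downloads", direction=-1, limit=5):
    print(model.id)  # the full model name to paste into `Model Name`
```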
Task and Template
The next step after picking your model is to configure your prompt through the Task and Template parameters. These parameters define the task you want the LLM to perform. Predibase uses default templates if you don't provide one.
For example, if you are fine-tuning one of the LLaMA models from Meta AI on Stanford's Alpaca dataset (a dataset for instruction tuning), you may define your task and template in the following way:
Task:

```
Write a response that appropriately completes the request.
```

Template:

```
Below is an instruction that describes a task. {task}
### Instruction: {sample_input}
### Response:
```

`{sample_input}` inserts your input text feature into the template, while `{task}` inserts the task you defined into the template.
If a row of data in your dataset had the value `Generate a poem with 10 lines.`, then your data would be transformed to the following after preprocessing:
```
Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction: Generate a poem with 10 lines.
### Response:
```
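The substitution behaves like ordinary string formatting; this sketch reproduces the preprocessing result above in plain Python:

```python
template = (
    "Below is an instruction that describes a task. {task}\n"
    "### Instruction: {sample_input}\n"
    "### Response:"
)
task = "Write a response that appropriately completes the request."
prompt = template.format(task=task, sample_input="Generate a poem with 10 lines.")
print(prompt)
```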
Predibase also lets you control other parameters for preprocessing such as the sequence length, lowercasing, etc. through Feature Type Defaults.
Generation:
The generation config refers to the set of configuration parameters that control the behavior of text generation models. These parameters specify various aspects of the generation process, including length restrictions, temperature, repetition penalty, and more. These are useful when evaluating your fine-tuned model on your validation and test sets.
Here are some common configuration parameters that can be specified in the generation config. You can control as many as you like and leave the default values we've set for the rest:
- Max Length: Sets the maximum number of tokens for the generated output, ensuring it doesn't exceed a certain length.
- Min Length: Specifies the minimum length of the generated output, ensuring it meets a certain threshold.
- Temperature: Controls the randomness of the generated text. Higher temperature values (e.g., 1.0) result in more diverse and creative outputs, while lower values (e.g., 0.2) lead to more deterministic and focused responses.
- Repetition Penalty: Discourages the model from repeating the same tokens in the generated text. Higher penalty values (e.g., 2.0) make the model more cautious about generating repetitive content.
- Num Beams: Specifies the number of beams used in beam search decoding. Increasing the number of beams lets the model explore more candidate sequences, which can improve output quality but also increases computation time.
By adjusting these generation configuration parameters, you can control the trade-off between creativity and coherence in the generated text, prevent common issues like repetition, and tailor the model's behavior according to the specific requirements of your application.
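As a point of reference, the analogous knobs in the open-source Hugging Face transformers library live in `GenerationConfig`; the values below are illustrative, not Predibase defaults:

```python
from transformers import GenerationConfig

generation_config = GenerationConfig(
    max_length=256,          # cap on the total number of generated tokens
    min_length=8,            # floor on output length
    temperature=0.7,         # applies when sampling; lower = more deterministic
    repetition_penalty=1.2,  # values > 1.0 discourage repeated tokens
    num_beams=4,             # candidate sequences kept during beam search
)
```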
Adapter:
When fine-tuning a model, you generally have two options for updating the weights:
- Fine-tuning all the weights: This approach involves updating all the weights of your pre-trained LLM during the fine-tuning process. Initially, the base model is trained on a large corpus of text for natural language processing. Then, for a specific task or domain, you continue training the model on a smaller dataset that is more relevant to the target task. During this fine-tuning phase, all the weights of the model are updated using the new dataset. This process allows the model to adapt to the specific characteristics and nuances of the target task.
- Using adapters for parameter-efficient fine-tuning: Adapters provide an alternative approach to fine-tuning that aims to minimize the number of trainable parameters and reduce the risk of catastrophic forgetting. Instead of updating all the weights of the model, adapters introduce a new, smaller set of task-specific parameters that are added to the pre-trained model. These task-specific adapters are attached to the pre-trained model's intermediate layers, allowing for fine-tuning on a specific task while keeping the majority of the original model's parameters frozen (i.e., unchanged). The adapter architecture is typically lightweight and modular, making it easier to add or remove adapters for different tasks without significant modifications to the pre-trained model's architecture.
The key advantage of using adapters is parameter efficiency. By introducing task-specific adapters, you can fine-tune the model for multiple tasks without modifying the original weights significantly. This approach reduces the computational cost and memory requirements compared to fine-tuning all the weights, especially when dealing with large-scale models. Additionally, using adapters helps mitigate the risk of catastrophic forgetting, as the original model's parameters remain mostly unchanged.
The choice between fine-tuning all the weights and using adapters depends on the specific requirements of your task, available computational resources, and the trade-off between model complexity and performance. Fine-tuning all the weights is more flexible and can potentially achieve better performance if you have sufficient resources, while using adapters provides a more parameter-efficient approach that is particularly beneficial when dealing with resource constraints.
If you want to fine-tune all the weights of the model, you can skip setting an adapter by setting the adapter type to `None`.
If you want to use an adapter for parameter efficient fine-tuning, Predibase supports a wide variety of strategies with even more options coming in the near future. Currently, Predibase supports the following adapter types:
- LoRA: This strategy adds pairs of rank-decomposition weight matrices (called update matrices) to existing weights, and only trains those newly added weights. These rank-decomposition weight matrices are much smaller in size than the original weight matrices. LoRA matrices are generally added to the attention layers of the original model.
- AdaLoRA: AdaLoRA builds upon the LoRA approach by adaptively allocating the parameter budget across layers based on their importance for the task: weight matrices that matter more for the task receive higher-rank updates, while less important ones are pruned. This adaptivity allows AdaLoRA to allocate resources more efficiently and potentially achieve better performance with the same parameter budget.
- Adaption Prompt: This method prepends tunable prompt tensors to the embedded inputs.
You can choose to configure any of the parameters for these strategies, or use the default values we've already defined for you.
We suggest using a parameter-efficient fine-tuning technique, like LoRA, for your first experiments fine-tuning for a task.
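Outside of Predibase, the same idea can be sketched with the open-source `peft` library; the rank, alpha, and target modules below are common but illustrative choices:

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder model

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                        # rank of the update matrices
    lora_alpha=16,              # scaling factor for the LoRA updates
    lora_dropout=0.05,
    target_modules=["c_attn"],  # gpt2's attention projection layer
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the small LoRA weights are trainable
```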
Trainer:
Since we want to fine-tune our base LLM on our data, we select the `finetune` trainer from the dropdown.
After this, you can configure a variety of training related parameters, such as learning rate, batch size, etc.
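For a rough idea of what these knobs look like in code, here is a sketch using transformers' `TrainingArguments`; the values are illustrative, not Predibase defaults:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="finetune-output",   # illustrative path for checkpoints
    learning_rate=2e-5,             # step size for gradient updates
    per_device_train_batch_size=8,  # examples per gradient step per device
    num_train_epochs=3,             # passes over the fine-tuning dataset
)
```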
That's it! You can hit `Train` to start fine-tuning on your dataset!
Metrics
When you fine-tune your large language model for a text to text generation task, you will see the following metrics:
- Loss: The loss measures how well the generated text matches the desired target text, quantifying the discrepancy between the generated output and the ground truth. It is computed as the cross-entropy between the predicted output and the ground-truth output.
- Perplexity: Perplexity measures how well a language model predicts a given sequence of tokens; it gives a sense of how surprised or confused the model is when it encounters new text. It is calculated from the likelihood of the target sequence under the model's learned probability distribution, and equals the exponential of the cross-entropy loss (see the sketch after this list). Lower perplexity values indicate that the model is more certain and accurate in predicting the next token in a sequence.
- Token Accuracy: Token accuracy measures the percentage of correctly predicted tokens in the generated text compared to the ground truth. It calculates the ratio of correctly generated tokens to the total number of tokens. Token accuracy provides an understanding of how well the model generates individual tokens, irrespective of their position in the sequence.
- Sequence Accuracy: Sequence accuracy evaluates the correctness of the entire generated sequence compared to the ground truth. It measures the percentage of generated sequences that exactly match the desired target sequences. Sequence accuracy is particularly relevant when generating sequences that require strict adherence to a desired output format or structure.
- Character Error Rate: Character Error Rate (CER) is a metric commonly used to evaluate the quality of text generation models, such as speech recognition systems or optical character recognition (OCR) systems. It measures the percentage of incorrectly predicted characters compared to the ground truth. CER is calculated by aligning the predicted and ground truth sequences and counting the number of substitutions, insertions, and deletions required to transform one sequence into the other.
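The sketch below, referenced in the perplexity bullet, shows how loss, perplexity, token accuracy, and sequence accuracy relate, using dummy tensors in PyTorch:

```python
import math
import torch
import torch.nn.functional as F

vocab_size, batch, seq_len = 100, 4, 8
logits = torch.randn(batch, seq_len, vocab_size)          # dummy model outputs
targets = torch.randint(0, vocab_size, (batch, seq_len))  # dummy ground truth

# Loss: cross-entropy between the predicted distribution and target tokens.
loss = F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))

# Perplexity: the exponential of the cross-entropy loss.
perplexity = math.exp(loss.item())

preds = logits.argmax(dim=-1)
token_accuracy = (preds == targets).float().mean().item()  # per-token match rate
sequence_accuracy = (preds == targets).all(dim=-1).float().mean().item()  # exact-match rate
print(loss.item(), perplexity, token_accuracy, sequence_accuracy)
```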
In an ideal scenario, your fine-tuned model performs well on your validation and test sets on the metrics above. You want to ensure that your model is not overfitting on your data during training, and that it starts to converge after a few epochs of fine-tuning.