Deploy a Pretrained LLM

Predibase supports deploying any pretrained Large Language Model hosted on the HuggingFace Hub to a hosted endpoint for real-time inference.

Info: Only VPC users with the Admin role can deploy a pretrained LLM. Predibase Cloud users have access to shared deployments and do not need to manage any deployments themselves.

Deploying via SDK

LLM deployments are currently managed through the Python SDK or the command-line interface (CLI). A UI version is in the works and will be available soon.

Prerequisites:

  1. Install the Predibase Python SDK.

  2. Select a pretrained text generation model from the HuggingFace Hub.

To deploy an LLM to a hosted endpoint:

from predibase import PredibaseClient

pc = PredibaseClient()
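# Reference a pretrained model on the HuggingFace Hub by its hf:// URI.
llm = pc.LLM("hf://meta-llama/Llama-2-7b-chat-hf")
# Deploy the model to a hosted endpoint named "my-first-llm".
deployment = llm.deploy("my-first-llm").get()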

Testing the deployment

While LLMs can be queried via the UI, the SDK can be used to query the deployment programmatically and verify that it is up and ready for use.

from predibase import PredibaseClient

pc = PredibaseClient()
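# Look up the existing deployment by its pb://deployments/ URI.
deployment = pc.LLM("pb://deployments/my-first-llm")
# Send a test prompt to check that the deployment is serving requests.
result = deployment.prompt("What is the capital of Italy?")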
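A readiness check can be wrapped in a small retry loop. The sketch below is generic Python, not part of the Predibase SDK: `wait_until_ready` is a hypothetical helper, and it assumes that querying a deployment that is not yet ready raises an exception. If `.get()` already blocks until the deployment is up, a loop like this may be unnecessary.

```python
import time
from typing import Callable, Optional


def wait_until_ready(
    prompt_fn: Callable[[str], object],
    probe: str = "What is the capital of Italy?",
    attempts: int = 10,
    delay_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> object:
    """Call prompt_fn(probe) until it succeeds or the attempts run out.

    Hypothetical helper: assumes a failed query raises an exception.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        try:
            # The first successful response is returned to the caller.
            return prompt_fn(probe)
        except Exception as exc:
            last_error = exc
            # Wait before retrying, but not after the final attempt.
            if attempt < attempts:
                sleep(delay_seconds)
    raise TimeoutError(
        f"deployment not ready after {attempts} attempts"
    ) from last_error
```

With the objects from the snippet above, this could be called as `wait_until_ready(deployment.prompt)`.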

Selecting an Engine Template

By default, your LLM is deployed on an engine with a single NVIDIA A10G GPU. This is sufficient for serving most LLMs under 10 billion parameters. For larger models, select a larger engine template.

To deploy an LLM using a specific engine template:

from predibase import PredibaseClient

pc = PredibaseClient()
llm = pc.LLM("hf://meta-llama/Llama-2-7b-chat-hf")
# Pass engine_template to override the default engine.
deployment = llm.deploy("my-first-llm", engine_template="llm-gpu-large").get()
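The sizing guidance in this section can be expressed as a simple rule of thumb. The sketch below is illustrative only: `pick_engine_template` is a hypothetical helper, the 10 billion parameter cutoff is taken from the guidance above and treated as a hard threshold (the text only says "most" models), and it assumes `llm-gpu-small` corresponds to the default single-GPU engine.

```python
# Cutoff taken from the guidance above; treating it as a hard
# threshold is an assumption made for this sketch.
SMALL_MODEL_PARAM_LIMIT = 10_000_000_000


def pick_engine_template(num_parameters: int) -> str:
    """Return an engine template name for a model of the given size.

    Hypothetical helper, not part of the Predibase SDK.
    """
    if num_parameters <= 0:
        raise ValueError("num_parameters must be positive")
    if num_parameters < SMALL_MODEL_PARAM_LIMIT:
        return "llm-gpu-small"
    return "llm-gpu-large"
```

The returned name could then be passed as the `engine_template` argument of `llm.deploy(...)`. Memory needs also depend on precision and context length, so the parameter count alone is a rough guide.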

Available Engine Templates

| Engine Template | GPUs | GPU SKU | vCPUs | RAM | Disk |
|---|---|---|---|---|---|
| llm-gpu-small | 1 | A10G | 7810m | 29217Mi | 100Gi |
| llm-gpu-large | 4 | A10G | 47710m | 173300Mi | 400Gi |
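The vCPU and RAM values in the table appear to use Kubernetes-style resource quantities, where an `m` suffix means thousandths of a CPU core and `Mi` means mebibytes. That interpretation is an assumption, not something the table states. Under it, the values convert to more familiar units as sketched below; the helper names are hypothetical.

```python
# Values copied from the engine template table. Reading "m" as
# millicores and "Mi" as mebibytes is an assumption.
ENGINE_TEMPLATES = {
    "llm-gpu-small": {"vcpus": "7810m", "ram": "29217Mi"},
    "llm-gpu-large": {"vcpus": "47710m", "ram": "173300Mi"},
}


def millicores_to_cores(quantity: str) -> float:
    """Convert a quantity such as '7810m' to a number of CPU cores."""
    if not quantity.endswith("m"):
        raise ValueError(f"expected a millicore quantity, got {quantity!r}")
    return int(quantity[:-1]) / 1000


def mebibytes_to_gibibytes(quantity: str) -> float:
    """Convert a quantity such as '29217Mi' to gibibytes."""
    if not quantity.endswith("Mi"):
        raise ValueError(f"expected a mebibyte quantity, got {quantity!r}")
    return int(quantity[:-2]) / 1024


for name, spec in ENGINE_TEMPLATES.items():
    cores = millicores_to_cores(spec["vcpus"])
    gib = mebibytes_to_gibibytes(spec["ram"])
    print(f"{name}: {cores:.2f} vCPUs, {gib:.1f} GiB RAM")
```

Under this reading, `llm-gpu-small` provides about 7.81 vCPUs and 28.5 GiB of RAM, and `llm-gpu-large` about 47.71 vCPUs and 169.2 GiB.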

Delete a Deployment

Deployments can be deleted via the SDK to free up compute resources:

from predibase import PredibaseClient

pc = PredibaseClient()
# Delete the deployment named "my-first-llm" to release its resources.
pc.LLM("pb://deployments/my-first-llm").delete()
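For short-lived experiments, it can help to guarantee that the deployment is deleted even if a test fails partway through. The sketch below uses a plain Python context manager for this; `temporary_deployment` is a hypothetical helper, not an SDK feature, and it takes the create and delete steps as callables so that it makes no assumptions about the SDK beyond the calls shown above.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


@contextmanager
def temporary_deployment(
    create: Callable[[], T],
    delete: Callable[[T], None],
) -> Iterator[T]:
    """Create a deployment, yield it, and always delete it afterwards.

    Hypothetical helper: `create` and `delete` are supplied by the caller.
    """
    deployment = create()
    try:
        yield deployment
    finally:
        # Runs on normal exit and when the body raises.
        delete(deployment)


# Possible usage with the calls from this page (left commented out
# because it needs a configured Predibase client):
#
# with temporary_deployment(
#     lambda: llm.deploy("my-first-llm").get(),
#     lambda _: pc.LLM("pb://deployments/my-first-llm").delete(),
# ) as deployment:
#     deployment.prompt("What is the capital of Italy?")
```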