REST API
This guide shows you how to run inference via the REST API.
- You must provide an Authorization API Token as a header in the request. You can find this under Settings > My Profile > Generate API Token.
- You will also need your Predibase Tenant ID. You can find this under Settings > My Profile > Overview > Tenant ID.
Example
Once you have your Predibase API token and Tenant ID, set them as environment variables in your terminal.
export PREDIBASE_API_TOKEN="<YOUR TOKEN HERE>"
export PREDIBASE_TENANT_ID="<YOUR TENANT ID>"
Prompt Your Fine-tuned Adapter
For PREDIBASE_DEPLOYMENT, the deployment must serve the same base model that your adapter was fine-tuned from:
- For shared LLMs, use the model name, e.g. "mistral-7b-instruct".
- For private serverless deployments, use the deployment name you used to deploy, e.g. "my-dedicated-mistral-7b".
You will also need the model repo name and version number for the adapter_id.
export PREDIBASE_DEPLOYMENT="<SERVERLESS MODEL NAME>"
curl -d '{"inputs": "What is your name?", "parameters": {"api_token": "<YOUR TOKEN HERE>", "adapter_source": "pbase", "adapter_id": "<MODEL REPO NAME>/<MODEL VERSION NUMBER>", "max_new_tokens": 128}}' \
-H "Content-Type: application/json" \
-X POST https://serving.app.predibase.com/$PREDIBASE_TENANT_ID/deployments/v2/llms/$PREDIBASE_DEPLOYMENT/generate \
-H "Authorization: Bearer ${PREDIBASE_API_TOKEN}"
Prompt Base Model
For PREDIBASE_DEPLOYMENT:
- For shared LLMs, use the model name, e.g. "mistral-7b-instruct".
- For private serverless deployments, use the deployment name you used to deploy, e.g. "my-dedicated-mistral-7b".
export PREDIBASE_DEPLOYMENT="<MODEL NAME>"
curl -d '{"inputs": "What is your name?", "parameters": {"max_new_tokens": 20, "temperature": 0.1}}' \
-H "Content-Type: application/json" \
-X POST https://serving.app.predibase.com/$PREDIBASE_TENANT_ID/deployments/v2/llms/$PREDIBASE_DEPLOYMENT/generate \
-H "Authorization: Bearer ${PREDIBASE_API_TOKEN}"
Notes
- When querying fine-tuned models, include the prompt template used for fine-tuning in the inputs (see the example after these notes).
- You can also use the /generate_stream endpoint to have tokens streamed from the deployment (see the streaming sketch after these notes). The parameters follow the same format as the LoRAX generate endpoints.
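When including the prompt template, wrap the raw prompt in the same template text that was used during fine-tuning. Below is a minimal sketch, assuming purely for illustration that the adapter was fine-tuned with an [INST] ... [/INST] instruction format; substitute the exact template your adapter was trained with.

# Illustrative only: the [INST] wrapper below is an assumed template; use the exact
# template your adapter was fine-tuned with.
curl -d '{"inputs": "[INST] What is your name? [/INST]", "parameters": {"api_token": "<YOUR TOKEN HERE>", "adapter_source": "pbase", "adapter_id": "<MODEL REPO NAME>/<MODEL VERSION NUMBER>", "max_new_tokens": 128}}' \
-H "Content-Type: application/json" \
-X POST https://serving.app.predibase.com/$PREDIBASE_TENANT_ID/deployments/v2/llms/$PREDIBASE_DEPLOYMENT/generate \
-H "Authorization: Bearer ${PREDIBASE_API_TOKEN}"

For streaming, the sketch below reuses the base-model request body and assumes the URL is identical to the /generate example with /generate_stream substituted at the end; curl's -N flag simply disables output buffering so tokens print as they arrive.

# Streaming sketch: same request body, sent to /generate_stream instead of /generate.
curl -N -d '{"inputs": "What is your name?", "parameters": {"max_new_tokens": 20, "temperature": 0.1}}' \
-H "Content-Type: application/json" \
-X POST https://serving.app.predibase.com/$PREDIBASE_TENANT_ID/deployments/v2/llms/$PREDIBASE_DEPLOYMENT/generate_stream \
-H "Authorization: Bearer ${PREDIBASE_API_TOKEN}"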
Request Parameters:
inputs: string
The prompt to be passed to the specified LLM.
parameters: Optional, json
adapter_id: Optional, string
The ID of the adapter to use for the operation. This will be of the following format: model_repo_name/model_version_number. Example: My model repo/3. Default: no adapter used.
adapter_source: Optional, string
The source of the adapter to use for the operation. Options are pbase for fine-tuned adapters on Predibase, hub for the Hugging Face Hub, or s3.
best_of: Optional, integer
Generate best_of sequences and return the one with the highest token logprobs. Defaults to 1.
details: Optional, boolean
Return the token logprobs and ids of the generated tokens.
decoder_input_details: Optional, boolean
Return the token logprobs and ids of the input prompt.
do_sample: Optional, boolean
Whether or not to use sampling; use greedy decoding otherwise. Defaults to false.
max_new_tokens: Optional, int
The maximum number of new tokens to generate. If not provided, will default to 20.
repetition_penalty: Optional, float64
The parameter for repetition penalty. 1.0 means no penalty. See this paper for more details. Defaults to no penalty.
return_full_text: Optional, boolean
Whether to prepend the prompt to the generated text. Default false.
seed: Optional, integer
The seed to use for the random number generator. If not provided, will default to a random seed.
stop: Optional, array of strings
Stop generating tokens if a member of stop_sequences is generated.
temperature: Optional, float64
Temperature is used to control the randomness of predictions. Higher values increase diversity and lower values increase determinism. Setting a temperature of 0 is useful for testing and debugging.
top_k: Optional, integer
Top-k is a sampling method where the k highest-probability vocabulary tokens are kept and the probability mass is redistributed among them.
top_p: Optional, float64
Top-p (aka nucleus sampling) is an alternative to sampling with temperature, where the model considers the results of the tokens with top_p probability mass. For example, 0.2 corresponds to only the tokens comprising the top 20% probability mass being considered.
truncate: Optional, integer
The number of tokens to truncate the output to. If not provided, will default to the user's default truncate setting.
typical_p: Optional, float64
If set to a float < 1, the smallest set of the most locally typical tokens with probabilities that add up to typical_p or higher is kept for generation. See Typical Decoding for Natural Language Generation for more information.
watermark: Optional, boolean
Watermarking with A Watermark for Large Language Models
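To illustrate how these parameters combine, here is a sketch of a request that sets several of them at once; the values are arbitrary examples, not recommendations.

# Arbitrary example values; adjust to your use case.
curl -d '{"inputs": "What is your name?", "parameters": {"max_new_tokens": 64, "temperature": 0.7, "top_p": 0.9, "top_k": 40, "repetition_penalty": 1.1, "stop": ["\n\n"], "seed": 42, "details": true}}' \
-H "Content-Type: application/json" \
-X POST https://serving.app.predibase.com/$PREDIBASE_TENANT_ID/deployments/v2/llms/$PREDIBASE_DEPLOYMENT/generate \
-H "Authorization: Bearer ${PREDIBASE_API_TOKEN}"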
Response Headers
These headers should be considered a beta feature, and are subject to change in the future.
- x-total-tokens: The number of tokens in both the input prompt and the output.
- x-prompt-tokens: The number of tokens in the prompt.
- x-generated-tokens: The number of generated tokens.
- x-total-time: The total time the request took in the inference server, in milliseconds.
- x-time-per-token: The average time it took to generate each output token, in milliseconds.
- x-queue-time: The time the request was in the internal inference server queue, in milliseconds.
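To inspect these headers, you can ask curl to print the response headers, for example with -s -D - (dump received headers to stdout) and -o /dev/null (discard the body); a sketch reusing the base-model request above:

# Prints only the response headers, including x-total-tokens and related fields.
curl -s -D - -o /dev/null \
-d '{"inputs": "What is your name?", "parameters": {"max_new_tokens": 20}}' \
-H "Content-Type: application/json" \
-X POST https://serving.app.predibase.com/$PREDIBASE_TENANT_ID/deployments/v2/llms/$PREDIBASE_DEPLOYMENT/generate \
-H "Authorization: Bearer ${PREDIBASE_API_TOKEN}"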