Predibase offers flexible options for serving language models, vision models, and embeddings. This guide will help you choose the right deployment option and access method for your needs.

Deployment Options

Private Deployments

🚀 Best for: Production workloads and enterprise use cases

Private deployments are our recommended solution for production environments, offering:

  • Dedicated resources with guaranteed availability
  • Production-grade SLAs and support
  • Customizable configuration for your specific needs
  • Auto-scaling options to handle varying workloads
  • Full security and isolation

Learn about private deployments →
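
To make the auto-scaling options above concrete, here is a minimal sketch of creating a private deployment with the Python SDK (introduced below). The API token, deployment name, base model, and replica counts are placeholder values; see the private deployments guide for the full set of configuration options.

from predibase import Predibase, DeploymentConfig

# Placeholder token and names: substitute your own values.
pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

# Create a dedicated deployment that scales to zero replicas when idle
# and up to one replica under load.
pb.deployments.create(
    name="my-llama-3-1-8b",                  # hypothetical deployment name
    config=DeploymentConfig(
        base_model="llama-3-1-8b-instruct",  # a model from the catalog
        min_replicas=0,
        max_replicas=1,
    ),
)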

Shared Endpoints

Best for: Quick experimentation and development

Shared endpoints are designed for testing and development purposes only:

  • Pre-deployed models for rapid prototyping
  • Subject to rate limits
  • No infrastructure setup needed
  • Support for testing custom adapters
  • Not recommended for production workloads

Try shared endpoints →

Access Methods

Python SDK

The Python SDK is the recommended way to interact with Predibase models. It offers a simple, intuitive interface with full feature support. To get started, first install the SDK using pip:

# Install (or upgrade) the Predibase Python SDK
pip install -U predibase
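
Once installed, prompting a deployment looks roughly like the sketch below. The API token and deployment name are placeholders, and the client/generate calls follow the SDK's documented usage pattern; consult the SDK reference for current signatures.

from predibase import Predibase

# Placeholder token: substitute your own.
pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

# Connect to a deployment by name; this works for shared endpoints and
# private deployments alike. The name below is a placeholder.
client = pb.deployments.client("llama-3-1-8b-instruct")

# Prompt the model and print the completion.
response = client.generate("What is machine learning?", max_new_tokens=128)
print(response.generated_text)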

REST API

The REST API provides a language-agnostic HTTP interface for integrating Predibase with any programming language or framework. REST API documentation →
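
Because the interface is plain HTTP, any client works. The hypothetical sketch below uses Python's requests library; the endpoint URL, tenant placeholder, and payload shape are illustrative assumptions, so take the exact routes from the REST API documentation.

import requests

API_TOKEN = "<PREDIBASE_API_TOKEN>"  # placeholder token

# Illustrative route only; see the REST API documentation for the real URL.
URL = ("https://serving.app.predibase.com/"
       "<TENANT_ID>/deployments/v2/llms/<DEPLOYMENT_NAME>/generate")

resp = requests.post(
    URL,
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json={
        "inputs": "What is machine learning?",
        "parameters": {"max_new_tokens": 128},  # assumed payload shape
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["generated_text"])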

Available Models

Browse our catalog of supported models.

Custom Models

  1. Fine-tuned Adapters - All Predibase deployments support serving LoRAs by default. You can fine-tune adapters on Predibase or serve a LoRA you have already trained (see the sketch after this list).
  2. Custom Base Models - Deploy custom models from Hugging Face
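
As a sketch of the adapter serving described in item 1, the call below passes an adapter ID to generate on top of a base deployment. The token, deployment name, and adapter repository/version are placeholders, and the adapter_id parameter follows the SDK's LoRAX-style client.

from predibase import Predibase

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")          # placeholder token
client = pb.deployments.client("llama-3-1-8b-instruct")   # placeholder base deployment

# Route the request through a fine-tuned LoRA by passing its adapter ID,
# typically "<repo>/<version>"; the ID below is a placeholder.
response = client.generate(
    "Summarize this support ticket: ...",
    adapter_id="my-adapter-repo/1",
    max_new_tokens=128,
)
print(response.generated_text)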

Next Steps

  1. Set up a private deployment for production use
  2. Learn about fine-tuning your own models
  3. Try shared endpoints for quick experimentation