Deployment Options
Private Deployments
🚀 Best for: Production workloads and enterprise use cases

Private deployments are our recommended solution for production environments, offering:

- Dedicated resources with guaranteed availability
- Production-grade SLAs and support
- Customizable configuration for your specific needs
- Auto-scaling options to handle varying workloads
- Full security and isolation
- Learn about private deployments →
Shared Endpoints
✨ Best for: Quick experimentation and development

Shared endpoints are designed for testing and development purposes only:

- Pre-deployed models for rapid prototyping
- Subject to rate limits
- No infrastructure setup needed
- Support for testing custom adapters
- Not recommended for production workloads
- Try shared endpoints →
Access Methods
Python SDK
Our Python SDK is the recommended way to interact with Predibase models. It offers a simple, intuitive interface with full feature support. To get started, install the SDK with pip (`pip install predibase`).

REST API
Language-agnostic HTTP interface for integration with any programming language or framework. REST API documentation →

Available Models
Browse our catalog of supported models:

- Language Models - Text generation models (Mistral, Mixtral, Llama 2, etc.)
- Vision Models - Image understanding and generation
- Embedding Models - Text embeddings for search and similarity
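To make the "search and similarity" use case above concrete: embedding models map text to vectors, and similarity between texts is then measured between those vectors, most commonly with cosine similarity. Here is a minimal stdlib sketch of that comparison; the tiny 3-dimensional vectors are toy stand-ins, since real embeddings have hundreds of dimensions and would come from an embedding model.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy "embeddings": doc_a points in nearly the same direction as the query,
# doc_b does not, so doc_a ranks higher for this query.
query = [0.1, 0.9, 0.2]
doc_a = [0.2, 0.8, 0.1]
doc_b = [0.9, 0.1, 0.0]
print(cosine_similarity(query, doc_a) > cosine_similarity(query, doc_b))  # True
```

Ranking documents by this score against a query embedding is the core of embedding-based semantic search.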
Custom Models
- Fine-tuned Adapters - By default, all Predibase deployments support serving LoRAs. You can fine-tune adapters on Predibase or bring an already-trained LoRA for serving.
- Custom Base Models - Deploy custom models from Hugging Face
Additional Features
- Batch Inference - Process multiple inputs efficiently
- Structured Output - Enforce JSON schema on responses
- OpenAI Migration Guide - Easily switch from OpenAI
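To illustrate what "enforce JSON schema on responses" means in the structured-output feature above, here is a hedged sketch of the guarantee: a response that satisfies a schema has the right types, required keys, and property types. The schema and validator below are illustrative stdlib code, not the Predibase API, which handles enforcement server-side.

```python
import json

# Example schema: the model must return an object with a string name
# and an integer age. (Illustrative; real schemas can be richer.)
schema = {
    "type": "object",
    "required": ["name", "age"],
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
}

TYPE_CHECKS = {"object": dict, "string": str, "integer": int}

def conforms(value, schema):
    """Minimal structural check: type match, required keys, property types."""
    expected = TYPE_CHECKS.get(schema.get("type"))
    if expected is not None and not isinstance(value, expected):
        return False
    if schema.get("type") == "object":
        for key in schema.get("required", []):
            if key not in value:
                return False
        for key, sub in schema.get("properties", {}).items():
            if key in value and not conforms(value[key], sub):
                return False
    return True

raw = '{"name": "Ada", "age": 36}'  # pretend this came back from the model
print(conforms(json.loads(raw), schema))  # True
```

With structured output enabled, responses are constrained during generation so that checks like this always pass; without it, you would validate (and retry) client-side.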
Next Steps
- Set up a private deployment for production use
- Learn about fine-tuning your own models
- Try shared endpoints for quick experimentation