- Quick experimentation and prototyping
- Development and testing environments
- Learning and evaluation of models
- Proof of concept development
Quick Start
First, install the Predibase Python SDK with `pip install -U predibase`.
Available Models
Predibase offers several popular models as shared endpoints for testing and development. See our supported models for the complete list.
Using Shared Endpoints
With Python SDK
Here’s a detailed example showing both basic text generation and streaming responses for development:
With REST API
For testing language-agnostic integration, you can use our REST API:
Testing Custom Models
You can test your fine-tuned adapters on shared endpoints during development:
Rate Limits
Shared endpoints are subject to rate limits. Rate limits restrict how often you can call our API within a given time period. A request that exceeds the limit is rejected with an HTTP 429 status code.
Rate Limits by Tier
Tier | Request Rate Limit | Daily Token Limit | Monthly Token Limit |
---|---|---|---|
Free | 1 request / sec | 1 million tokens / day | 10 million tokens / month |
Enterprise SaaS | 100 requests / sec | 1 million tokens / day | 10 million tokens / month |
Enterprise VPC | Does not apply | Does not apply | Does not apply |
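
One way to stay under the requests-per-second limit in the table above is to space out calls on the client side. The following is a minimal Python sketch under stated assumptions, not part of the Predibase SDK: `MinIntervalThrottle` and its parameters are hypothetical names chosen for illustration, and it only spaces requests in time. It does not track the daily or monthly token quotas.

```python
import time


class MinIntervalThrottle:
    """Space calls at least 1 / max_per_second seconds apart (hypothetical helper)."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        if max_per_second <= 0:
            raise ValueError("max_per_second must be positive")
        self.min_interval = 1.0 / max_per_second
        self._clock = clock          # injectable so the class can be tested without waiting
        self._sleep = sleep
        self._next_allowed = None    # earliest time the next call may start

    def wait(self):
        """Sleep if the previous call was too recent. Returns the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        # Reserve the next slot relative to when this call is allowed to start.
        self._next_allowed = now + self.min_interval
        return delay
```

To use it, create one throttle per API token (for example `MinIntervalThrottle(max_per_second=1)` for the Free tier) and call `wait()` immediately before each request.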
Rate Limit Headers
When making API requests, you’ll receive the following headers that help you monitor your rate limit status:
Header | Description |
---|---|
x-envoy-ratelimited | Whether the rate limit has been reached |
x-ratelimit-limit | The maximum number of requests allowed before the rate limit is reached |
x-ratelimit-remaining | The remaining number of requests until the rate limit is reached |
x-ratelimit-reset | Amount of time (seconds) until you can query again |
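
These headers can drive a simple retry loop: when a request comes back with HTTP 429, wait for the number of seconds given in `x-ratelimit-reset` and try again. The sketch below is illustrative only. `seconds_until_reset`, `call_with_retry`, and the `send` callable (assumed to return a `(status_code, headers, body)` tuple) are hypothetical names, and the code assumes the header value parses as a plain number of seconds, falling back to a default wait if it does not.

```python
import time


def seconds_until_reset(headers, default=1.0):
    """Read x-ratelimit-reset from a headers mapping, ignoring header-name case."""
    lowered = {key.lower(): value for key, value in headers.items()}
    try:
        return max(0.0, float(lowered["x-ratelimit-reset"]))
    except (KeyError, TypeError, ValueError):
        # Header missing or not numeric: fall back to a fixed wait.
        return default


def call_with_retry(send, max_attempts=3, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    `send` is a hypothetical zero-argument callable that performs one request
    and returns (status_code, headers, body).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts:
            return status, headers, body
        # Rate limited: wait for the advertised reset time before retrying.
        sleep(seconds_until_reset(headers))
```

Wrapping the actual HTTP call in `send` keeps the retry logic independent of the client library you use.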
Moving to Production
When you’re ready to move your application to production:
- Set up a private deployment for production-grade reliability
- Configure auto-scaling to handle your workload
- Take advantage of dedicated resources and SLAs