## Prerequisites
- Create an account here
- Navigate to the Settings page and click Generate API Token
- Set up your environment and install the Python SDK:
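A minimal sketch, assuming the SDK is published on PyPI as `predibase` and that the client is constructed with `Predibase(api_token=...)`:

```python
# Install the SDK first: pip install -U predibase
import os

from predibase import Predibase

# Initialize the client with the API token generated in Settings.
# Assumption: the token is stored in the PREDIBASE_API_TOKEN env var.
pb = Predibase(api_token=os.environ["PREDIBASE_API_TOKEN"])
```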
## Create and prompt a private deployment
Let’s start by deploying a model and running inference.
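A sketch with the Python SDK; the deployment name, base model, and `DeploymentConfig` fields below are illustrative placeholders, so consult the SDK reference for exact parameters:

```python
from predibase import Predibase, DeploymentConfig

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

# Create a private deployment of an open-source LLM.
# "my-mistral-7b" and the base model name are placeholders.
pb.deployments.create(
    name="my-mistral-7b",
    config=DeploymentConfig(
        base_model="mistral-7b-instruct-v0-2",
        min_replicas=0,  # scale to zero when idle
        max_replicas=1,
    ),
)

# Get a client for the deployment and run inference against it.
client = pb.deployments.client("my-mistral-7b")
response = client.generate(
    "[INST] What is your favorite condiment? [/INST]",
    max_new_tokens=100,
)
print(response.generated_text)
```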
## Prompt a shared endpoint
For quick experimentation, you can use our shared endpoints, which are available to SaaS users only.
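A sketch reusing the `pb` client from setup; the shared deployment name below is a placeholder, so substitute one of the officially supported LLMs:

```python
# Prompt a shared endpoint by name instead of creating your own deployment.
client = pb.deployments.client("mistral-7b-instruct-v0-2")  # placeholder name
response = client.generate(
    "[INST] What is your favorite condiment? [/INST]",
    max_new_tokens=100,
)
print(response.generated_text)
```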
Note the explicit use of special tokens (like [INST]) before and after the prompt. These are used with instruction- and chat-tuned models to improve response quality. See Chat Templates for details.
## Stream responses
For longer responses, you might want to stream the tokens as they’re generated:
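A sketch, assuming the deployment client exposes a `generate_stream` iterator whose responses carry each newly generated token:

```python
# Stream tokens as they are generated instead of waiting for the full response.
for stream_response in client.generate_stream(
    "[INST] Tell me a long story about mayonnaise. [/INST]",
    max_new_tokens=500,
):
    # Assumption: each streamed response exposes the new token's text.
    print(stream_response.token.text, end="", flush=True)
print()
```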
All examples above use the Python SDK for simplicity. A REST API is also available if you prefer making direct HTTP calls. See our Chat Completions API for details.
## Next steps
- Check out our officially supported LLMs
- Try the fine-tuning guide to customize a model for your use case
- Connect a dataset via the UI to start fine-tuning without code
- Coming from OpenAI? Check out how to use OpenAI-compatible endpoints hosted on Predibase
## Need help?
- Email us at support@predibase.com