Inference API
Create Chat Completion
OpenAI Chat Completions v1 compatible endpoint
POST /v1/chat/completions
curl --request POST \
  --url https://serving.app.predibase.com/tenant_id/deployments/v2/llms/deployment_name/v1/chat/completions \
  --header 'Content-Type: application/json' \
  --data '{
    "model": "alignment-handbook/zephyr-7b-dpo-lora",
    "messages": [
      {
        "role": "user",
        "content": "What is deep learning?"
      }
    ],
    "temperature": 0.5,
    "top_p": 0.95,
    "n": 2,
    "max_tokens": 20,
    "stop": [
      "photographer"
    ],
    "stream": false,
    "adapter_source": "<string>",
    "api_token": "<string>"
  }'
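The same request can be assembled in Python. The following is a minimal sketch using only the standard library, not an official client: `BASE_URL` and the helper name `build_chat_request` are illustrative, `tenant_id` and `deployment_name` are the placeholders from the curl example, and the numeric and boolean fields are assumed to follow the OpenAI Chat Completions convention (`max_tokens` as an integer, `stream` as a boolean). Authentication is not shown; the body fields `adapter_source` and `api_token` can be passed as extra keyword arguments if your deployment needs them.

```python
import json
import urllib.request

# Host taken from the curl example above.
BASE_URL = "https://serving.app.predibase.com"


def build_chat_request(tenant_id, deployment_name, messages, **params):
    """Build (but do not send) a POST request for the chat completions endpoint.

    Hypothetical helper: extra keyword arguments are copied into the JSON body as-is.
    """
    url = (
        f"{BASE_URL}/{tenant_id}/deployments/v2/llms/"
        f"{deployment_name}/v1/chat/completions"
    )
    payload = {"messages": messages, **params}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request(
    "tenant_id",
    "deployment_name",
    messages=[{"role": "user", "content": "What is deep learning?"}],
    model="alignment-handbook/zephyr-7b-dpo-lora",
    temperature=0.5,
    top_p=0.95,
    n=2,
    max_tokens=20,
    stop=["photographer"],
    stream=False,
)

# To actually send it (requires a real tenant, deployment, and credentials):
# with urllib.request.urlopen(request) as resp:
#     body = json.loads(resp.read())
```

Building the request separately from sending it makes it easy to inspect the URL and JSON body before any network call is made.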
{
  "id": "<string>",
  "object": "<string>",
  "created": 123,
  "model": "<string>",
  "choices": [
    {
      "index": 123,
      "message": {
        "role": "<string>",
        "content": "<string>"
      },
      "finish_reason": "<string>"
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "total_tokens": 123,
    "completion_tokens": 123
  },
  "system_fingerprint": "<string>"
}
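To read the generated text out of a response shaped like the one above, something along these lines may be enough. This is a sketch under assumptions: `extract_completions` is a hypothetical helper, and the sample payload is invented for illustration (it is not real model output). It assumes each entry in `choices` carries `index` and `message.content`, as in the schema shown.

```python
import json


def extract_completions(response_body):
    """Return the message content of every choice, ordered by its index.

    Hypothetical helper; assumes the response schema shown above.
    """
    data = json.loads(response_body)
    choices = sorted(data.get("choices", []), key=lambda c: c.get("index", 0))
    return [choice["message"]["content"] for choice in choices]


# Illustrative payload only: field values are made up to match the schema.
sample_response = json.dumps(
    {
        "id": "example-id",
        "object": "chat.completion",
        "created": 123,
        "model": "example-model",
        "choices": [
            {
                "index": 1,
                "message": {"role": "assistant", "content": "second answer"},
                "finish_reason": "length",
            },
            {
                "index": 0,
                "message": {"role": "assistant", "content": "first answer"},
                "finish_reason": "stop",
            },
        ],
        "usage": {"prompt_tokens": 5, "total_tokens": 25, "completion_tokens": 20},
    }
)

completions = extract_completions(sample_response)
```

With `"n": 2` in the request, the response would be expected to contain two entries in `choices`, so the helper returns a list rather than a single string.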
Body
application/json
Response
200
application/json
Generated Text
The response is of type object.