Inference API
Info
Inference endpoint info
GET /info
curl --request GET \
--url https://docs.predibase.com/info
{
  "docker_label": "null",
  "max_batch_total_tokens": "32000",
  "max_best_of": "2",
  "max_concurrent_requests": "128",
  "max_input_length": "1024",
  "max_stop_sequences": "4",
  "max_total_tokens": "2048",
  "max_waiting_tokens": "20",
  "model_device_type": "cuda",
  "model_dtype": "torch.float16",
  "model_id": "bigscience/bloom-560m",
  "model_pipeline_tag": "lorax",
  "model_sha": "e985a63cdc139290c5f700ff1929f0b5942cced2",
  "sha": "null",
  "validation_workers": "2",
  "version": "0.5.0",
  "waiting_served_ratio": "1.2"
}
Response
200 - application/json
Info Response
The response is of type object.
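The example response encodes every value as a string, including numeric limits such as max_total_tokens and placeholders such as "null". As a minimal sketch, and not part of the official client, the Python below fetches the endpoint and converts those fields to native types. The names BASE_URL, parse_info, and fetch_info are hypothetical, the grouping of fields into integer, float, and nullable sets is inferred only from the example values above, and treating the string "null" as a missing value is an assumption.

```python
import json
import urllib.request

# Hypothetical base URL copied from the curl example; replace it with the
# address of your own deployment.
BASE_URL = "https://docs.predibase.com"

# Field groupings inferred from the example response (an assumption).
INT_FIELDS = (
    "max_batch_total_tokens",
    "max_best_of",
    "max_concurrent_requests",
    "max_input_length",
    "max_stop_sequences",
    "max_total_tokens",
    "max_waiting_tokens",
    "validation_workers",
)
FLOAT_FIELDS = ("waiting_served_ratio",)
NULLABLE_FIELDS = ("docker_label", "sha")


def parse_info(payload: dict) -> dict:
    """Return a copy of the info payload with native Python types."""
    parsed = dict(payload)
    for key in INT_FIELDS:
        if key in parsed:
            # int() accepts both "2048" and 2048, so this also works if the
            # server returns real JSON numbers.
            parsed[key] = int(parsed[key])
    for key in FLOAT_FIELDS:
        if key in parsed:
            parsed[key] = float(parsed[key])
    for key in NULLABLE_FIELDS:
        # Assumption: the literal string "null" means "no value".
        if parsed.get(key) == "null":
            parsed[key] = None
    return parsed


def fetch_info(base_url: str = BASE_URL, timeout: float = 10.0) -> dict:
    """GET {base_url}/info and return the parsed response."""
    with urllib.request.urlopen(f"{base_url}/info", timeout=timeout) as resp:
        return parse_info(json.load(resp))
```

Calling fetch_info() performs the same request as the curl command above. parse_info can also be applied directly to a response that was obtained some other way.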