Predict
Deployment.predict
Deployment.predict(input_df, stream=False)
Make predictions for each row of the input pandas DataFrame.
Parameters:
input_df: pd.DataFrame
Input pandas DataFrame.
stream: bool
Optional flag to stream results via the gRPC protocol (default False).
Returns:
pd.DataFrame: pandas DataFrame with predictions.
Examples:
Get the taxi deployment and make predictions over gRPC using a test dataset:
df = pd.read_csv('taxi_test.csv')
deployment = pc.list_deployments('taxi_deployment')[0]
df = deployment.predict(df, stream=True)
df.head()
Deployment.async_predict
Deployment.async_predict(input_df, stream=False)
Asyncio version of Deployment.predict.
Parameters:
input_df: pd.DataFrame
Input pandas DataFrame.
stream: bool
Optional flag to stream results via the gRPC protocol (default False).
Returns:
pd.DataFrame: pandas DataFrame with predictions.
Examples:
Make predictions on two deployments in parallel using asyncio:
import asyncio
import pandas as pd

async def main():
    # pc is the Predibase client instance used elsewhere in these docs
    deployment_1 = pc.list_deployments('taxi_deployment')[0]
    deployment_2 = pc.list_deployments('taxi_deployment')[1]
    df = pd.read_csv("/path/to/file.csv")
    # asyncio.gather runs both predict calls concurrently and
    # returns their results in argument order
    results = await asyncio.gather(
        deployment_1.async_predict(df, stream=True),
        deployment_2.async_predict(df, stream=True),
    )
    print(results[0].head(3))
    print(results[1].head(3))

if __name__ == "__main__":
    asyncio.run(main())
HTTP Request with Python
First we're going to import the necessary libraries and get the endpoint for our deployment. Note that you can also get this endpoint from the UI. It is located both on the deployments page and the model version page of the deployed model.
import os
import requests
import json

# The deployment URL serves metadata; appending /infer gives the inference endpoint
metadata_url = pc.get_deployment("DEPLOYMENT NAME HERE").deployment_url
inference_url = metadata_url + "/infer"
request_headers = {"Authorization": f'Bearer {os.environ.get("PREDIBASE_API_TOKEN")}'}
Next we need to structure our input payload. We can make a GET request to the metadata endpoint to see what structure and data types the deployment is expecting.
metadata_response = requests.get(metadata_url, headers=request_headers)
json.loads(metadata_response.text)
You should see an output that looks something like this:
{
'name': 'model-3263',
'versions': ['10'],
'platform': 'ensemble',
'inputs': [
{'name': 'geo_enabled', 'datatype': 'BYTES', 'shape': [1]},
{'name': 'favourites_count', 'datatype': 'FP64', 'shape': [1]},
{'name': 'profile_background_image_path', 'datatype': 'BYTES', 'shape': [1]},
{'name': 'account_age_days', 'datatype': 'FP64', 'shape': [1]},
{'name': 'average_tweets_per_day', 'datatype': 'FP64', 'shape': [1]},
{'name': 'verified', 'datatype': 'BYTES', 'shape': [1]},
{'name': 'location', 'datatype': 'BYTES', 'shape': [1]},
{'name': 'statuses_count', 'datatype': 'FP64', 'shape': [1]},
{'name': 'lang', 'datatype': 'BYTES', 'shape': [1]},
{'name': 'friends_count', 'datatype': 'FP64', 'shape': [1]},
{'name': 'default_profile', 'datatype': 'BYTES', 'shape': [1]},
{'name': 'followers_count', 'datatype': 'FP64', 'shape': [1]},
{'name': 'description', 'datatype': 'BYTES', 'shape': [1]}
],
'outputs': [
{'name': 'account_type::predictions', 'datatype': 'BYTES', 'shape': [1]},
{'name': 'account_type::probabilities', 'datatype': 'FP32', 'shape': [1, 2]}
]
}
The inputs block lists each feature the model expects along with its datatype and shape, and the outputs block tells us the response will contain a predicted label (account_type::predictions) and a per-class probability vector (account_type::probabilities). Using this information, we can structure the payload that we want to send to the model:
payload = {
"inputs": [
{"name": "geo_enabled", "shape": [1], "datatype": "BYTES", "data": ["1"]},
{"name": "favourites_count", "shape": [1], "datatype": "FP64", "data": [755]},
{"name": "profile_background_image_path", "shape": [1], "datatype": "BYTES", "data": [""]},
{"name": "account_age_days", "shape": [1], "datatype": "FP64", "data": [258]},
{"name": "average_tweets_per_day", "shape": [1], "datatype": "FP64", "data": [5]},
{"name": "verified", "shape": [1], "datatype": "BYTES", "data": ["0"]},
{"name": "location", "shape": [1], "datatype": "BYTES", "data": ["United States"]},
{"name": "statuses_count", "shape": [1], "datatype": "FP64", "data": [189]},
{"name": "lang", "shape": [1], "datatype": "BYTES", "data": ["en"]},
{"name": "friends_count", "shape": [1], "datatype": "FP64", "data": [5000]},
{"name": "default_profile", "shape": [1], "datatype": "BYTES", "data": ["1"]},
{"name": "followers_count", "shape": [1], "datatype": "FP64", "data": [1200]},
{"name": "description", "shape": [1], "datatype": "BYTES", "data": ["This will be your favorite account by next week!"]}
]
}
As you can see above, we have structured the payload exactly as the metadata response showed us; the only difference is the added "data" key holding the corresponding input value for each feature.
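If your inputs already live in a pandas DataFrame, you can also build this payload programmatically from the metadata response instead of writing it by hand. Here is a minimal sketch; the CSV filename and the single DataFrame row are assumptions for illustration, not part of the Predibase API:
import json
import pandas as pd

metadata = json.loads(metadata_response.text)
df = pd.read_csv("twitter_accounts_test.csv")  # hypothetical test file with matching column names
row = df.iloc[0]  # build a payload for the first row

payload = {
    "inputs": [
        {
            "name": spec["name"],
            "shape": spec["shape"],
            "datatype": spec["datatype"],
            # BYTES inputs are sent as strings; numeric inputs as plain numbers
            "data": [str(row[spec["name"]]) if spec["datatype"] == "BYTES"
                     else float(row[spec["name"]])],
        }
        for spec in metadata["inputs"]
    ]
}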
Now we can send this payload in a POST request to the inference endpoint to get our prediction:
inference_response = requests.post(inference_url, json=payload, headers=request_headers)
prediction_data = json.loads(inference_response.text)
# Look up each output by name rather than by position, so the code doesn't
# depend on the order of the outputs in the response
outputs = {output['name']: output['data'] for output in prediction_data['outputs']}
prediction = outputs['account_type::predictions'][0]
confidence = max(outputs['account_type::probabilities'])  # probability of the predicted class
print("MODEL PREDICTION: ", prediction, "\nMODEL CONFIDENCE: ", confidence)
The model outputs should look something like this:
MODEL PREDICTION: human
MODEL CONFIDENCE: 0.899675178527832
HTTP Request with curl
The process here is very similar to making a request with the Python requests library. The main difference is that we write the payload out to a temporary file rather than entering it into the curl command directly, since the input payload can be quite large.
%%writefile payload.json
{
"inputs": [
{"name": "geo_enabled", "shape": [1], "datatype": "BYTES", "data": ["1"]},
{"name": "favourites_count", "shape": [1], "datatype": "FP64", "data": [755]},
{"name": "profile_background_image_path", "shape": [1], "datatype": "BYTES", "data": [""]},
{"name": "account_age_days", "shape": [1], "datatype": "FP64", "data": [258]},
{"name": "average_tweets_per_day", "shape": [1], "datatype": "FP64", "data": [5]},
{"name": "verified", "shape": [1], "datatype": "BYTES", "data": ["0"]},
{"name": "location", "shape": [1], "datatype": "BYTES", "data": ["United States"]},
{"name": "statuses_count", "shape": [1], "datatype": "FP64", "data": [189]},
{"name": "lang", "shape": [1], "datatype": "BYTES", "data": ["en"]},
{"name": "friends_count", "shape": [1], "datatype": "FP64", "data": [5000]},
{"name": "default_profile", "shape": [1], "datatype": "BYTES", "data": ["1"]},
{"name": "followers_count", "shape": [1], "datatype": "FP64", "data": [1200]},
{"name": "description", "shape": [1], "datatype": "BYTES", "data": ["This will be your favorite account by next week!"]}
]
}
Here we are writing the payload.json file from a Python shell, but you can create this payload file any way that works best for you. You can also pass the payload inline with the curl command if you prefer.
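For instance, here is one way to write the same file with plain Python, a sketch that assumes the payload dict from the previous section is still in scope:
import json

# Serialize the payload dict built earlier to payload.json
with open("payload.json", "w") as f:
    json.dump(payload, f)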
Once you have the payload, you can run the following command to make a POST request to the inference endpoint:
!curl -H "Authorization: Bearer $PREDIBASE_API_TOKEN" 'DEPLOYMENT ENDPOINT HERE/infer' -d @payload.json
Just make sure that you have set your PREDIBASE_API_TOKEN environment variable.
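If you're unsure whether the token is set, a quick optional check from Python before making requests:
import os

# Fail fast with a clear message if the token is missing
assert os.environ.get("PREDIBASE_API_TOKEN"), "PREDIBASE_API_TOKEN is not set"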