# Batch Inference
:::info
To run batch inference, you'll need an appropriate engine. Only users with the Admin role can create new engines; users with the User role can view existing engines on the Engines page in the Predibase UI.
:::

Run batch inference by calling the `predict` method with a DataFrame. Note: engines start automatically when used and may take around 10 minutes to initialize if they weren't already active.
**Python SDK**

```python
import pandas as pd

from predibase import PredibaseClient

pc = PredibaseClient()

# Create an engine suitable for inference. An appropriate template can be
# selected using `pc.get_engine_templates()`.
eng = pc.create_engine(
    "llm-batch-engine",
    template="gpu-a10g-small",
    auto_suspend=1800,
    auto_resume=True,
)

test_df = pd.DataFrame.from_dict(
    {
        "instruction": ["Write an algorithm in Java to reverse the words in a string."],
        "input": ["The quick brown fox jumped over"],
    }
)

# `model` is a previously trained or fine-tuned model, e.g. pc.get_model("my-llm")
results = model.predict(targets="output", source=test_df, engine=eng)

pd.set_option("display.max_colwidth", None)  # show the full generated text
print(results)
```
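The `predict` call returns a pandas DataFrame, so the results can be persisted with standard pandas calls. A minimal sketch, assuming the frame contains the requested target column `output` (the column values below are illustrative, not real model output):

```python
import pandas as pd

# Hypothetical results frame shaped like the output of `predict`:
# the target column ("output") holds the generated text.
results = pd.DataFrame(
    {
        "instruction": ["Write an algorithm in Java to reverse the words in a string."],
        "output": ["public class ReverseWords { /* ... */ }"],
    }
)

# Persist predictions for downstream use, then reload to verify the round trip.
results.to_csv("predictions.csv", index=False)
reloaded = pd.read_csv("predictions.csv")
print(reloaded["output"].iloc[0])
```

The same approach works for Parquet (`to_parquet`) if the downstream consumer prefers a typed columnar format.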
**CLI**

```shell
pbase create engine --name llm-batch-engine --template gpu-a10g-small

cat <<EOF > test_data.csv
instruction,input
Write an algorithm in Java to reverse the words in a string.,The quick brown fox jumped over
EOF

pbase predict \
  --model-name my-llm \
  --target output \
  --input-csv test_data.csv \
  --engine llm-batch-engine
```