

PROMPT queries one or more large language models (LLMs) with a template, over an optional index, given a query input.


The grammar for a PROMPT query is:

PROMPT template [, template ]*
[ WITH property [, property]* ]
USING deployment [, deployment ]*
[ OVER dataset_index ]
[ GIVEN input [, input ]* ]

OPTIONS ( key=number [, key=number]* )

'text' [ AS alias ]

DEPLOYMENT deployment_name [ VERSION deployment_version ]

{ constant_input | range_input | query_input }

The * syntax in the above grammar indicates that an element can repeat zero or more times.


The prompt requires a query string template to ask a question of the model.

With property

The WITH clause specifies additional properties for the prompt. The following properties are supported:

  • OPTIONS: provides a parameterized list of key=value options.
  • METADATA: returns additional columns for model_name and model_version.

With options

You can specify additional options that are passed to the LLM. The following options are supported:

  • temperature: (Default: 0.1). Float (0.0-1.0). The temperature of the sampling operation. 1 means regular sampling, 0 means always taking the highest-scoring token, and larger values move sampling closer to a uniform distribution (100.0 would be nearly uniform, but it lies outside the accepted range).
  • max_new_tokens: (Default: 128). Int (1-512). The number of new tokens to generate. This does not include the input length; it is an estimate of the size of the generated text you want. Each new token slows down the request, so balance response time against the length of the generated text. For users running in Predibase Cloud, the number of tokens generated is used to meter usage of LLM deployments, so reducing this value may reduce consumption toward the daily token quota. If the current token usage plus max_new_tokens would exceed the daily token limit, an incoming request fails immediately.
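
To make the temperature description concrete, here is a minimal Python sketch of standard temperature scaling. It assumes the deployment divides the raw token scores by the temperature before applying a softmax, which is the usual technique but is not confirmed by this page; the function name and the example scores are hypothetical.

```python
import math

def temperature_softmax(logits, temperature):
    """Turn raw scores into sampling probabilities at the given temperature."""
    if temperature == 0:
        # Greedy limit: all probability mass goes to the highest score.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate tokens.
logits = [2.0, 1.0, 0.0]
for t in (0.0, 0.1, 1.0, 100.0):
    print(t, [round(p, 3) for p in temperature_softmax(logits, t)])
```

Low temperatures concentrate probability on the top-scoring token, while very high temperatures flatten the distribution toward uniform.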

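The option ranges and the daily quota rule can be sketched in Python as follows. This only illustrates the documented behavior and is not the service's implementation: `resolve_options`, `admits_request`, and their parameters are hypothetical names, while the defaults and ranges are the ones listed for temperature and max_new_tokens.

```python
# Defaults as documented for the two supported options.
DEFAULT_OPTIONS = {"temperature": 0.1, "max_new_tokens": 128}

def resolve_options(**overrides):
    """Merge user options with the defaults and check the documented ranges."""
    options = {**DEFAULT_OPTIONS, **overrides}
    if not 0.0 <= options["temperature"] <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0")
    if not 1 <= options["max_new_tokens"] <= 512:
        raise ValueError("max_new_tokens must be between 1 and 512")
    return options

def admits_request(tokens_used_today, daily_token_limit, max_new_tokens):
    """Return True when usage plus max_new_tokens stays within the daily limit."""
    return tokens_used_today + max_new_tokens <= daily_token_limit

options = resolve_options(max_new_tokens=5)
print(options)
# Hypothetical usage figures: 9,990 tokens used out of a 10,000 token limit.
print(admits_request(9_990, 10_000, options["max_new_tokens"]))
print(admits_request(9_990, 10_000, 128))
```

Because the check uses max_new_tokens rather than the number of tokens actually generated, lowering the option lets a request through when little quota remains.
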
Using deployment(s) clause

You can specify a pre-trained or fine-tuned LLM deployment with the USING DEPLOYMENT clause. Remember to escape identifiers that might contain special characters:

USING DEPLOYMENT "flan-t5-xxl"

You may optionally specify multiple model deployments separated by a comma (,):

USING DEPLOYMENT "flan-t5-xxl", "redpajama-7b"

The result set will be a table with one row per model.

Input features

GIVEN is used to declaratively construct the input features to use for the prompt.

Query inputs

Simple SQL SELECT statements can be used as input, for example:

GIVEN SELECT * from hotel_reviews

See also PREDICT for more examples.


The following is a simple example of how to use a prompt to answer a simple question:

PROMPT 'What is the capital of Italy?'
USING DEPLOYMENT "flan-t5-xxl"

Advanced: indexed batch prediction

The following prompt returns multiple results, substituting the hotel name from the hotel_reviews dataset using an index to improve performance:

PROMPT 'What is the rating of hotel: {name}'
WITH OPTIONS (temperature=0.1, max_new_tokens=5)
USING DEPLOYMENT "flan-t5-xxl"
OVER hotel_reviews
GIVEN SELECT * from hotel_reviews
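
As a rough illustration of the substitution step, the Python sketch below fills the {name} placeholder once per input row. It assumes that placeholders are matched to column names in the same way as Python's `str.format`, which this page does not state explicitly; the rows and the `city` column are made up for the example.

```python
template = "What is the rating of hotel: {name}"

# Made-up rows standing in for the result of SELECT * FROM hotel_reviews.
rows = [
    {"name": "Hotel Aurora", "city": "Rome"},
    {"name": "Seaside Inn", "city": "Lisbon"},
]

# One prompt per row; columns the template does not mention are ignored.
prompts = [template.format(**row) for row in rows]
for prompt in prompts:
    print(prompt)
```

Each input row yields one rendered prompt, which is why the query returns multiple results.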

Advanced: multi-template, multi-model

The following produces a number of rows equal to the number of templates multiplied by the number of models, one row for each template and model pair:

PROMPT 'What is the capital of Italy?',
'What is the most populous city in Europe?'
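
The row count for a multi-template, multi-model prompt can be sketched in Python with `itertools.product`. The two deployment names are an assumption borrowed from the USING DEPLOYMENT example earlier on this page, since the query above does not show its deployments; the ordering of rows in the real result set is not specified here.

```python
from itertools import product

templates = [
    "What is the capital of Italy?",
    "What is the most populous city in Europe?",
]
# Assumed deployments, taken from the earlier USING DEPLOYMENT example.
deployments = ["flan-t5-xxl", "redpajama-7b"]

# Every template is paired with every deployment.
pairs = list(product(templates, deployments))
for template, deployment in pairs:
    print(f"{deployment}: {template}")

print(len(pairs))  # number of templates times number of deployments
```

With two templates and two deployments, this yields four template and model pairs.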