Function calling allows models to interact with external tools and APIs in a
structured way. This guide explains how to fine-tune models with function
calling capabilities in Predibase.
Supported Models
Function calling fine-tuning is currently supported on the following models:
- llama-3-2-1b-instruct
- llama-3-2-3b-instruct
- llama-3-3-70b-instruct
- qwen2-5-1-5b-instruct
- qwen2-5-7b-instruct
- qwen2-5-14b-instruct
- qwen2-5-32b-instruct
Support for additional models is coming soon.
Your tools should follow the schema defined by
Hugging Face.
Here’s an example:
[
  {
    "type": "function",
    "function": {
      "name": "multiply",
      "description": "A function that multiplies two numbers",
      "parameters": {
        "type": "object",
        "properties": {
          "a": {
            "type": "number",
            "description": "The first number to multiply"
          },
          "b": {
            "type": "number",
            "description": "The second number to multiply"
          }
        },
        "required": ["a", "b"]
      }
    }
  }
]
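If your tools already exist as Python functions, you can generate this schema
programmatically instead of writing the JSON by hand. A minimal sketch, assuming
the get_json_schema utility shipped in recent versions of the transformers
library (it reads type hints and a Google-style docstring; the multiply function
here is just an illustration matching the schema above):

import json
from transformers.utils import get_json_schema

def multiply(a: float, b: float) -> float:
    """
    A function that multiplies two numbers

    Args:
        a: The first number to multiply
        b: The second number to multiply
    """
    return a * b

# Produces a {"type": "function", "function": {...}} dict like the example above.
tools = [get_json_schema(multiply)]
print(json.dumps(tools, indent=4))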
Use a chat-style dataset with a “tools” key at the same level as “messages”:
{
  "messages": [
    {
      "role": "system",
      "content": "Predibot is a cool chatbot"
    },
    {
      "role": "user",
      "content": "What's 2 times 5"
    },
    {
      "role": "assistant",
      "tool_calls": [
        {
          "type": "function",
          "function": {
            "name": "multiply",
            "arguments": {
              "a": 2,
              "b": 5
            }
          }
        }
      ],
      "content": ""
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "multiply",
        "description": "A function that multiplies two numbers",
        "parameters": {
          "type": "object",
          "properties": {
            "a": {
              "type": "number",
              "description": "The first number to multiply"
            },
            "b": {
              "type": "number",
              "description": "The second number to multiply"
            }
          },
          "required": ["a", "b"]
        }
      }
    }
  ]
}
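Each training example is one such JSON object. As a convenience, here is a small
sketch for writing a list of examples to disk; the JSON Lines layout and the file
name are assumptions, so use whichever dataset format you normally upload to
Predibase:

import json

def write_dataset(examples, path="function_calling_train.jsonl"):
    # `examples` is a list of dicts, each with "messages" and "tools" keys as shown above.
    with open(path, "w") as f:
        for example in examples:
            f.write(json.dumps(example) + "\n")  # one JSON object per line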
Required Configuration Parameters
When fine-tuning with function calling, you must enable apply_chat_template in
your fine-tuning config. You can do this either through the SDK or by checking
the box in the adapter version UI before training.
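Here is a minimal SDK sketch. The API token, dataset name, and adapter repo below
are placeholders, and the exact config fields can vary across SDK versions, so
check the fine-tuning config reference for your installed release:

from predibase import Predibase, FinetuningConfig

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

adapter = pb.adapters.create(
    config=FinetuningConfig(
        base_model="llama-3-2-1b-instruct",
        apply_chat_template=True,  # required for function-calling fine-tuning
    ),
    dataset="function_calling_dataset",   # placeholder dataset name
    repo="function-calling-adapter",      # placeholder adapter repo name
    description="Function calling fine-tune",
)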
How Function Calling Works
Function calling involves an interactive flow between the user, assistant, and
tools:
- System acknowledges tools may be used
- User provides prompt with tools
- Assistant makes appropriate tool calls
- User/system executes tool calls
- Tool returns results
- Assistant provides formatted response
Here’s a complete example of this interaction:
{
  "messages": [
    {
      "role": "system",
      "content": "Predibot is a cool chatbot"
    },
    {
      "role": "user",
      "content": "I need help multiplying 2 numbers"
    },
    {
      "role": "assistant",
      "content": "I can help with that. What are the numbers?"
    },
    {
      "role": "user",
      "content": "What's 2 times 5"
    },
    {
      "role": "assistant",
      "tool_calls": [
        {
          "type": "function",
          "function": {
            "name": "multiply",
            "arguments": {
              "a": 2,
              "b": 5
            }
          }
        }
      ],
      "content": ""
    },
    {
      "role": "tool",
      "name": "multiply",
      "content": 10
    },
    {
      "role": "assistant",
      "content": "2 times 5 equals 10!"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "multiply",
        "description": "A function that multiplies two numbers",
        "parameters": {
          "type": "object",
          "properties": {
            "a": {
              "type": "number",
              "description": "The first number to multiply"
            },
            "b": {
              "type": "number",
              "description": "The second number to multiply"
            }
          },
          "required": ["a", "b"]
        }
      }
    }
  ]
}
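At inference time, your application drives this same loop: it sends the messages
and tools to the model, executes any tool call the assistant requests, and feeds
the result back as a tool message. A sketch of that loop, where generate() is a
hypothetical helper that queries your fine-tuned deployment and returns the
assistant message as a dict, and TOOL_IMPLEMENTATIONS is a hypothetical registry
of local Python callables:

TOOL_IMPLEMENTATIONS = {"multiply": lambda a, b: a * b}

def run_conversation(generate, messages, tools):
    while True:
        assistant_message = generate(messages=messages, tools=tools)
        messages.append(assistant_message)

        tool_calls = assistant_message.get("tool_calls") or []
        if not tool_calls:
            # No tool call requested: this is the final, formatted response.
            return assistant_message["content"]

        for call in tool_calls:
            fn = call["function"]
            result = TOOL_IMPLEMENTATIONS[fn["name"]](**fn["arguments"])
            # Return the result to the model so it can phrase the final answer.
            messages.append({"role": "tool", "name": fn["name"], "content": result})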
If your data is in ShareGPT format, you can use this Python script to convert it
to Predibase’s format:
import json

def convert_format(path_to_sharegpt_json: str):
    # Load the ShareGPT JSON into a Python dictionary
    with open(path_to_sharegpt_json, "r") as f:
        data = json.load(f)

    # Initialize the Predibase-format structure
    input_data = {
        "messages": [],
        "tools": json.loads(data["tools"])
    }

    # Set the system message
    input_data["messages"].append({
        "role": "system",
        "content": data["system"]
    })

    # Process the conversations
    for conversation in data["conversations"]:
        if conversation["from"] == "user":
            input_data["messages"].append({
                "role": "user",
                "content": conversation["value"]
            })
        elif conversation["from"] == "assistant":
            input_data["messages"].append({
                "role": "assistant",
                "content": conversation["value"]
            })
        elif conversation["from"] == "function_call":
            # The tool call is stored as a JSON string in "value"
            function_call = json.loads(conversation["value"])
            input_data["messages"].append({
                "role": "assistant",
                "tool_calls": [{
                    "type": "function",
                    "function": {
                        "name": function_call["name"],
                        "arguments": function_call["arguments"]
                    }
                }],
                "content": ""
            })
        elif conversation["from"] == "observation":
            # The tool's output is stored as a JSON string in "name";
            # an observation is assumed to follow the function_call it answers.
            observation = json.loads(conversation["name"])
            input_data["messages"].append({
                "role": "tool",
                "name": function_call["name"],  # tool name from the preceding function call
                "content": observation["helper"]
            })

    return json.dumps(input_data, indent=4)
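For example, assuming each ShareGPT record (like the one shown below) lives in its
own JSON file under a sharegpt_data/ directory, you could convert the whole set
into a JSON Lines training file; both paths here are illustrative:

import json
from pathlib import Path

with open("predibase_train.jsonl", "w") as out:
    for path in sorted(Path("sharegpt_data").glob("*.json")):
        record = json.loads(convert_format(str(path)))
        # convert_format returns a pretty-printed string; re-serialize to one line per example.
        out.write(json.dumps(record) + "\n")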
Example ShareGPT format:
{
  "conversations": [
    {
      "from": "user",
      "value": "I need help multiplying 2 numbers"
    },
    {
      "from": "assistant",
      "value": "I can help with that. What are the numbers?"
    },
    {
      "from": "user",
      "value": "What's 2 times 5"
    },
    {
      "from": "function_call",
      "value": "{\"arguments\": {\"a\": \"2\", \"b\": \"5\"}, \"name\": \"multiply\"}"
    },
    {
      "from": "observation",
      "name": "{\"result\":\"success\",\"helper\":\"10\"}",
      "value": 10
    },
    {
      "from": "assistant",
      "value": "2 times 5 equals 10!"
    }
  ],
  "system": "Predibot is a cool chatbot",
  "tools": "[{\"name\": \"multiply\", \"description\": \"Use this function to multiply two numbers together\", \"parameters\": {\"type\": \"object\", \"properties\": {\"a\": {\"type\": \"number\", \"description\": \"The first number to multiply\"}, \"b\": {\"type\": \"string\", \"description\": \"The second number to multiply\"}}, \"required\": [\"a\", \"b\"]}}]"
}
Next Steps