Function calling allows models to interact with external tools and APIs in a structured way. This guide explains how to fine-tune models with function calling capabilities in Predibase.

Supported Models

Function calling fine-tuning is currently supported on the following models:

  • llama-3-2-1b-instruct
  • llama-3-2-3b-instruct
  • llama-3-3-70b-instruct
  • qwen2-5-1-5b-instruct
  • qwen2-5-7b-instruct
  • qwen2-5-14b-instruct
  • qwen2-5-32b-instruct

Support for additional models is coming soon.

Tool Schema

Your tools should follow the schema defined by Hugging Face. Here’s an example:

[
  {
    "type": "function",
    "function": {
      "name": "multiply",
      "description": "A function that multiplies two numbers",
      "parameters": {
        "type": "object",
        "properties": {
          "a": {
            "type": "number",
            "description": "The first number to multiply"
          },
          "b": {
            "type": "number",
            "description": "The second number to multiply"
          }
        },
        "required": ["a", "b"]
      }
    }
  }
]
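
Because the "parameters" block is standard JSON Schema, you can optionally sanity-check the argument values you plan to include in your training data against it. Below is a minimal sketch using the third-party jsonschema package, which is not required by Predibase:

from jsonschema import validate  # pip install jsonschema

multiply_tool = {
    "type": "function",
    "function": {
        "name": "multiply",
        "description": "A function that multiplies two numbers",
        "parameters": {
            "type": "object",
            "properties": {
                "a": {"type": "number", "description": "The first number to multiply"},
                "b": {"type": "number", "description": "The second number to multiply"},
            },
            "required": ["a", "b"],
        },
    },
}

# Raises jsonschema.exceptions.ValidationError if the arguments don't satisfy the schema
validate(instance={"a": 2, "b": 5}, schema=multiply_tool["function"]["parameters"])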

Chat Dataset Format

Use a chat-style dataset with a “tools” key at the same level as “messages”:

{
  "messages": [
    {
      "role": "system",
      "content": "Predibot is a cool chatbot"
    },
    {
      "role": "user",
      "content": "What's 2 times 5"
    },
    {
      "role": "assistant",
      "tool_calls": [
        {
          "type": "function",
          "function": {
            "name": "multiply",
            "arguments": {
              "a": 2,
              "b": 5
            }
          }
        }
      ],
      "content": ""
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "multiply",
        "description": "A function that multiplies two numbers",
        "parameters": {
          "type": "object",
          "properties": {
            "a": {
              "type": "number",
              "description": "The first number to multiply"
            },
            "b": {
              "type": "number",
              "description": "The second number to multiply"
            }
          },
          "required": ["a", "b"]
        }
      }
    }
  ]
}

Required Configuration Parameters

When fine-tuning with function calling, you must enable apply_chat_template in your fine-tuning config. You can do this either through the SDK or by checking the box in the adapter version UI before training.
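
For reference, here is a minimal SDK sketch that assumes the Predibase Python client with a FinetuningConfig. The dataset name, repository name, base model, and helper calls below are illustrative placeholders, so check the SDK reference for the exact signatures; apply_chat_template is the flag described above:

from predibase import Predibase, FinetuningConfig

pb = Predibase(api_token="<PREDIBASE_API_TOKEN>")

# Placeholder dataset and repository names; upload your chat-format dataset first
dataset = pb.datasets.from_file("function_calling_dataset.jsonl", name="function_calling_dataset")
repo = pb.repos.create(name="function-calling-adapter", exists_ok=True)

adapter = pb.adapters.create(
    config=FinetuningConfig(
        base_model="qwen2-5-7b-instruct",
        apply_chat_template=True,  # required for function calling fine-tuning
    ),
    dataset=dataset,
    repo=repo,
    description="Function calling fine-tune",
)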

How Function Calling Works

Function calling involves an interactive flow between the user, assistant, and tools:

  1. The system message sets up the assistant and acknowledges that tools may be used
  2. The user provides a prompt, with the available tools supplied alongside the messages
  3. The assistant responds with the appropriate tool call(s)
  4. The user or application executes the tool call(s)
  5. The tool returns its results as a tool message
  6. The assistant provides the final, formatted response

Here’s a complete example of this interaction:

{
  "messages": [
    {
      "role": "system",
      "content": "Predibot is a cool chatbot"
    },
    {
      "role": "user",
      "content": "I need help multiplying 2 numbers"
    },
    {
      "role": "assistant",
      "content": "I can help with that. What are the numbers?"
    },
    {
      "role": "user",
      "content": "What's 2 times 5"
    },
    {
      "role": "assistant",
      "tool_calls": [
        {
          "type": "function",
          "function": {
            "name": "multiply",
            "arguments": {
              "a": 2,
              "b": 5
            }
          }
        }
      ],
      "content": ""
    },
    {
      "role": "tool",
      "name": "multiply",
      "content": 10
    },
    {
      "role": "assistant",
      "content": "2 times 5 equals 10!"
    }
  ],
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "multiply",
        "description": "A function that multiplies two numbers",
        "parameters": {
          "type": "object",
          "properties": {
            "a": {
              "type": "number",
              "description": "The first number to multiply"
            },
            "b": {
              "type": "number",
              "description": "The second number to multiply"
            }
          },
          "required": ["a", "b"]
        }
      }
    }
  ]
}

Converting from ShareGPT Format

If your data is in ShareGPT format, you can use this Python script to convert it to Predibase’s format:

import json

def convert_format(path_to_sharegpt_json: str):
    """Convert a single ShareGPT-format record into Predibase's chat format."""
    # Load the JSON into a Python dictionary
    with open(path_to_sharegpt_json, "r") as f:
        data = json.load(f)

    # Initialize the input structure
    input_data = {
        "messages": [],
        "tools": json.loads(data["tools"])
    }

    # Set the system message
    input_data["messages"].append({
        "role": "system",
        "content": data["system"]
    })

    # Track the most recent function call so a following observation can reference its tool name
    function_call = None

    # Process the conversations
    for conversation in data["conversations"]:
        if conversation["from"] == "user":
            input_data["messages"].append({
                "role": "user",
                "content": conversation["value"]
            })
        elif conversation["from"] == "assistant":
            input_data["messages"].append({
                "role": "assistant",
                "content": conversation["value"]
            })
        elif conversation["from"] == "function_call":
            function_call = json.loads(conversation["value"])
            input_data["messages"].append({
                "role": "assistant",
                "tool_calls": [{
                    "type": "function",
                    "function": {
                        "name": function_call["name"],
                        "arguments": function_call["arguments"]
                    }
                }],
                "content": ""
            })
        elif conversation["from"] == "observation":
            observation = json.loads(conversation["name"])
            input_data["messages"].append({
                "role": "tool",
                "name": function_call["name"],  # Use the actual tool name from the function call
                "content": observation["helper"]
            })

    return json.dumps(input_data, indent=4)
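
As a usage example, you could run the script over one or more ShareGPT records and collect the converted examples into a single JSONL file with one example per line (the file names below are placeholders):

import json

# Hypothetical input files, each containing a single ShareGPT-format record
sharegpt_files = ["sharegpt_record_1.json", "sharegpt_record_2.json"]

with open("function_calling_dataset.jsonl", "w") as f:
    for path in sharegpt_files:
        converted = json.loads(convert_format(path))
        f.write(json.dumps(converted) + "\n")  # one converted example per line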

Example ShareGPT format:

{
  "conversations": [
    {
      "from": "user",
      "value": "I need help multiplying 2 numbers"
    },
    {
      "from": "assistant",
      "value": "I can help with that. What are the numbers?"
    },
    {
      "from": "user",
      "value": "What's 2 times 5"
    },
    {
      "from": "function_call",
      "value": "{\"arguments\": {\"a\": \"2\", \"b\": \"5\"}, \"name\": \"multiply\"}"
    },
    {
      "from": "observation",
      "name": "{\"result\":\"success\",\"helper\":\"10\"}",
      "value": 10
    },
    {
      "from": "assistant",
      "value": "2 times 5 equals 10!"
    }
  ],
  "system": "Predibot is a cool chatbot",
  "tools": "[{\"name\": \"multiply\", \"description\": \"Use this function to multiply two numbers together\", \"parameters\": {\"type\": \"object\", \"properties\": {\"a\": {\"type\": \"number\", \"description\": \"The first number to multiply\"}, \"b\": {\"type\": \"string\", \"description\": \"The second number to multiply\"}}, \"required\": [\"a\", \"b\"]}}]"
}

Next Steps