Fine tuning with the result of a function call / "tool" role?

I am trying to do some fine-tuning and want to integrate function calls in my fine tuning jsonl file. I found the cookbook entry about fine tuning with function calls, and also several other resources, but none of them show how to integrate the answer to a function call into fine-tuning.

For example, say I want to fine tune a model on the following conversation:

  • The user asks “What’s the current stock price for AAPL?”
  • The assistant initiates a tool call getStockPrice(AAPL)
  • The tool responds with some info, e.g. {‘price’: ‘129.0’, ‘currency’: ‘EUR’}
  • The assistant responds with text “The current stock price for AAPL is 129€”

I tried fine tuning a model as a test by including the following json-line in my jsonl-File:

{"messages": [{"role":"user","content":[{"type":"text","text":"What's the stock price for AAPL currently?"}]},{"role":"assistant","content":[{"type":"text","text":""}],"tool_calls":[{"id":"call_QtVUVBe4Fv3orNGbUtq19gW","type":"function","function":{"name":"get_stock_price","arguments":"{\"symbol\":\"AAPL\"}"}}]},{"role":"tool","content":[{"type":"text","text":"{'price': '129.0', 'currency': 'EUR'}"}],"tool_call_id":"call_QtVUVBe4Fv3orNGbUtq19gW"},{"role":"assistant","content":[{"type":"text","text":"The current stock price for AAPL (Apple Inc.) is €129.0."}]}], "parallel_tool_calls": false, "tools": [{"type":"function","function":{"name":"get_stock_price","description":"Get the current stock price","parameters":{"type":"object","properties":{"symbol":{"type":"string","description":"The stock symbol"}},"additionalProperties":false,"required":["symbol"]},"strict":true}}]}

(I got this structure by creating the conversation on the playground and then exporting it as code)

But when uploading this file in the playground, I get the follwoing strange error:
The job failed due to an invalid training file. Invalid file format. Line 1, message 3, key “content”: Input should be a valid string

which is somewhat confusing as message 3 is the tool response call, which does contain a valid string.

How, if at all, do I need to format the “answer” to a tool call to use in fine-tuning?

1 Like

The error you’re encountering likely stems from how the content field is structured for the tool response. In fine-tuning JSONL files, the content of a tool’s response should be a valid string. In your example, you’re using a dictionary-like structure inside the content field, which could be causing the error.

Here’s a corrected version of your JSON:

  1. Ensure that the tool response (role: tool) contains a plain text string, even if it represents structured data. You can store the response as a string representation.
  2. Additionally, ensure that all fields are properly formatted as plain text strings.

Here’s an adjusted version:

{
“messages”: [
{
“role”: “user”,
“content”: “What’s the stock price for AAPL currently?”
},
{
“role”: “assistant”,
“content”: “”,
“tool_calls”: [
{
“id”: “call_QtVUVBe4Fv3orNGbUtq19gW”,
“type”: “function”,
“function”: {
“name”: “get_stock_price”,
“arguments”: “{"symbol":"AAPL"}”}}]},
{
“role”: “tool”,
“content”: “{"price": "129.0", "currency": "EUR"}”,
“tool_call_id”: “call_QtVUVBe4Fv3orNGbUtq19gW”
},
{
“role”: “assistant”,
“content”: “The current stock price for AAPL (Apple Inc.) is €129.0.”
}
],
“parallel_tool_calls”: false,
“tools”: [
{
“type”: “function”,
“function”: {
“name”: “get_stock_price”,
“description”: “Get the current stock price”,
“parameters”: {
“type”: “object”,
“properties”: {
“symbol”: {
“type”: “string”,
“description”: “The stock symbol”
}
},
“additionalProperties”: false,
“required”: [“symbol”]
},
“strict”: true} }]}

In this case:

  1. The tool response is now correctly stored as a string "{\"price\": \"129.0\", \"currency\": \"EUR\"}".
  2. The content field for all roles is a plain string, ensuring proper formatting for fine-tuning.

This should resolve the “Input should be a valid string” error.