Is there a way to fine-tune a model for use with the Responses API (specifically, for an application that uses function calling and file search)? I was hoping to take real user queries and fine-tune the model to better call the correct function tool, and also to take function tool responses and fine-tune the resulting message back to the user.
The fine-tuning guides all reference the Chat Completions API, and the required JSONL format is the Chat Completions one. I could convert all my Responses API history into the Chat Completions format and build a JSONL file from that, but I'm skeptical that the resulting fine-tuned model would then work when I use it in a Responses API setting. Plus, Chat Completions doesn't support file search, so I guess I'd have to exclude any queries where file search was the appropriate response.
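For what it's worth, here's roughly the conversion I had in mind. The item shapes are simplified assumptions on my part, not the exact SDK types, and conversations containing internal tool calls (like file search) just get dropped:

```python
import json

def responses_items_to_chat_messages(items):
    """Best-effort conversion of simplified Responses API conversation
    items into Chat Completions fine-tuning messages.

    `items` is an assumed, simplified shape (plain message dicts plus
    'function_call' and 'function_call_output' items), not the full
    SDK object model.
    """
    messages = []
    for item in items:
        itype = item.get("type", "message")
        if itype == "message":
            messages.append({"role": item["role"], "content": item["content"]})
        elif itype == "function_call":
            # A Responses function call maps to an assistant message
            # carrying tool_calls in the Chat Completions format.
            messages.append({
                "role": "assistant",
                "tool_calls": [{
                    "id": item["call_id"],
                    "type": "function",
                    "function": {"name": item["name"],
                                 "arguments": item["arguments"]},
                }],
            })
        elif itype == "function_call_output":
            messages.append({
                "role": "tool",
                "tool_call_id": item["call_id"],
                "content": item["output"],
            })
        else:
            # No Chat Completions equivalent for internal tools such as
            # file_search_call, so skip this conversation entirely.
            return None
    return messages

def write_training_file(conversations, tools, path="train.jsonl"):
    """Write one JSONL training line per convertible conversation."""
    with open(path, "w") as f:
        for items in conversations:
            messages = responses_items_to_chat_messages(items)
            if messages is None:
                continue  # contained an internal tool call; drop it
            f.write(json.dumps({"messages": messages, "tools": tools}) + "\n")
```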
If fine-tuning and the Responses API are just incompatible right now, any idea whether it will become an option in the future?
And yes, I’ve done a lot of prompt engineering already and will continue to do so!
Have you tried it yourself? I'm also feeling a bit skeptical at the moment, but I haven't tested it with the Responses API yet. The Assistants API only supported GPT-3.5-turbo for fine-tuning with file search, so I'm curious how Responses will perform.
You can produce fine-tuning examples for functions, which are one type of tool you can add specifications for. The model is employed similarly by Responses and Chat Completions.
You cannot replicate the placement of tool specifications needed for internal Assistants or Responses tools, nor the patterns of internal tool calling.
Thus, you can use fine-tuning to get a function call understood somewhat better, for example to differentiate between your "product_search" and "knowledge_base" functions, as sketched below. However, you won't be able to fill the training examples with realistic calls or returns for an internal tool such as file search.
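For instance, a single training line steering the model toward the right function might be built like this. The tool schemas, user text, and call id are purely illustrative; use your real function specifications:

```python
import json

# Hypothetical function specs; substitute your real schemas.
tools = [
    {"type": "function", "function": {
        "name": "product_search",
        "description": "Search the product catalog",
        "parameters": {"type": "object",
                       "properties": {"query": {"type": "string"}},
                       "required": ["query"]}}},
    {"type": "function", "function": {
        "name": "knowledge_base",
        "description": "Look up help and policy articles",
        "parameters": {"type": "object",
                       "properties": {"query": {"type": "string"}},
                       "required": ["query"]}}},
]

# One training example: a query that should route to product_search.
example = {
    "messages": [
        {"role": "user", "content": "Do you have these boots in size 11?"},
        {"role": "assistant", "tool_calls": [{
            "id": "call_0001", "type": "function",
            "function": {"name": "product_search",
                         "arguments": json.dumps({"query": "boots size 11"})},
        }]},
    ],
    "tools": tools,
}

with open("train.jsonl", "a") as f:
    f.write(json.dumps(example) + "\n")
```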
Fine-tuning with functions is a balancing act that can only be resolved by experimentation: will you end up breaking function calling even worse?
Fine-tuning can damage the model's quality at following instructions, such as the internal tool specification, but inference isn't blocked on Assistants, and file search is still an option on a newer model.
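If you do try it, inference would look something like the sketch below. The fine-tuned model id and vector store id are placeholders, and whether the Responses endpoint accepts your particular fine-tuned model is something you'd have to verify yourself:

```python
from openai import OpenAI

client = OpenAI()

# Hypothetical fine-tuned model id and vector store id.
response = client.responses.create(
    model="ft:gpt-4o-mini-2024-07-18:my-org::abc123",
    input="Where can I find the returns policy?",
    tools=[{"type": "file_search",
            "vector_store_ids": ["vs_123"]}],
)
print(response.output_text)
```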