Finetuning of an Assistant API

Does anyone know how to fine-tune a model for the Assistants API? I need to produce a long, consistent output consisting of multiple chapters, built up over a chat of about five steps. I thought the easiest way would be to demonstrate the result I want so the model reproduces it consistently, but it isn't that simple. Does anyone have a tip for me, a YouTube video, or something similar?


There are lots of people here who know a bit about Assistants. First, though, it is worth outlining what Assistants on the API is NOT.

Fine-tuning is a term used specifically for the costly training of a new AI language model, using hundreds or thousands of examples to change how the model behaves when it writes a response.
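For reference, that training happens through the separate fine-tuning endpoint (not through Assistants) and takes chat-formatted JSONL examples, one per line. The content below is purely illustrative:

```json
{"messages": [{"role": "system", "content": "You write multi-chapter reports."}, {"role": "user", "content": "Write chapter 1 of the report."}, {"role": "assistant", "content": "Chapter 1: ..."}]}
```

Each line demonstrates a full exchange ending with the assistant response you want the model to imitate.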

Assistants is an agent framework that can reference uploaded documents, and multiple internal function calls can be executed before a response is finally produced for the user.

Neither of these sound like what you need. The biggest challenge you will face:

Newer OpenAI models have already been tuned to give shorter responses, under 1,000 words (about 600-700 tokens). They will always find a way to shorten the response, or even simply refuse a long request. Perhaps the most capable here is the earlier gpt-4-0314 model, which hasn't suffered as much of this "chat" cost-saving training.

So the best way to get an AI to write at length and complete a composition is to break the project into smaller steps.

Give the AI an outline of the overall piece and direct it to write one specific part at a time. Or have the AI start by writing that outline of the composition itself.

In chat completions code you write yourself, you can have the AI produce an outline, also report how many individual parts of about 500 words would be needed, and then hand those smaller writing jobs off to a different AI model in software.
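A minimal sketch of that chunked approach in Python. The outline format, the 500-word budget, and the helper names are assumptions for illustration; the actual chat completions request would go where the comment indicates:

```python
# Sketch: turn a model-written outline into one small writing job per
# section, instead of asking for the whole composition in one request.
# Outline format and word budgets are illustrative assumptions.

def parse_outline(outline: str) -> list[str]:
    """Extract non-empty outline lines as section titles."""
    return [line.strip() for line in outline.splitlines() if line.strip()]

def parts_needed(target_words: int, words_per_part: int = 500) -> int:
    """How many ~500-word chunks are needed to cover the target length."""
    return -(-target_words // words_per_part)  # ceiling division

def build_prompts(outline: str, target_words: int) -> list[str]:
    """One bounded writing prompt per outline section."""
    sections = parse_outline(outline)
    per_section = max(1, target_words // max(1, len(sections)))
    prompts = []
    for i, title in enumerate(sections, start=1):
        prompts.append(
            f"Outline:\n{outline}\n\n"
            f"Write ONLY section {i} ('{title}') of the composition, "
            f"about {per_section} words. Do not summarize other sections."
        )
    return prompts

# Each prompt would then be sent as its own request, e.g.:
# client.chat.completions.create(model="gpt-3.5-turbo-1106",
#     messages=[{"role": "user", "content": prompt}])
```

Each section becomes its own short, well-bounded request, which sidesteps the model's tendency to compress a single long answer.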


Thank you! My question is about the fine-tuning option, but a fine-tuned model can't be chosen as the "model" when creating/running an Assistant. Giving it an overview is an idea, thank you, I haven't tried that. On the other hand, fine-tuning should help to reduce tokens; adding further context instead makes it more costly. I use gpt-4-1106-preview, and in my case that is too expensive. I need to run the Assistant with gpt-3.5-turbo-1106, which delivers inconsistent and often nonsensical results. So, a Gordian knot.