Data format to train assistant based on user-support dialogs

I want to train my assistant on data from previous dialogs between my clients and our support team. In which format should I provide those dialogs so that the assistant can analyse them clearly and learn from them?

It depends on the model, but if by training you mean fine-tuning, and by assistant you mean a GPT model, the format is defined in the OpenAI docs:
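For reference, the chat fine-tuning format is JSONL: one JSON object per line, where each line holds one complete training conversation as a `messages` array. The content below is a made-up illustration, not your actual data:

```json
{"messages": [{"role": "system", "content": "You are a support assistant for Acme Inc."}, {"role": "user", "content": "Can you help me reset my password?"}, {"role": "assistant", "content": "Sure thing! Go to Settings > Account > Reset password and follow the steps."}]}
```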

Happy building!

Is fine-tuning a GPT model something different from creating and fine-tuning an assistant with the OpenAI Assistants API (which is what I was originally referring to)?

And which is more appropriate in my case? My end goal is to connect the assistant to something like a live chat, with my own mediator service that takes a message from the user, sends it to OpenAI (whether via the Assistants API, completions over a fine-tuned model, or whatever is more suitable), and returns the response back to the user.

Yes. Fine-tuning a model is very different from using an assistant.

In the Assistants GUI you don’t really have a way to “train” the assistant. You can fit only a limited number of characters into the instructions, maybe ten example conversations at most on top of your instructions. The format is not as strict as in fine-tuning. E.g.
Customer: “can you help me”
Assistant: “sure thing”
works fine.

There is an option to enable file retrieval, but if you want a very specific type of answer from your assistant, I don’t think file retrieval is going to help much. At least I haven’t been able to feed example conversations in via files and have the assistant actually use them.

This is why fine-tuning might be the way to go if you want to teach the model something specific to your use case and have it respond in a predictable manner.
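If you do go the fine-tuning route, a minimal sketch of converting past support dialogs into the JSONL training format could look like this. The dialog data and system prompt here are made-up placeholders:

```python
import json

# Made-up example dialogs: each dialog is a list of (customer_message, agent_reply) turns.
dialogs = [
    [("Can you help me?", "Sure thing! What do you need?")],
    [("My order hasn't arrived.", "Sorry to hear that. Could you share your order number?")],
]

SYSTEM_PROMPT = "You are a friendly support assistant for Acme Inc."  # placeholder

def dialog_to_jsonl_line(dialog):
    """Turn one dialog into a single JSONL training example (one line of the file)."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    for customer, agent in dialog:
        messages.append({"role": "user", "content": customer})
        messages.append({"role": "assistant", "content": agent})
    return json.dumps({"messages": messages})

# Each element of `lines` becomes one line of the training .jsonl file.
lines = [dialog_to_jsonl_line(d) for d in dialogs]
print(lines[0])
```

You would then write `lines` out one per line to a `.jsonl` file and upload it as the training file.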


As far as I understand, there is a third approach based on embeddings, which was suggested for a case very similar to mine: Customer support assistant with automated response - #6 by matcha72

Do you think embeddings will work better for my case than the Assistants API?
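For context, the embeddings approach usually means: embed your FAQ entries once, embed each incoming question, and answer from the closest match (or feed that match into the prompt). A toy sketch of the retrieval step, using a bag-of-words vector as a stand-in for a real embedding model (the FAQ content is made up):

```python
import math
from collections import Counter

# Made-up FAQ entries: question -> canned answer.
faq = {
    "How much does product X cost?": "Product X costs 249 USD.",
    "How do I reset my password?": "Go to Settings > Account > Reset password.",
}

def embed(text):
    """Stand-in for a real embedding API call: a simple bag-of-words vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Embed the FAQ once, up front.
faq_vectors = {q: embed(q) for q in faq}

def best_answer(question):
    """Return the FAQ answer whose question is most similar to the input."""
    qv = embed(question)
    best = max(faq_vectors, key=lambda q: cosine(qv, faq_vectors[q]))
    return faq[best]

print(best_answer("what does product X cost"))
```

With real embeddings you would replace `embed` with an API call and typically pass the retrieved FAQ entry into the model's prompt rather than returning the canned answer directly.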

Someone more experienced can correct me if I’m wrong, but from what I understand:

  • Fine-tuning works well if you want to teach the model how to respond, including style, tone, and format.
  • Fine-tuning doesn’t work well if you need the model to know specific FAQ-type details about your business, like “product X costs 249 USD”. The knowledge file retrieval feature is meant for that type of detail.

What type of responses do you need the model to give?

Very good point, and it is correct that my question is mostly about FAQ answers.


@matcha72 we still need an answer over here — please don’t go posting in other parts of the forum before you’ve explained yourself.
