I am a web software developer with a strong background in PHP. Due to my focus on projects, I am unable to provide sufficient support to my clients. I’ve thought of a solution for this and want to implement it. I plan to create an AI bot that can automatically respond to my clients. I have around 5,000 resolved support requests from the past that I will use as a dataset. Using these requests, I want to set up a system that can automatically answer similar questions posed by my clients. I prepared a JSONL dataset according to the ‘gpt-3.5-turbo-1106’ model. I uploaded this training file from the ‘Files’ section and created a fine-tuned model. Since my dataset reached thousands of lines, I realized my balance was insufficient for this task. Therefore, I decided to proceed with only a 200-line dataset. I successfully added my fine-tuned model. The results of the model were as follows: Epoch value: 3, Training loss value: 0.0683. However, when I tested my fine-tuned model in the ‘Playground - Chat’ section, it started giving irrelevant and incorrect responses. I am unsure how to proceed from this situation. I would appreciate any guidance on what steps I should take.
Welcome to the forum!
It’s a common misconception that fine tuning allows you to imbue a model with knowledge. Unfortunately, that’s not the case. I’m personally of the opinion that fine tuning is a waste of time for most use cases, so I don’t have a strong grasp on what use cases it actually works well on.
What you might be better off doing, is using augmented generation.
What you basically do is use a search method to retrieve similar issues to what the customer is describing, and allowing the LLM to access these cases in its context to formulate an informed answer.
One of these search methods is vector/embedding search (which might be good enough for your usecase), but some people find that they need to use a hybrid approach.
Of course you can imbue a model with knowledge through fine-tuning. Just like you can define / adjust semantics of words through few-shot prompting.
It’s just that in most cases it makes more sense to use embeddings.
If I wanted to achieve a customer support chatbot with a large amount of functions and instructions it would make sense to use fine-tuning to reduce the large prompt size.
Then I can also take advantage of the fine-tuning by introducing typically permanent features like the name, mission statement, founder.
But yeah. In this case embeddings makes way more sense
Thank you for your valuable comments, how can I do the embedding process you mentioned, how can I use the data sets I mentioned, can you present a scenario about this?
Here is a link from the OpenAI cookbook:
It’s an example using Pinecone but rest assured every vector DB has a similar tutorial.