Need help! Fine-tune worked using the API, but the answers were bad

I have been trying to build a chatbot with a custom knowledge base. However, after fine-tuning on a few thousand lines of Q&A data, I found the davinci model was not able to give any relevant answers. So I created a very simple test to see whether the information can be picked up by davinci. However, it’s still not working. Can someone please point out what I did wrong? Thanks!

Here is my test data:
cat test2_prepared.jsonl
{"prompt":"what's bio123tech's company address?","completion":"bio123tech's address is 1 main road mytown."}
{"prompt":"what does bio123tech do?","completion":"bio123tech provides dna sequencing services to the general public."}
{"prompt":"where was bio123tech found?","completion":"bio123tech was found in the madeupplace."}

Upload the file:
openai api files.create -f test2_prepared.jsonl -p fine-tune
Upload progress: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 326/326 [00:00<00:00, 36.1kit/s]
{
  "bytes": 326,
  "created_at": 1679715719,
  "filename": "test2_prepared.jsonl",
  "id": "file-oTOEm8Yyq8SKBQU2VeM2Ppn5",
  "object": "file",
  "purpose": "fine-tune",
  "status": "uploaded",
  "status_details": null
}

Fine-tune:
openai api fine_tunes.create -m davinci -t "file-oTOEm8Yyq8SKBQU2VeM2Ppn5"
Created fine-tune: ft-4MCWZxaPkSZWsNduTugSYLnh
Streaming events until fine-tuning is complete…

(Ctrl-C will interrupt the stream, but not cancel the fine-tune)
[2023-03-25 11:45:11] Created fine-tune: ft-4MCWZxaPkSZWsNduTugSYLnh

Stream interrupted (client disconnected).
To resume the stream, run:

openai api fine_tunes.follow -i ft-4MCWZxaPkSZWsNduTugSYLnh

~/chatbot# openai api fine_tunes.follow -i ft-4MCWZxaPkSZWsNduTugSYLnh
[2023-03-25 11:45:11] Created fine-tune: ft-4MCWZxaPkSZWsNduTugSYLnh
[2023-03-25 11:47:15] Fine-tune costs $0.01
[2023-03-25 11:47:15] Fine-tune enqueued. Queue number: 0
[2023-03-25 11:47:16] Fine-tune started
[2023-03-25 11:49:09] Completed epoch 1/4
[2023-03-25 11:49:10] Completed epoch 2/4
[2023-03-25 11:49:11] Completed epoch 3/4
[2023-03-25 11:49:12] Completed epoch 4/4
[2023-03-25 11:49:49] Uploaded model: davinci:ft-personal-2023-03-25-03-49-49
[2023-03-25 11:49:50] Uploaded result file: file-zoKVTT6QfmQcP1xBwlpKHhfz
[2023-03-25 11:49:50] Fine-tune succeeded

Job complete! Status: succeeded 
Try out your fine-tuned model:

openai api completions.create -m davinci:ft-personal-2023-03-25-03-49-49 -p <YOUR_PROMPT>

Testing:
~/chatbot# openai api completions.create -m davinci:ft-personal-2023-03-25-03-49-49 -p "what's bio123tech's company address?"
what’s bio123tech’s company address? Google for , <State
~/chatbot# openai api completions.create -m davinci:ft-personal-2023-03-25-03-49-49 -p "what does bio123tech do?"
what does bio123tech do?

We provide a local contact for your Health and Life Sciences to Business Solutions
~/chatbot# openai api completions.create -m davinci:ft-personal-2023-03-25-03-49-49 -p "where was bio123tech found?"
where was bio123tech found?

At the fork when bio123tech killed the grimm blocking the way
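As an aside: the legacy fine-tuning guide recommends that every prompt end with a fixed separator and that every completion start with a space and end with a stop sequence (openai tools fine_tunes.prepare_data will suggest this automatically); the test file above uses neither. Below is a minimal sketch of how the same three lines could be rewritten that way. The separator and newline stop here are the guide's example values, chosen for illustration rather than taken from the original post.

import json

# Illustrative rewrite of the three test examples with a prompt separator
# and a completion stop sequence, as the legacy fine-tuning guide suggests.
SEPARATOR = "\n\n###\n\n"   # appended to every training prompt (and to prompts at inference time)
STOP = "\n"                 # completion end marker; pass stop=["\n"] when querying the fine-tuned model

examples = [
    ("what's bio123tech's company address?",
     "bio123tech's address is 1 main road mytown."),
    ("what does bio123tech do?",
     "bio123tech provides dna sequencing services to the general public."),
    ("where was bio123tech found?",
     "bio123tech was found in the madeupplace."),
]

with open("test2_prepared.jsonl", "w") as f:
    for prompt, completion in examples:
        record = {"prompt": prompt + SEPARATOR, "completion": " " + completion + STOP}
        f.write(json.dumps(record) + "\n")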


GPT-4’s answer to this question is below. However, I can’t create multiple Q&A examples for simple things like “what’s your company name” or “what does your company do”. What’s the best way to let GPT understand some basic information about a custom environment?
----- GPT-4 answer below -----
I recommend increasing the number of training epochs and dataset size, as a few thousand lines of Q&A data may not be sufficient for the model to learn effectively. Additionally, make sure you are using a consistent format for the questions and answers in your dataset.

Here are some tips to improve your fine-tuning process:

  1. Increase the size of your dataset: A larger dataset can help the model to generalize better and learn the specific knowledge you want it to retain.
  2. Diversify your dataset: Make sure your dataset contains diverse examples and is representative of the types of questions and answers you want the chatbot to handle.
  3. Increase the number of epochs: By increasing the number of epochs, you are giving the model more opportunities to learn from the dataset. Be cautious not to overfit, though; if you observe that the model is memorizing the training data and not generalizing well, consider reducing the number of epochs or adding more data.
  4. Experiment with different learning rates: A smaller learning rate may help the model to converge better on the provided data. However, finding the optimal learning rate may require some experimentation.
  5. Evaluate your model’s performance: After fine-tuning, evaluate the performance of your model on a separate validation set to make sure it’s generalizing well.

Try making these adjustments and fine-tuning your model again to see if the performance improves.
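As a concrete illustration of tips 3 and 4, the fine-tune job above could be recreated with explicit hyperparameters. This is only a sketch using the pre-v1 openai Python library; the file ID is the one from the upload step, and the epoch and learning-rate values are placeholders to experiment with, not recommendations.

import openai  # pre-v1.0 openai-python library, matching the CLI used above

openai.api_key = "YOUR_API_KEY"

# Same training file as the CLI upload step; n_epochs and
# learning_rate_multiplier are the knobs mentioned in tips 3 and 4.
job = openai.FineTune.create(
    training_file="file-oTOEm8Yyq8SKBQU2VeM2Ppn5",
    model="davinci",
    n_epochs=8,                    # default is 4; more passes over a tiny dataset
    learning_rate_multiplier=0.1,  # smaller values can help convergence
)
print(job["id"], job["status"])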

For ChatGPT, I can simply tell it “my company name is bio123tech, please remember it”. Then when I ask “what’s my company name?”, it will say “bio123tech”. How do I do the same thing with the API?

New knowledge should be done through embeddings.
Fine-tuning is for patterns.

A question like “what’s bio123tech’s company address?” could easily be solved with a simple Q&A knowledge base.
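Here is a minimal sketch of that “simple Q&A knowledge base” idea, using the embeddings endpoint (text-embedding-ada-002) with the pre-v1 Python library. The stored pairs are the ones from the test file above; the in-memory list and cosine-similarity lookup are illustrative choices, not a prescribed design.

import numpy as np
import openai

openai.api_key = "YOUR_API_KEY"

# Tiny in-memory knowledge base: the three Q&A pairs from the test file.
kb = [
    ("what's bio123tech's company address?",
     "bio123tech's address is 1 main road mytown."),
    ("what does bio123tech do?",
     "bio123tech provides dna sequencing services to the general public."),
    ("where was bio123tech found?",
     "bio123tech was found in the madeupplace."),
]

def embed(text):
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=text)
    return np.array(resp["data"][0]["embedding"])

kb_vectors = [embed(q) for q, _ in kb]

def lookup(user_question):
    qv = embed(user_question)
    # Cosine similarity between the user's question and each stored question.
    sims = [qv @ v / (np.linalg.norm(qv) * np.linalg.norm(v)) for v in kb_vectors]
    best = int(np.argmax(sims))
    return kb[best][1]

print(lookup("What is the address of bio123tech?"))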

I tried embeddings. However, it doesn’t seem to be able to answer in a natural way; the answer only comes straight from the answer sheet. How do I replicate ChatGPT but with custom knowledge?
Once the engine understands the company address, when a user says “I have been to bio123tech last year, however I can’t remember where exactly I went”, I was hoping the chatbot would act like ChatGPT and answer “the address of bio123tech is xxx”.

You can either wait for ChatGPT plugins, or you can develop your own application that takes the context and answers the question.

ChatGPT is a fine-tuned model designed to be conversational. It’d be very hard to replicate.

For your question, you can include the context and let the model naturally respond with the knowledge.
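A sketch of “include the context and let the model naturally respond”, using the chat endpoint (gpt-3.5-turbo) from the pre-v1 Python library. The context string is just the company facts from the test data, and the instruction wording is an example, not a required format.

import openai

openai.api_key = "YOUR_API_KEY"

# Context retrieved from your knowledge base (here: the test-data facts).
context = (
    "bio123tech's address is 1 main road mytown. "
    "bio123tech provides dna sequencing services to the general public."
)
question = "I have been to bio123tech last year, however I can't remember where exactly I went."

resp = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system",
         "content": "Answer the user using only the context below.\n\nContext:\n" + context},
        {"role": "user", "content": question},
    ],
    temperature=0,
)
print(resp["choices"][0]["message"]["content"])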


Are you saying that at this stage people can’t use the OpenAI API to build efficient chatbots?

No, not at all. Not sure how you came to that conclusion.

Is it very difficult to replicate ChatGPT? Yes.
Can one make a chatbot which is more knowledgeable for their industry? 100%

Thanks for trying to answer my questions, I appreciate your effort.

Can anyone else shed some light on this? How do you build a ChatGPT-like company chatbot with custom knowledge? I have a good amount of historical human chat logs, but I’m only one person and I don’t want to spend a year cleaning up and processing the data… lol. Are there any shortcuts that I can take?

I guess you didn’t like my answer.

Here’s a wonderful tutorial to do exactly what you want

You can probably use GPT to clean your data.
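If it helps, here is one way “use GPT to clean your data” could look in practice: feed a raw chat-log excerpt to the chat endpoint and ask it to emit prompt/completion pairs. The instruction wording and the log snippet are made up for illustration; you would still want to spot-check the output before training on it.

import openai

openai.api_key = "YOUR_API_KEY"

# A hypothetical raw chat-log excerpt to be converted into Q&A pairs.
raw_log = """agent: hi, how can I help?
customer: where is your office?
agent: we are at 1 main road mytown, open 9-5 on weekdays."""

instruction = (
    "Convert this support chat log into JSONL lines of the form "
    '{"prompt": "<customer question>", "completion": "<agent answer>"}. '
    "Output one JSON object per line and nothing else."
)

resp = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": instruction},
        {"role": "user", "content": raw_log},
    ],
    temperature=0,
)
print(resp["choices"][0]["message"]["content"])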


The answer that @RonaldGRuckus gave is exactly the way to go here: semantic search + composing the answer. You should not fine-tune in this scenario.

The answer to the question should not be returned in a purely extractive way. Instead, the mechanism has two steps:

  • First, you conduct a semantic search to retrieve relevant pieces of text that contain the answer to the user’s question.
  • Second, you use this context in a final call to the API to compose an answer to the user’s question in an abstractive way. It should be original and take into account the conversational context as well.

Also: the semantic search mechanism can be enhanced by building a module on top that paraphrases the question to transform it from a contextual question into a stand-alone one. You can also use the chat endpoint to build this module. In the example that you were giving, this module would convert the user’s utterance “I have been to bio123tech last year, however I can’t remember where exactly I went.” into “What is the exact address of bio123tech?”.

Once you have this, you embed the stand-alone question and conduct the semantic search + abstractive answer generation.
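Here is a sketch of that paraphrasing module, again with the chat endpoint from the pre-v1 Python library. The conversation history and instruction text are illustrative only.

import openai

openai.api_key = "YOUR_API_KEY"

# Hypothetical prior turns of the conversation.
history = [
    {"role": "user", "content": "what does bio123tech do?"},
    {"role": "assistant", "content": "bio123tech provides dna sequencing services to the general public."},
]
utterance = "I have been to bio123tech last year, however I can't remember where exactly I went."

resp = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system",
         "content": "Rewrite the user's last message as a single stand-alone question, "
                    "using the conversation for context. Return only the question."},
        *history,
        {"role": "user", "content": utterance},
    ],
    temperature=0,
)
standalone_question = resp["choices"][0]["message"]["content"]
print(standalone_question)  # e.g. "What is the exact address of bio123tech?"
# This stand-alone question is then embedded and used for the semantic search step above.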


My slower-than-average brain DEEPLY appreciates these comments; they are very helpful guidance for establishing a mental model for solving these challenges.
