I have been testing whether a fine-tuned model can power a QA bot that answers questions about company information. Approximately 1,000 prompt-completion pairs were prepared as training data. When using the fine-tuned model, however, it gives nonfactual answers most of the time, even when the prompt is exactly the same as one in the training data. I have tried suggestions from the web and this forum, but so far no luck. If you have any suggestions beyond what I have already done, I would really appreciate your help.
Goal: The bot answers from the fine-tuning data when the prompt is similar to a trained one.
Problem: The bot does not answer from the fine-tuning data even when the prompt is exactly the same as a trained one; instead it gives nonfactual answers.
What I have tried:
- Base models curie and davinci
I made fine-tuned models based on both curie and davinci for each case below. Neither helped when prompting with the training data.
- Changed completion parameters
Lower temperature seems to work better.
import openai

response = openai.Completion.create(
    model=CHATBOT_MODEL,  # fine-tuned models are passed via `model`, not `engine`
    prompt=BOT_PREFIX + aprompt,
    temperature=0.0,  # tried values from 0 to 1
    max_tokens=200,
    top_p=1,  # tried 0, 1, and commenting it out
    # best_of=1,  # tried 1 and commenting it out
    stop=["###", "->", "\n"],
    frequency_penalty=0,
    presence_penalty=0,
)
answer = response["choices"][0]["text"].strip()
- Changed the prompt prefix
I added one of the following prefix sentences in front of the prompt. I still cannot find the best one.
BOT_PREFIX = "The following is a conversation with an AI assistant called BOT and a user. BOT is empathic and friendly. BOT's objective is to help the user find StartUp companies. With each response, BOT offers follow-up questions to encourage openness and tries to continue the conversation in a natural way. ### "
BOT_PREFIX = 'The following is a conversation with an AI assistant called BOT. BOT is helpful, creative, clever, and very friendly. If you ask BOT a question that is rooted in truth, BOT will give you the answer. If you ask BOT a question that is nonsense, trickery, or has no clear answer, I will respond with "Sorry, I am not sure. I will learn more to support you.". ### '
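For context, this is how the prefix and the user input are combined into the final prompt at inference time. A minimal sketch, where `build_prompt` is a hypothetical helper (not part of the OpenAI library) and the prefix is shortened for readability:

```python
# Minimal sketch of inference-time prompt assembly (matching the CASE I format below).
# `build_prompt` is a hypothetical helper; the prefix here is shortened.

BOT_PREFIX = "The following is a conversation with an AI assistant called BOT. ### "

def build_prompt(user_text: str) -> str:
    # The trailing " ->" must match the separator used in the training prompts;
    # without it the fine-tuned model may not fall into the trained pattern.
    return BOT_PREFIX + user_text + " ->"

print(build_prompt("Tell me about BCD"))
# The following is a conversation with an AI assistant called BOT. ### Tell me about BCD ->
```

For the CASE II format the suffix would instead be " ###" and the user text would start with "User: ".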
- Changed the format of the fine-tuning data (1000+ examples)
I changed the description length from long to summarized short. Shorter, simpler data seems to respond better, but even the shortest examples mostly do not answer as expected.
# CASE I : prompt ends with "->", completion ends with "\n"
{"prompt":"Tell me about ABC ->","completion":" ABC belongs to Web3. HQ is in USA. Their business is related to Financial Services,Media and Entertainment,Other,Payments,Software. ABC is a blockchain technology company that develops NFTs and digital collectibles.\n"}
{"prompt":"Tell me about BCD ->","completion":" BCD belongs to CyberSecurity. HQ is in ISR. Their business is related to Consumer Electronics,Hardware,Information Technology,Privacy and Security,Software. BCD is a breach and attack simulation platform that helps organizations verify their security posture.\n"}
# CASE II: prompt ends with "###" and starts with "User: ", completion ends with "###"
{"prompt":"User: Tell me about ABC ###","completion":" ABC belongs to Web3. HQ is in USA. Their business is related to Financial Services,Media and Entertainment,Other,Payments,Software. ABC is a blockchain technology company that develops NFTs and digital collectibles. ###"}
{"prompt":"User: Tell me about BCD ###","completion":" BCD belongs to CyberSecurity. HQ is in ISR. Their business is related to Consumer Electronics,Hardware,Information Technology,Privacy and Security,Software. BCD is a breach and attack simulation platform that helps organizations verify their security posture. ###"}
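To confirm every training example actually follows the separator conventions above (CASE I: prompt ends with " ->", completion starts with a space and ends with "\n"), I check the JSONL lines with a small script. A minimal sketch, where `validate_example` is a hypothetical helper, not an OpenAI tool:

```python
import json

# Hypothetical validator for the CASE I formatting conventions:
# prompt must end with " ->", completion must start with a space and end with "\n".
def validate_example(line: str) -> bool:
    record = json.loads(line)
    prompt, completion = record["prompt"], record["completion"]
    return (
        prompt.endswith(" ->")
        and completion.startswith(" ")
        and completion.endswith("\n")
    )

example = '{"prompt":"Tell me about ABC ->","completion":" ABC belongs to Web3.\\n"}'
print(validate_example(example))  # True
```

Inconsistent separators across examples are a common cause of a fine-tuned model not matching trained prompts, so this kind of check is worth running over the whole file.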
Output Example
User: Tell me about BCD
BOT: BCD is a Cynefin-based AI company that helps companies make sense of their data.
# I expected something similar to the trained completion, but the answer is nonfactual
My thoughts:
- The training data format may affect response quality, so I tested the format above and a few variants from other websites and QA forums, but so far no big gain.
- The prefix sentence in the prompt also affects response quality, but so far I have not come up with better ones than the above.
- What else can I try …