I want to fine-tune a model so that it can reply to emails according to my company's policies and structure.

This is the kind of JSONL data I am using to train the model. I am using davinci as the base model.
{"prompt": "Q: What is the name of the company?", "completion": "A: The name of the company is ABC Doors and Windows."}

{"prompt": "Q: Who is the owner of the company?", "completion": "A: James Brook is the owner of the company."}

{"prompt": "Q: Where is the company located?", "completion": "A: The company is located in the USA."}

{"prompt": "Q: What services does the company provide?", "completion": "A: The company provides a wide variety of new windows and doors for commercial and personal spaces."}

{"prompt": "Q: What is the return policy of the company?", "completion": "A: The company offers a 15-day complete replacement policy if any manufacturing errors are detected."}

{"prompt": "Q: Does the company provide repair and service?", "completion": "A: Yes, the company provides repair service within 24 hours upon request."}

After fine-tuning completed, I used the new model in the Playground and gave it the same prompts as above, such as "What is the name of the company?", but the answer is not correct.

Welcome to the OpenAI community @akshayjaggi146

If the goal is factual responses, use embeddings instead of fine-tunes.

PS: Your prompt completion pairs for fine-tune training dataset aren’t properly formatted per the recommended formatting guidelines.
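To illustrate the formatting point, here is a minimal sketch of how pairs like the ones above could be reshaped for the legacy prompt/completion fine-tuning format: each prompt ends with a fixed separator, and each completion starts with a space and ends with a stop sequence. The separator `"\n\n###\n\n"`, the stop sequence `" END"`, and the helper names `format_pair` and `to_jsonl` are assumptions chosen for illustration, not values taken from this thread.

```python
import json

# Assumed values for illustration; any separator and stop sequence work
# as long as they are used consistently and do not appear in the text itself.
PROMPT_SEPARATOR = "\n\n###\n\n"
STOP_SEQUENCE = " END"


def format_pair(prompt, completion):
    """Return one training record with separator, leading space, and stop sequence."""
    return {
        "prompt": prompt.strip() + PROMPT_SEPARATOR,
        "completion": " " + completion.strip() + STOP_SEQUENCE,
    }


def to_jsonl(pairs):
    """Serialize (prompt, completion) tuples as JSONL, one JSON object per line."""
    return "\n".join(json.dumps(format_pair(p, c)) for p, c in pairs)


pairs = [
    ("Q: What is the name of the company?",
     "A: The name of the company is ABC Doors and Windows."),
    ("Q: Who is the owner of the company?",
     "A: James Brook is the owner of the company."),
]

print(to_jsonl(pairs))
```

At inference time the prompt would need to end with the same separator, and the same stop sequence would be passed to the completion request.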

If the answer is not satisfactory, you can try hyperparameter tuning: iterate over different hyperparameter values.
Also make sure you have provided at least 500 question-answer pairs in the dataset.
For the formatting, use:

openai tools fine_tunes.prepare_data -f <LOCAL_FILE>

It will automatically format your data.

Fine-tuning is computationally expensive and a headache. I always use semantic search over the company PDF. Add some instructions to that, including examples of how to answer. It will do more than fine-tuning. Just use embeddings.
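A rough sketch of that semantic-search idea: find the passage most similar to the question, then put it into the prompt along with instructions. To keep the example self-contained, `embed` here is a toy bag-of-words stand-in, not a real embedding model; in practice it would be replaced by calls to an embeddings API. The function names, the stopword list, and the prompt wording are all assumptions for illustration.

```python
import math
import re
from collections import Counter

# Small hypothetical stopword list so common words do not dominate the toy similarity.
STOPWORDS = {"a", "an", "the", "is", "of", "what", "who", "does", "in", "if", "are", "any", "and"}


def embed(text):
    """Toy stand-in for an embedding model: counts of non-stopword tokens."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_passage(question, passages):
    """Return the passage most similar to the question."""
    q = embed(question)
    return max(passages, key=lambda p: cosine(q, embed(p)))


def build_prompt(question, passages):
    """Combine instructions, the retrieved context, and the question into one prompt."""
    context = top_passage(question, passages)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context: {context}\n\n"
        f"Question: {question}\nAnswer:"
    )


PASSAGES = [
    "The name of the company is ABC Doors and Windows.",
    "James Brook is the owner of the company.",
    "The company offers a 15-day complete replacement policy if any manufacturing errors are detected.",
    "The company provides repair service within 24 hours upon request.",
]

print(build_prompt("What is the return policy?", PASSAGES))
```

With this approach the facts stay in the documents rather than in the model weights, so updating a policy only means updating the passages.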

The response to my prompt is repetitive. Is this because I am using only a few lines of JSONL?

Yes, it is because of a badly fine-tuned model.
You can provide more training data, and you can also iterate over different hyperparameter values to see which one is right for your model.

That problem is discussed here:

Thank you @sps. Now I have to add a stop sequence at the end of the completions. Can I edit my current JSONL file and update the same model, or do I have to create a new model with the updated file?
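On the stop sequence: once completions are trained to end with a marker, the same marker is passed as `stop` at inference time so generation ends there instead of repeating. Below is a small sketch under assumed values. The separator `"\n\n###\n\n"`, the stop sequence `" END"`, the model name, and both helper names are hypothetical. The dictionary keys mirror the legacy Completions API parameters (`model`, `prompt`, `stop`, `max_tokens`, `temperature`); no request is actually sent.

```python
# Assumed markers; they must match whatever was used in the training file.
SEPARATOR = "\n\n###\n\n"
STOP = " END"


def build_completion_request(model, question, max_tokens=64):
    """Build keyword arguments for a completion request (nothing is sent here)."""
    return {
        "model": model,
        "prompt": f"Q: {question}{SEPARATOR}",  # same separator as in training
        "stop": [STOP],                          # generation halts at this marker
        "max_tokens": max_tokens,
        "temperature": 0,
    }


def truncate_at_stop(text, stop=STOP):
    """Client-side fallback: cut the text at the first stop marker, if present."""
    idx = text.find(stop)
    return text if idx == -1 else text[:idx]


request = build_completion_request("my-fine-tuned-model", "What is the name of the company?")
print(request["prompt"])
print(truncate_at_stop(" A: ABC Doors and Windows. END A: ABC Doors and Windows. END"))
```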

You’ll have to properly format the training data according to the guidelines mentioned in the docs and train a base model.

However,

@sps I am using the langchain package for embeddings. Can you please tell me whether this is the right approach?

Langchain describes itself as a framework for developing applications powered by language models. I haven’t used it.

But, if it works for you, it’s great.