Can I do hyperparameter tuning (grid search, random search, Bayesian optimization, Tree-structured Parzen Estimators (TPE)) on the davinci model with my own dataset?
But there are also the learning_rate_multiplier and prompt_loss_weight hyperparameters that we can access, I think?
My dataset contains about 40k tokens in total. It is a chatbot dataset of around 150 to 160 question-answer pairs, but my fine-tuned model with default parameters is not giving responses like I want.
It is giving repetitive statements.
I mentioned learning_rate_multiplier but, yeah, there's also prompt_loss_weight. All the additional hyperparameter does, though, is increase the cost of any potential hyperparameter tuning you try to do.
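To make the cost point concrete, here is a rough sketch of how a grid grows when you add a hyperparameter. The candidate values are made up purely for illustration; each combination would be one separate, paid fine-tune run.

```python
from itertools import product

# Hypothetical candidate values, chosen only for illustration.
grid = {
    "n_epochs": [2, 4, 6],
    "learning_rate_multiplier": [0.05, 0.1, 0.2],
}


def count_runs(grid):
    """Number of fine-tune jobs a full grid search would launch."""
    return len(list(product(*grid.values())))


runs_two = count_runs(grid)

# Adding one more hyperparameter multiplies the number of jobs.
grid_with_plw = dict(grid, prompt_loss_weight=[0.01, 0.1, 0.2])
runs_three = count_runs(grid_with_plw)

print(runs_two, runs_three)
```

With three candidate values each, going from two hyperparameters to three triples the number of jobs you would pay for.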
But, beyond that, 150–160 examples is very low for a Q/A fine-tune. You need at least 3–4 times as many examples to get passable results.
You also need to ensure you’re putting a separator between the prompt and response in your training data and inserting that same separator into the prompt when you make requests to your model.
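A minimal sketch of what I mean, assuming a separator of " ->" and a newline as the end-of-completion marker. Both values and both helper names are placeholders I made up here, so swap in whatever your data actually uses.

```python
SEPARATOR = " ->"  # assumed separator; must match your training data
STOP = "\n"        # assumed end-of-completion marker


def make_training_example(question, answer):
    """Build one prompt/completion record for the training file."""
    return {
        "prompt": question.strip() + SEPARATOR,
        # Completions conventionally start with a space and end with the stop marker.
        "completion": " " + answer.strip() + STOP,
    }


def make_request_prompt(question):
    """Build the prompt sent at inference time, with the same separator."""
    return question.strip() + SEPARATOR


example = make_training_example("What is X?", "Y.")
print(example)
print(repr(make_request_prompt("What is X?")))
```

The point is that the training prompt and the request prompt come out of the same formatting rule. Passing the same end marker as the `stop` sequence in your completion request is one common way to keep the model from running on after the answer.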
I would also try to ensure all your examples are of excellent quality and all of them perfectly follow an identical format to what you’re trying to achieve.
I would revisit each of these three points several times before I would ever dare to think about hyperparameters.
Get more data and make sure that data is of the highest possible quality and your results will improve greatly.
If you get 500–1000 examples, ensure they all have an identical format with separators between the prompt and response, and requests to your fine-tuned model that include that separator still spit out garbage, we can certainly have a conversation about hyperparameter tuning.
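If it helps, here is a rough sketch of a format check you could run over the training file before paying for a fine-tune. The function name, the " ->" separator, and the newline stop marker are my assumptions, not anything official, so adjust them to your own convention.

```python
import json


def find_format_problems(jsonl_text, separator=" ->", stop="\n"):
    """Return (line_number, message) pairs for records that break the format."""
    problems = []
    for lineno, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            problems.append((lineno, "not valid JSON"))
            continue
        if not isinstance(record, dict):
            problems.append((lineno, "record is not a JSON object"))
            continue
        prompt = record.get("prompt", "")
        completion = record.get("completion", "")
        if not prompt.endswith(separator):
            problems.append((lineno, "prompt does not end with separator"))
        if not completion.startswith(" "):
            problems.append((lineno, "completion does not start with a space"))
        if not completion.endswith(stop):
            problems.append((lineno, "completion does not end with stop marker"))
    return problems


# Two made-up records: the first follows the format, the second does not.
sample = "\n".join([
    json.dumps({"prompt": "Q1 ->", "completion": " A1\n"}),
    json.dumps({"prompt": "Q2", "completion": "A2"}),
])
for problem in find_format_problems(sample):
    print(problem)
```

An empty result only tells you the format is consistent. It says nothing about whether the answers themselves are good, so you still need to read through the examples.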
Doing it before then would be like removing the side mirrors from a school bus to reduce its drag.
OK, I will add more data.
I have the separator -> between the prompt and response, and also \n after each completion.
{"prompt": "What is the warranty coverage for a new Chevrolet Silverado truck? ->", "completion": " The warranty coverage for a new Chevrolet Silverado truck typically includes a limited warranty that lasts for a specific duration, such as 3 years or 36,000 miles, whichever comes first\n"}
Do you have any idea how I can restrict the answers to the automobile domain only?
If a user asks any question outside the automobile domain, my chatbot should respond with something like "I do not know about this".
I have achieved this with prompt engineering, but I want to know how I can restrict the domain if I am training the model.
My dataset contains automobile-domain user queries and also questions from other domains with the completion "I do not know".
openai api fine_tunes.create -t C:\Users\sarthak.srivastava\Desktop\data_prepared.jsonl -m davinci --n_epochs 6 --batch_size 2 --prompt_loss_weight 0.2
Is this the correct way to write all the hyperparameters in the CLI?
Do embeddings give the exact same answer from the knowledge base,
or will they add extra text to the response?
So it will give answers from the context only?