It’s still a work in progress, @ruby_coder.
Could you share the link where you tested the fine-tuning validation?
That would be helpful for me.
It runs on localhost on my desktop on the seacoast.
I have not yet pushed this code to the net, sorry. I am still adding more and more functionality and adding more params, etc. I need to move some functions like files and validation into different tasks and modules, etc. Plus, I have other tasks on my plate, so I work on this over AM coffee and PM dinner at my desk, haha.
Do you use regular expressions, @madhangopal500?
I validate with regular expressions, as you have surely guessed by now.
Hi guys,
I’m not getting a proper response when I use keywords from the given prompt.
{"prompt":"Item=handbag, Color=army_green, price=$99, size=S->","completion":" This stylish small army green handbag will add a unique touch to your look, without costing you a fortune.###"}
Scenario:
prompt: small size army green handbag
Expected Output: This stylish small army green handbag will add a unique touch to your look, without costing you a fortune.
Output from OpenAI: with a small black design on the front. The small black design is a small black and white small diamond pattern, and the small black design covers…
Please help me find a solution.
Thanks
Your training data needs to also be in words. It doesn’t like captions with values.
For example, it doesn’t correlate S with Small. It works much better if you convert your training data into sentences instead of a list of parameters or specifications.
@raymonddavey I created the training data as per the documentation. The documentation also supports captions with values.
This is from the documentation:
For inference, you should format your prompts in the same way as you did when creating the training dataset
Your training set uses parameters, but the prompt you send at inference time uses sentences or words.
Also quoting from the documentation:
Here it is important to convert the input data into a natural language, which will likely lead to superior performance. For example, the following format:
{"prompt":"Item=handbag, Color=army_green, price=$99, size=S->", "completion":" This stylish small green handbag will add a unique touch to your look, without costing you a fortune."}
Won’t work as well as:
{"prompt":"Item is a handbag. Co
They got rid of the equal signs and commas in the list format and wrote the prompts as sentences instead.
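As a rough illustration of that conversion, here is a minimal Python sketch that turns a `Key=value` list into short sentences. The function name `params_to_sentences`, the `SIZE_WORDS` mapping, and the "Key is value." wording are all assumptions made for this example; the documentation's own sentence wording is cut off above, so this is not a copy of it.

```python
import re

# Hypothetical expansion of size codes; the thread only mentions S meaning small.
SIZE_WORDS = {"S": "small", "M": "medium", "L": "large"}


def params_to_sentences(prompt):
    """Turn 'Key=value, Key=value' into short sentences like 'Key is value.'"""
    sentences = []
    # Each pair is a word, an equals sign, then everything up to the next comma.
    for key, value in re.findall(r"(\w+)\s*=\s*([^,]+)", prompt):
        value = value.strip().replace("_", " ")
        if key.lower() == "size":
            value = SIZE_WORDS.get(value, value)
        sentences.append(f"{key.capitalize()} is {value}.")
    return " ".join(sentences)


print(params_to_sentences("Item=handbag, Color=army_green, price=$99, size=S"))
# prints: Item is handbag. Color is army green. Price is $99. Size is small.
```

Pass the parameter list without the trailing `->` separator and append the separator afterwards, so that training prompts and inference prompts end the same way.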
This documentation you have posted is not correct according to OpenAI Fine-Tuning Data Formatting documentation.
Please follow the guidelines I have posted above and below, again.
The last part was directly copied from this link and anchor
The first part of my reply was copied directly from the 4th bullet point under this link
That documentation is in conflict with OpenAI’s stated directions which I have posted so many times.
Posting incorrect docs in conflict with these guidelines is not always in the best interest of beginning users with problems, in my view. OpenAI needs to fix their docs so they are consistent, or users will keep being frustrated their fine-tunes do not work as they expect.
Do you not agree with these OpenAI data formatting requirements below:
![]()
I have posted countless times on the need to validate JSONL data against both the JSONL format requirements and the OpenAI Data Formatting requirements, and I have written a validator for both, which works for all my fine-tunings without a problem.
In my view, we need to encourage users to validate their JSONL data against the OpenAI Data Formatting guidelines as the first step when troubleshooting fine-tuning issues.
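For readers who want a starting point, here is a minimal sketch of that kind of check in Python. It is not the validator described in this thread, which has not been published. It assumes the separator `->` and the stop sequence `###` from the handbag example, and it checks only three things: the line parses as a JSON object with string `prompt` and `completion` fields, the prompt ends with the separator, and the completion starts with whitespace and ends with the stop sequence.

```python
import json
import re

# Assumptions for illustration: these are the separator and stop sequence
# used in the handbag example in this thread, not fixed OpenAI values.
PROMPT_SEPARATOR = "->"
STOP_SEQUENCE = "###"

PROMPT_RE = re.compile(r".+" + re.escape(PROMPT_SEPARATOR) + r"\Z", re.DOTALL)
COMPLETION_RE = re.compile(r"\s.+" + re.escape(STOP_SEQUENCE) + r"\Z", re.DOTALL)


def validate_jsonl_line(line):
    """Return a list of problems found in one JSONL training line."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(record, dict):
        return ["line is not a JSON object"]
    problems = []
    for key in ("prompt", "completion"):
        if not isinstance(record.get(key), str):
            problems.append(f"missing or non-string '{key}'")
    if problems:
        return problems
    if not PROMPT_RE.match(record["prompt"]):
        problems.append("prompt does not end with the separator")
    if not COMPLETION_RE.match(record["completion"]):
        problems.append("completion must start with whitespace and end with the stop sequence")
    return problems
```

Run it over every line of the training file before uploading; an empty list means the line passed these particular checks, nothing more.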
OK, I am going to test this for you. The cake is baking now as a single-line JSONL test and when it bakes, I’ll test your prompt for you and post back.
The cake is baked (much faster than a few days ago!) and here is how I set up the fine-tuning:
Isn’t this the completion you wanted, @madhangopal500?
Hi @ruby_coder ,
If we give the exact prompt, we get the expected result. In my case, though, I’m going to give keywords from that prompt, like small size army green handbag. The expected output is:
This stylish small army green handbag will add a unique touch to your look, without costing you a fortune.
But we are getting random results from OpenAI.
Yes, that is because these two text strings:
Item=handbag, Color=army_green, price=$99, size=S
small size army green handbag
Are too dissimilar (you can test by taking the dot product between two embedding vectors of both strings) and it may be difficult to get the correct model fitting when you fine-tune, unless you add “string 2” to your fine-tuning JSONL file as a prompt.
However, if you shift to using embeddings, you may have better luck.
Before I drop off, I will run this again with only 12 n_epochs, but I seriously doubt it will provide a suitable model fitting for both strings with a single-line JSONL entry.
In addition, your string lengths are a bit short, so this makes the vector math even more challenging; it’s doable, but requires work on your part.
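To make the dot-product test concrete, here is a small Python sketch of the comparison. The two vectors are made-up three-dimensional stand-ins, not real embeddings of the two strings; in practice each vector would come from an embeddings model and have many more dimensions. When both vectors have unit length, the dot product and the cosine similarity are the same number.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product divided by the product of the two vector lengths."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot(a, b) / (norm_a * norm_b)


# Toy vectors standing in for the embeddings of the two strings (made-up values).
params_vec = [0.6, 0.8, 0.0]    # "Item=handbag, Color=army_green, price=$99, size=S"
keywords_vec = [0.0, 0.6, 0.8]  # "small size army green handbag"

print(round(cosine_similarity(params_vec, keywords_vec), 2))
```

A score close to 1 means the two strings sit near each other in embedding space; a low score is a hint that the model will not treat one as a stand-in for the other.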
HTH
You need a stop parameter which should be the same stop you used when you fine tuned, BTW.
You don’t need to send the top_p param because OpenAI says to use only one or the other (temperature or top_p), so don’t send both.
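Here is a minimal Python sketch of how those two points could be enforced before a request goes out. `build_completion_params` is a hypothetical helper, not part of any OpenAI library, and the model name in the usage line is a placeholder. The keys it emits (`model`, `prompt`, `stop`, `temperature`, `top_p`, `max_tokens`) are the usual completion request parameters. Note that the helper is stricter than the API, which only recommends against setting both sampling parameters.

```python
def build_completion_params(model, prompt, stop, temperature=None, top_p=None, max_tokens=64):
    """Assemble keyword arguments for a completion request (hypothetical helper)."""
    # Stricter than the API itself: refuse to send both sampling parameters.
    if temperature is not None and top_p is not None:
        raise ValueError("set temperature or top_p, not both")
    params = {
        "model": model,
        "prompt": prompt,
        "stop": [stop],  # same stop sequence that ends the training completions
        "max_tokens": max_tokens,
    }
    if temperature is not None:
        params["temperature"] = temperature
    if top_p is not None:
        params["top_p"] = top_p
    return params


params = build_completion_params(
    "my-fine-tuned-model",  # placeholder model name
    "Item=handbag, Color=army_green, price=$99, size=S->",
    stop="###",
    temperature=0,
)
```

The resulting dictionary can then be passed as keyword arguments to whichever client call you use for completions.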