I have fine-tuned an OpenAI GPT-3 model for my keyword extraction use case.
However, when I call it, it just completes the prompt rather than extracting the entities.
I built this by following their documentation - OpenAI API
I trained the model on a dataset of around 50 rows of large texts. Here is the fine-tuning code:
import os
os.environ['OPENAI_API_KEY'] = "api-key"
! openai api fine_tunes.create -t /content/entity_prepared.jsonl -m ada
The data I used to train it is in this JSONL file - entity_prepared.jsonl - Google Drive
Is there any solution for this?
I’m on mobile, so maybe I’m not seeing the whole file, but it looks like you have only 2 or 3 samples. The recommended minimum is about 200. Also, you’re using the default ### demarcation token, which in my experience is pretty useless. You ought to change the demarcation to at least a one-word instruction on what to do, but ideally something longer, so the model learns where the input ends and the output begins. So instead you should use something like EXTRACT NAMED ENTITIES.
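To make the separator idea concrete, here is a minimal sketch of how the training records could be built. The exact SEPARATOR string, the STOP marker, and the helper names (build_prompt, make_record, to_jsonl) are my own assumptions for illustration, not anything from the original file or from the OpenAI tooling.

```python
import json

# Assumed separator, based on the suggestion above; any fixed string works
# as long as it never appears inside the input text itself.
SEPARATOR = "\n\nEXTRACT NAMED ENTITIES:\n\n"
# Assumed stop marker appended to every completion (hypothetical choice).
STOP = " END"


def build_prompt(text: str) -> str:
    """Append the separator so the model can learn where the input ends."""
    return text.strip() + SEPARATOR


def make_record(text: str, entities: list[str]) -> dict:
    """Build one prompt/completion training pair."""
    return {
        "prompt": build_prompt(text),
        # Leading space before the completion, stop marker at the end.
        "completion": " " + ", ".join(entities) + STOP,
    }


def to_jsonl(examples: list[tuple[str, list[str]]]) -> str:
    """Serialize (text, entities) pairs as one JSON object per line."""
    return "\n".join(json.dumps(make_record(t, e)) for t, e in examples)
```

At inference time the prompt would presumably need to be built the same way (build_prompt on the new text) and the same STOP string passed as the stop sequence, otherwise the model has no cue to switch from continuing the text to listing entities.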
He’s got 50 examples, but even so… even with temperature set to 0 (t=0) I wasn’t able to get it to reliably return “facts” the way you might think it would. That’s because it’s still a best-guess system, not a lookup.
Example: I added records to my fine-tune that looked similar but had a “key string” attached to each record. When I asked for a record, I’d get it back mostly the same, but the key would be wrong… or values in the content would be replaced by the most common next thing, not the FT record itself.
That’s because the FT records are long gone after the fine-tune indexing is done, and all that’s left is the most likely next word based on your content. And with 50 records it would not scale even if it did work occasionally.
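A crude way to picture the “most likely next word” effect is a toy next-word counter. This is only an illustration under heavy assumptions: GPT-3 is nothing like a bigram table, and train_bigrams and greedy_next are hypothetical names I made up for the sketch.

```python
from collections import Counter, defaultdict


def train_bigrams(records):
    """Count, for every word, which words follow it across all records."""
    counts = defaultdict(Counter)
    for record in records:
        tokens = record.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def greedy_next(counts, word):
    """Return the most frequent follower of `word`, like decoding at t=0."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]
```

If three similar records end in "key K1", "key K2" and "key K2", greedy_next(counts, "key") gives "K2" no matter which record you had in mind. The individual records are not stored anywhere, only the aggregated counts are.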
If you need exact returns, you are best off with a database for now.
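For the database route, a minimal sketch using Python’s built-in sqlite3 module could look like this. The records table, its columns, and the helper names are assumptions for illustration only; any key-value store would do the same job.

```python
import sqlite3
from typing import Optional


def open_store() -> sqlite3.Connection:
    """Create an in-memory database with a simple key -> content table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE records (key TEXT PRIMARY KEY, content TEXT NOT NULL)"
    )
    return conn


def put(conn: sqlite3.Connection, key: str, content: str) -> None:
    """Insert a record, overwriting any existing record with the same key."""
    conn.execute(
        "INSERT OR REPLACE INTO records (key, content) VALUES (?, ?)",
        (key, content),
    )
    conn.commit()


def get(conn: sqlite3.Connection, key: str) -> Optional[str]:
    """Return the stored content for `key`, or None if it is not present."""
    row = conn.execute(
        "SELECT content FROM records WHERE key = ?", (key,)
    ).fetchone()
    return row[0] if row else None
```

Unlike the fine-tuned model, get() either returns the stored content unchanged or returns None. It never substitutes a similar-looking record.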