I’m trying to extract text from documents. Most of the time OpenAI gives back great results, but sometimes it breaks when the document format changes.
For example, I’m extracting data from farm auction results, so each document has headers like Cows, Steers, Sheep, etc., and I need each header plus the paragraph that follows it.
But in some documents the headers are split by pricing unit, so they read Price per head Cows, Price per KG Sheep, etc.
So would the best approach be fine-tuning, with something along the lines of
`{"messages": [
{"role": "system", "content": "You are a helpful bot and will extract data as requested."},
{"role": "user", "content": "Find me all the Auctions types"},
{"role": "assistant", "content": "Named Price per head Cows"},
{"role": "user", "content": "Find me all the Auctions types"},
{"role": "assistant", "content": "Named Price per kg Cows"},
{"role": "user", "content": "Find me all the Auctions types"},
{"role": "assistant", "content": "Named Cows"}
]}`
Or is there a better option, like formatting the text before I call the API (e.g. adding bold tags) and asking OpenAI to find those bold tags?
Or is there a totally different option I haven’t thought of?
(BTW, I know my fine-tuning example is terrible, but it’s just a sample!)