I currently have a research paper, https://cancerpreventionresearch.aacrjournals.org/content/3/1/108, which contains the abstract, study methods, experiments, and conclusions about pomegranates and their cancer-fighting effects.
But it's a long paper and will not fit in a prompt, so I believed the best way to tackle this was fine-tuning. However, when I test the fine-tuned model and ask it a question, it answers with content from outside the research paper.
How do I train the model, or what is the correct prompt to use, so that when I ask a question it only gives answers that are within the research paper?
ex: What is the purpose of this study?
and it will answer:
The purpose of this study was to investigate the antiaromatase activity and inhibition of testosterone-induced breast cancer cell proliferation by ET-derived compounds isolated from pomegranates.
Thanks in advance to anyone who answers.
The INSTRUCT series works really well. I generally follow the format:
Read the following article and answer questions
- Blah blah blah?
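A minimal sketch of how that prompt format could be assembled in Python. The function name `build_prompt` and the extra "use only information stated in the article" sentence are my own assumptions, not part of the format above; sending the string to a model is left out.

```python
def build_prompt(article: str, question: str) -> str:
    """Assemble an instruction-style prompt: instruction, article text, then the question."""
    instruction = (
        "Read the following article and answer questions. "
        # Assumption: this extra sentence is added to discourage outside knowledge.
        "Use only information stated in the article."
    )
    return f"{instruction}\n\n{article.strip()}\n\n- {question.strip()}\n"


if __name__ == "__main__":
    # Placeholder article text; paste the relevant part of the paper here.
    print(build_prompt("<article text goes here>", "What is the purpose of this study?"))
```

This only works when the article (or the relevant part of it) fits within the token limit.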
First, thanks daveshapautomator for checking this question out, I really appreciate it.
As I understand it, the INSTRUCT series would answer the question based on the overall “Mind” / “Wikipedia” information that OpenAI has, and that's a large scope of info.
I was hoping to create a fine-tuned model so that the AI would only find its answer within the uploaded research paper ( https://cancerpreventionresearch.aacrjournals.org/content/3/1/108 )
and not go too far beyond it.
There’s also an /answers endpoint, which can be used when you can’t fit the entire article within the token limit.
This works as a two-step process (handled on our end, within the /answers endpoint):
- find the most relevant part of the article.
- answer the question based on that part of the article, as Dave pointed out above.
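To make the two steps concrete, here is a rough local sketch in Python. It is not how the /answers endpoint is implemented: the word-overlap scoring, the chunk size, and all function names are assumptions chosen for illustration, and the actual model call is omitted.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def split_into_chunks(article, max_words=150):
    """Split the article into consecutive chunks of at most max_words words."""
    words = article.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def most_relevant_chunk(chunks, question):
    """Step 1: pick the chunk sharing the most words with the question (a crude stand-in for search)."""
    question_words = tokenize(question)
    return max(chunks, key=lambda chunk: len(question_words & tokenize(chunk)))


def build_answer_prompt(chunk, question):
    """Step 2: build a prompt that asks the question against the selected chunk only."""
    return f"Read the following article and answer questions\n\n{chunk}\n\n- {question}\n"
```

Usage would look like `build_answer_prompt(most_relevant_chunk(split_into_chunks(paper_text), question), question)`, where `paper_text` is the full text of the paper. Plain word overlap is a weak relevance measure, so treat this as a way to see the shape of the process rather than a replacement for the endpoint's search.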
Fine-tuning on a single paper isn’t meant to work. For fine-tuning you usually need at least ~200 examples; one paper is too little.
Ohh there we go! Cool cool! OpenAI API. Thank you very much to you guys ( boris and daveshapautomator ). I guess I should play around with this.
Quick question, Boris: if I add the whole abstract on a single JSON line
and ask a question, “What is the purpose of this study?”,
would it attempt to answer with a sentence, or would it return the whole abstract?
I guess that's just something I have to test.
You can give a few example contexts, questions, and answers in the /answers endpoint, which helps the model know what format you expect for answers.
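As a sketch of what such a request body might look like, the Python below builds the payload as a plain dictionary without sending it. The field names (`examples_context`, `examples`, `documents`, `search_model`, and so on) are as I recall them from the /answers documentation, so treat them as assumptions and check the API reference; the example context and answer are purely illustrative, and the document string is a placeholder.

```python
import json

# Assumed request body for the /answers endpoint; nothing is sent over the network here.
payload = {
    "model": "curie",          # model that writes the answer (assumed name)
    "search_model": "ada",     # model that ranks the documents (assumed name)
    "question": "What is the purpose of this study?",
    # One string per passage of the paper, e.g. the abstract as a single entry.
    "documents": ["<abstract text goes here>"],
    # A short example showing the answer style you expect: one context,
    # then [question, answer] pairs about that context.
    "examples_context": "In 2017, U.S. life expectancy was 78.6 years.",
    "examples": [["What is human life expectancy in the United States?", "78 years."]],
    "max_tokens": 60,
    "stop": ["\n"],
}

if __name__ == "__main__":
    print(json.dumps(payload, indent=2))
```

Giving a one-sentence example answer like this is what nudges the model toward answering in a sentence instead of echoing the whole abstract.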