I am new here. I am trying to use our repository of papers to train the bot to answer more topic-specific questions. I saw that PDFs cannot be loaded directly and that the data has to be supplied as JSON instead. However, before I start this journey, which will certainly take weeks to optimize, I want to be sure that this is the right way to go.

I want to use the repository of peer-reviewed papers to fine-tune the IPA bot so that it gives more insightful answers about how the proteins named in a user's question interact with other proteins. Is that even possible?

@vladimir.a.gimenez.r

It looks like embeddings will be a much better approach than fine-tuning for your use case.

A lot of projects have also been launched lately that enable question answering over documents you provide.
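
To make the embeddings idea a bit more concrete, here is a minimal sketch of the retrieval step: embed each passage from your papers, embed the question, pick the closest passages, and paste them into the prompt. Everything in it is an assumption for illustration. `embed` is a toy word-count vector standing in for a real embedding model, the passages are made-up placeholders rather than real findings, and the names `top_k` and `build_prompt` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(question: str, passages: list[str], k: int = 2) -> list[str]:
    """Return the k passages most similar to the question."""
    q = embed(question)
    return sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)[:k]


def build_prompt(question: str, passages: list[str], k: int = 2) -> str:
    """Assemble a prompt that asks the model to answer from retrieved context."""
    context = "\n".join(f"- {p}" for p in top_k(question, passages, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Placeholder passages (not real data); in practice these would be text
# chunks extracted from your papers.
PASSAGES = [
    "PROT1 binds PROT2 in the placeholder assay.",
    "PROT3 is degraded in the placeholder assay.",
    "The placeholder cohort had forty samples.",
]

if __name__ == "__main__":
    print(build_prompt("How does PROT1 interact with PROT2?", PASSAGES, k=1))
```

With a real embedding model in place of `embed`, the rest of the flow stays the same, and no fine-tuning is needed: the model simply answers from whichever passages were retrieved for that question.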

