Extracting insights from multiple documents

Hi everyone,

I’ve been exploring online tools that use OpenAI embeddings. My exploration focuses on extracting insights like key takeaways from multiple documents.

In one instance, I used http://filechat.io/ and uploaded two closely related PDF files that discuss the same topic. The quality of the results was not satisfactory.

I’m unsure where the problem lies:

  • Is this a known issue with OpenAI embeddings?
  • Is it a possible tool limitation?
  • Is it the quality of my prompt?

It may be that your question required the AI to read large parts of the documents to give a comprehensive answer.

One way around this is to break the documents down into small blocks (not too small) and, when you ask a question, feed the AI several relevant blocks at once as context for its answer.
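To make the idea concrete, here is a rough Python sketch of splitting a document into overlapping blocks and picking the most relevant ones for a question. Everything in it is an assumption for illustration: the function names, the block sizes, and especially `toy_vector`, which uses plain word counts as a stand-in for real embeddings so the example runs without any API. It is not how filechat.io or any particular tool works.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_words=200, overlap_words=40):
    """Split text into word-based blocks that overlap slightly."""
    if overlap_words >= chunk_words:
        raise ValueError("overlap_words must be smaller than chunk_words")
    words = text.split()
    step = chunk_words - overlap_words
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_words]))
        if start + chunk_words >= len(words):
            break  # the last block already reaches the end of the text
    return chunks


def toy_vector(text):
    # Assumption: word counts stand in for an embedding vector.
    # A real setup would call an embedding model here instead.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_chunks(question, chunks, k=3):
    """Return the k blocks most similar to the question, best first."""
    q = toy_vector(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, toy_vector(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, contexts):
    """Put several relevant blocks into one prompt."""
    joined = "\n\n---\n\n".join(contexts)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\nQuestion: {question}"
    )
```

Typical use would be `build_prompt(question, top_chunks(question, split_into_chunks(document_text)))`, with the resulting prompt sent to whichever model you use. The overlap is there so that a sentence cut at a block boundary still appears whole in one of the blocks.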

If this doesn’t work well and you want the AI to consider more of the document, you can use prompt chaining, where you feed the output of one request into the next request while also introducing new context. This is more expensive to run, but it gives amazing results (we are doing it with academic papers).
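A minimal sketch of that chaining loop in Python, under stated assumptions: `ask_model` is a hypothetical callable that takes a prompt string and returns the model's reply (in practice it would wrap whatever API you call), the prompt wording is only an example, and `make_recording_model` is a dummy used to show the flow without a real model. This is not our actual pipeline, just the general shape of the technique.

```python
def chain_over_chunks(question, chunks, ask_model):
    """Feed blocks one at a time, carrying the previous output forward."""
    notes = ""
    for i, chunk in enumerate(chunks, start=1):
        prompt = (
            f"Question: {question}\n\n"
            f"Notes so far:\n{notes or '(none yet)'}\n\n"
            f"New context (part {i} of {len(chunks)}):\n{chunk}\n\n"
            "Update the notes so they answer the question "
            "using everything seen so far."
        )
        # The reply to this request becomes the input to the next one.
        notes = ask_model(prompt)
    return notes


def make_recording_model():
    """Hypothetical stand-in for a real model call; records each prompt."""
    prompts = []

    def ask_model(prompt):
        prompts.append(prompt)
        return f"notes after step {len(prompts)}"

    return ask_model, prompts
```

Each block costs one request, which is where the extra expense comes from: a document split into ten blocks means ten calls instead of one.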

I have also sent you a personal chat message with more info.


Thank you for the detailed analysis. I’m not sure if it applies to my case though. The documents I’m testing are tiny (5 pages each).

Hi, can you please send it to me as well? It would be really helpful, as I am processing financial documents.