Extracting insights from multiple documents

Tokail · February 8, 2023, 12:19pm

Hi everyone;

I’ve been exploring online tools that use OpenAI embeddings. My exploration focuses on extracting insights like key takeaways from multiple documents.

In one instance, I used http://filechat.io/ and uploaded 2 closely related PDF files that discuss the same topic. The quality of the results was not satisfactory.

I’m unsure where the problem lies:

Is this a known issue with OpenAI embeddings?
Is it a possible tool limitation?
Is it the quality of my prompt?

raymonddavey · February 8, 2023, 7:16pm

It may be that your answer required the AI to consume large parts of the document to give a comprehensive answer.

One way around this is to break the documents down into small blocks (not too small) and, when you ask a question, feed the AI several relevant blocks at once as context for the answer.
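To make the block idea concrete, here is a rough sketch of splitting a document into overlapping blocks and picking the most relevant ones for a question. The function names, the block size and the overlap are illustrative assumptions, and `toy_embed` is only a word-count stand-in so the example runs offline; in practice you would pass in a function that calls a real embedding model.

```python
import math
from collections import Counter


def chunk_words(text, size=200, overlap=40):
    """Split text into overlapping blocks of roughly `size` words."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    # Stop once the remaining words are already covered by the previous block.
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, last_start, step)]


def toy_embed(text):
    """Stand-in for a real embedding: a bag-of-words count vector (assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_chunks(question, chunks, k=3, embed=toy_embed):
    """Return the k blocks most similar to the question."""
    q_vec = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q_vec, embed(c)), reverse=True)
    return ranked[:k]
```

The selected blocks would then be pasted into a single prompt together with the question. The overlap is there so that a sentence cut at a block boundary still appears whole in at least one block.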

If this doesn’t work well and you want the AI to consider more of the document, you can use prompt chaining (where you use the output from one request to feed the next request, while also introducing new contexts). This is more expensive to run, but gives amazing results (we are doing it with academic papers).
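A minimal sketch of what prompt chaining could look like, assuming `llm` is any function that takes a prompt string and returns the model's text (a hypothetical wrapper around whichever completion API you use). The prompt wording is only an illustration, not the exact setup described above.

```python
def chain_over_chunks(question, chunks, llm):
    """Carry running notes through the blocks, one request per block."""
    notes = ""
    for chunk in chunks:
        prompt = (
            f"Question: {question}\n"
            f"Notes so far: {notes or '(none)'}\n"
            f"New context: {chunk}\n"
            "Update the notes so they answer the question using the new context."
        )
        # The output of this request becomes the input of the next one.
        notes = llm(prompt)
    return notes
```

Each block costs one request, which is why this is more expensive than a single prompt with a few retrieved blocks.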

I have also sent you a personal chat message with more info

Tokail · February 9, 2023, 1:47am

Raymond;

Thank you for the detailed analysis. I’m not sure if it applies to my case though. The documents I’m testing are tiny (5 pages each).

pmshadow · March 14, 2023, 5:20am

Hi, can you please send it to me as well? It would be really helpful, as I am processing financial documents.
