Question answering using embeddings-based search

I am using the code in `openai-cookbook/Question_answering_using_embeddings.ipynb` (in the openai/openai-cookbook repo on GitHub) as a reference.

My use case is simple: I want to read N documents and run Q&A over them. I know gpt-3.5-turbo has a maximum context length of 4,096 tokens (~5 pages).

I have a few questions:

  1. I see in output [7] that the text was split into separate rows. Was that done for accuracy? In other words, does splitting the text into smaller pieces (e.g. sentences) make the answers more accurate, or can I embed text up to 4,096 tokens at once?

  2. How do I handle the case where I have more data than the maximum context length? Any ideas?

  3. I have noticed that the questions I ask can't require much inference. For example, if I have a text about a painter who painted a painting of blue flowers and I ask, "Suggest a name for the painting", I get no answer. What can I do to improve this?
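For question 2, my current idea is to split each document into chunks, embed every chunk separately, and at query time put only the most relevant chunks into the prompt. A minimal sketch of the chunking step (the function name is my own, and word counts here are only a rough proxy for tokens — roughly 0.75 words per token — OpenAI's tiktoken library would give exact counts):

```python
# Hypothetical sketch, not from the cookbook: split a long document
# into word-based chunks before embedding each one.
def chunk_text(text: str, max_words: int = 300) -> list[str]:
    words = text.split()
    # Slice the word list into consecutive windows of max_words words.
    return [
        " ".join(words[i : i + max_words])
        for i in range(0, len(words), max_words)
    ]
```

Each chunk would then get its own embedding, and at query time I'd rank the chunks by cosine similarity against the question's embedding and stuff only the top-ranked ones into the prompt. Is that the intended approach?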

Thank you!
