I’m currently weighing which approach would yield better results: embeddings or list prompting.
Let’s consider a scenario with 50 different documents. With embeddings, I can log user inputs and use a vector store to route each input, determining which document or documents the AI needs to read before providing an answer. This approach will work well if I can manage the vector store effectively.
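Here’s a rough sketch of what I mean by the embedding route (a minimal example assuming the OpenAI Python client v1+ and numpy; the model name and the `embed`/`route` helpers are just illustrative, not a settled design):

```python
import numpy as np
from openai import OpenAI  # assumes the openai package, v1+

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed(texts: list[str]) -> np.ndarray:
    """Embed a batch of strings; returns an (n, d) array of vectors."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([item.embedding for item in resp.data])

# One-time step: embed all 50 documents and keep the vectors around.
documents = ["...document 1 text...", "...document 2 text..."]  # placeholder docs
doc_vectors = embed(documents)

def route(user_input: str, k: int = 3) -> list[int]:
    """Return the indices of the k documents most similar to the input."""
    q = embed([user_input])[0]
    # Cosine similarity: normalize both sides, then take dot products.
    doc_norm = doc_vectors / np.linalg.norm(doc_vectors, axis=1, keepdims=True)
    sims = doc_norm @ (q / np.linalg.norm(q))
    return list(np.argsort(sims)[::-1][:k])  # top-k, most similar first
```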
On the other hand, without embeddings, using only prompts to GPT-4, I can provide titles or short descriptions that help the model grasp the user’s inquiry and select the correct document for a more accurate response.
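And a sketch of the list-prompting version (again illustrative only; the titles and descriptions are placeholders, and a second call would still be needed to actually answer from the chosen document):

```python
from openai import OpenAI  # assumes the openai package, v1+

client = OpenAI()

# Placeholder catalog: one (title, short description) pair per document.
catalog = [
    ("Refund policy", "How refunds and returns are handled"),
    ("Shipping times", "Delivery estimates by region"),
    # ... entries for all 50 documents
]

def pick_document(question: str) -> str:
    """Ask GPT-4 to choose the best-matching document by number."""
    listing = "\n".join(
        f"{i}. {title}: {desc}" for i, (title, desc) in enumerate(catalog)
    )
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {
                "role": "system",
                "content": "Given the numbered list of documents, reply with "
                           "only the number of the one best suited to answer "
                           "the user's question.",
            },
            {"role": "user", "content": f"Documents:\n{listing}\n\nQuestion: {question}"},
        ],
    )
    return resp.choices[0].message.content.strip()
```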
Currently, I am trying to determine which process will work better in terms of efficiency (less work, better results, lower cost, difficulty, accuracy, etc.). Can anyone clearly indicate which approach they would recommend, and why?
If I’m understanding you correctly (please let me know if I’ve misunderstood), the answer is simple:
better results: Packing the prompt
lower cost: Almost certainly embedding (unless very long docs and very few queries)
difficulty: Embedding would definitely be easier; it doesn’t require you to semantically determine which content to include, as prompt-packing does.
accuracy: Same as “better results” in this context.
I should clarify that my understanding is that you’re comparing (a) embedding vs (b) packing prompts with content, assuming the original documents are relatively short.
Big picture, I can’t imagine a scenario where embedding wouldn’t be a lot easier and less expensive. In addition, because the documents must be relatively short (based on the point about adding them into prompts), the accuracy delta probably wouldn’t be consequential.
The biggest difference, I suspect, is that “list prompting” wouldn’t scale well (and let’s not get into LLM performance degrading with longer prompts…).
The embedding approach scales extremely well because increasing your dataset does not, all other things being equal, increase the number of tokens being handled per query.
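As a rough illustration (with made-up numbers: 50 documents averaging ~500 tokens each), packing everything into the prompt costs roughly 50 × 500 = 25,000 tokens per query and grows linearly with every document you add, while retrieving, say, the top 3 matches costs roughly 3 × 500 = 1,500 tokens per query whether you index 50 documents or 5,000.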