Embedding data with prompting

How do prompts influence embeddings when processing text?

For instance, if we take a piece of text and prepend a prompt like “highlight key symptoms in the following data,” will the embeddings place more emphasis on the symptoms?

Can prompts be used to “focus on and encode” only specific aspects of the input data?

In Retrieval-Augmented Generation (RAG), embeddings are typically created without any prompt. However, some of the top-performing models on the MTEB leaderboard on Hugging Face do include prompts (task instructions) at embedding time.
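For concreteness, here is roughly what I mean. A minimal sketch, assuming sentence-transformers and intfloat/e5-large-v2, whose model card recommends prefixing every input with "query: " or "passage: " (other instruction-tuned models use different templates):

```python
from sentence_transformers import SentenceTransformer, util

# Assumption: intfloat/e5-large-v2, which expects "query: " / "passage: "
# prefixes on every input at embedding time.
model = SentenceTransformer("intfloat/e5-large-v2")

passages = [
    "passage: Patient reports fever, dry cough, and fatigue for three days.",
    "passage: The clinic updated its billing system last month.",
]
query = "query: what symptoms does the patient have?"

doc_emb = model.encode(passages, normalize_embeddings=True)
query_emb = model.encode(query, normalize_embeddings=True)

# Cosine similarity between the prefixed query and each prefixed passage
print(util.cos_sim(query_emb, doc_emb))
```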

Would incorporating prompts during the embedding process for a knowledge base enhance its effectiveness?

If you’re aware of any related papers, experiences, or suggestions, I’d greatly appreciate your input.

Thank you!


It’s theoretically possible if the model actually takes the prompt into account. Most embedding models don’t accept prompts, but for those that do, I’d expect the prompt to improve results.

Intuitively, a prompt focuses the model’s “next word prediction,” so it can likewise focus the embedding on the aspects you care about instead of relying on undirected semantics.
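One quick way to sanity-check this yourself is to embed the same passage with and without an instruction prefix and compare its similarity to a symptom-focused query. A rough sketch, assuming a generic sentence-transformers model (all-MiniLM-L6-v2) that was not trained with instructions, so any shift here comes only from the prepended tokens rather than true instruction-following:

```python
from sentence_transformers import SentenceTransformer, util

# Assumption: a generic embedding model with no instruction training.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

text = "Patient reports fever, dry cough, and fatigue; invoice sent to insurer."
instruction = "highlight key symptoms in the following data: "

plain = model.encode(text, normalize_embeddings=True)
prompted = model.encode(instruction + text, normalize_embeddings=True)

query = model.encode("fever and cough symptoms", normalize_embeddings=True)

print("plain    vs query:", util.cos_sim(plain, query).item())
print("prompted vs query:", util.cos_sim(prompted, query).item())
```

With a model that was actually trained to follow instructions, you’d expect the shift to be larger and more consistent than with this generic one.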
