Effect of Prompting in Embeddings and Retention of Data

classictablet333 · November 18, 2024, 5:19am

Hello,

When embedding a piece of text, what effect do prompts have?
1. If we have a piece of medical text and prepend a prompt such as “pay close attention to the diagnosis in the following data”, will the diagnosis information become more prominent in the resulting vectors?

2. Can we prompt the model to “remember and encode” only selected parts of the input data?

In RAG, we usually embed data without any prompt, but some top-performing MTEB models on Hugging Face add prompts before embedding.

Would prompting while embedding the knowledge base data help?

Any paper on this or any experience/suggestion would be fantastic.

Thank you in advance!

classictablet333 · November 19, 2024, 2:01am

Sorry for tagging you, just wanted to see if you have ideas.
@anon10827405 @Diet

Diet · November 20, 2024, 3:10am

Yes, but it depends on the model. I wouldn’t try it with OpenAI’s text embedding models, and you can absolutely forget about ada.

A model that has been trained to follow the context - query pattern might indeed be well suited to what you’re trying to do.

That said, at this time I wouldn’t try to mix different embedding queries in the corpus. You’d likely be better off keeping different corpus queries in different embedding indices.
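
To make the "separate indices" idea concrete, here is a rough sketch of one way to read it: each prompt gets its own index, and a query is embedded with the same prompt as the index it searches. Everything in it is an assumption for illustration. `toy_embed` is a hashed bag-of-words stand-in and not a real embedding model, and `PromptedIndexes` and the prompt strings are made-up names. With a real instruction-tuned model you would swap in its encode call, and it may expect a different prompt for queries than for documents.

```python
import hashlib
import math
from collections import defaultdict
from typing import Dict, List

DIM = 64  # arbitrary size for the toy vectors


def toy_embed(text: str) -> List[float]:
    """Stand-in for a real embedding model: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        bucket = int(hashlib.md5(token.encode()).hexdigest(), 16) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


class PromptedIndexes:
    """Keeps one in-memory index per prompt so prompted vectors never mix."""

    def __init__(self, prompts: Dict[str, str]):
        self.prompts = dict(prompts)      # index name -> prompt prefix
        self.indexes = defaultdict(list)  # index name -> [(doc, vector)]

    def add(self, index_name: str, doc: str) -> None:
        # Embed the document with the prompt that belongs to this index.
        prefix = self.prompts[index_name]
        self.indexes[index_name].append((doc, toy_embed(prefix + doc)))

    def search(self, index_name: str, query: str, k: int = 1) -> List[str]:
        # Embed the query with the same prompt, then rank by cosine similarity
        # (a plain dot product, since all vectors are unit length).
        q = toy_embed(self.prompts[index_name] + query)
        scored = [
            (sum(a * b for a, b in zip(q, vec)), doc)
            for doc, vec in self.indexes[index_name]
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in scored[:k]]


store = PromptedIndexes({
    "diagnosis": "Pay close attention to the diagnosis: ",  # hypothetical prompt
    "general": "",                                           # no prompt at all
})
note = "Patient presents with fatigue. Diagnosis: type 2 diabetes."
store.add("diagnosis", note)
store.add("general", note)
print(store.search("diagnosis", "diabetes diagnosis", k=1))
```

The same note is stored twice, once per prompt, and a search only ever compares vectors that were produced with the same prefix. Whether the prompted index actually retrieves better is something you would have to measure on your own data.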
