Ah, by “context” you actually meant a sort of promptability/instructability?
We investigated that a little bit in this thread as well: New OpenAI Announcement! Updated API Models and no more lazy outputs - #9 by Diet (in that case the instruction was rear-loaded instead of front-loaded).
However, if we think it through a little: what does a Bob in London have in common with a person called Bob who happens to live in Reno, versus a Mary who lives in Reno? This could just be the result of unfortunate prompting…
That said, it’s interesting how dramatically the model discriminates by gender, and how it considers DANA and RENO different from the others.
If we wanted to improve the signal as you proposed in your original post, would it make sense to rerun the experiment with something like this?
"Dana (Person) is from Reno (City)"
<=>
"We're focusing on the person's name in the statement: Dana (Person) is from Reno (City)"
<=>
"We're focusing on the city in the statement: Dana (Person) is from Reno (City)"
I’d also like to try this with Mistral embeddings at some point, but MSFT is really stingy with their GPU VMs at the moment.