Ah, by “context” you actually meant a sort of promptability/instructability?
We investigated that a little bit in this thread as well: New OpenAI Announcement! Updated API Models and no more lazy outputs - #9 by Diet (in that case the instruction was rear-loaded instead of front-loaded).
However, if we think it through a little: what does a Bob in London have in common with a person called Bob who happens to live in Reno, versus a Mary who lives in Reno? This could just be the result of unfortunate prompting…
That said, it’s interesting how dramatically the model discriminates by gender, and how it considers DANA and RENO different from the others.
If we wanted to improve the signal as you proposed in your original post, would it make sense to rerun the experiment with something like this?
"Dana (Person) is from Reno (City)"
<=>
"We're focusing on the person's name in the statement: Dana (Person) is from Reno (City)"
<=>
"We're focusing on the city in the statement: Dana (Person) is from Reno (City)"
I’d also like to try this with Mistral embeddings at some point, but MSFT is really stingy with their GPU VMs at the moment.