Text Embedding result based on Priority

utsav-d · March 12, 2024, 10:31am

I am new to OpenAI. I have integrated the text-embedding-ada-002 model into AWS OpenSearch for semantic search.

I want the embedding search results ordered by my own field priority. For example, my documents look like this:
{
title: "random",
desc: "random",
}
If I search for the word "abc" and my priority order is title > description > etc., then the results should come back in that order.

So if two documents are found, one because "abc" matches in the description and the other because "abc" matches in the title, then the document that matched on the title should come first in the results, irrespective of the order of the documents in the database.

This is just an example; there would be lots of fields.

Any solution?

Another problem: even when an exact match is found, the score is only about 0.7, which I think is low.

Thanks in advance

Diet · March 12, 2024, 10:42am

Welcome to the community!

Welcome to the world of embeddings!

Embeddings don’t exactly match words. They’re supposed to encode the concept or meaning of whatever the document contains. And the cosine similarity is supposed to indicate the similarity between concepts, and not whether certain words appear.
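To make the cosine similarity point concrete, here is a minimal sketch in plain Python. The three-dimensional vectors are made up for illustration; real ada-002 embeddings have 1536 dimensions and would come from the embeddings API.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy vectors, not real embeddings.
query = [0.9, 0.1, 0.3]
doc_same_direction = [1.8, 0.2, 0.6]  # same direction as query, twice the length
doc_orthogonal = [-0.1, 0.9, 0.0]     # perpendicular to query

sim_same = cosine_similarity(query, doc_same_direction)
sim_orth = cosine_similarity(query, doc_orthogonal)
print(sim_same, sim_orth)
```

The score depends only on the direction of the vectors, not on their length and not on whether any particular word appears in the text.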

While there are some limitations to this, this means that you can sorta search across languages, even if the languages have nothing in common. To a degree.

So, if you want to “only match the title” or “only the description” - you’ll have to separately embed the title/description and keep an index for that. You can then use a rank fusion algorithm, maybe Reciprocal Rank Fusion with a bias, to lift certain components out over others.
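A rough sketch of what a biased Reciprocal Rank Fusion could look like, assuming you have already run one vector search per field and have a ranked list of document IDs for each. The function name `weighted_rrf`, the field weights, and the document IDs are all hypothetical; `k=60` is just the commonly used RRF constant.

```python
from collections import defaultdict

def weighted_rrf(rankings, weights, k=60):
    """Fuse per-field rankings into a single ranking.

    rankings: dict mapping field name -> list of doc IDs, best match first.
    weights:  dict mapping field name -> bias (higher = more important).
    """
    scores = defaultdict(float)
    for field, ranked_ids in rankings.items():
        w = weights.get(field, 1.0)
        for rank, doc_id in enumerate(ranked_ids, start=1):
            # Standard RRF term 1 / (k + rank), scaled by the field weight.
            scores[doc_id] += w / (k + rank)
    return sorted(scores, key=lambda d: scores[d], reverse=True)

# Hypothetical results of two separate searches, one per field index.
rankings = {
    "title": ["doc_b", "doc_c"],
    "desc": ["doc_a", "doc_b"],
}
weights = {"title": 3.0, "desc": 1.0}  # assumed bias: title > description

fused = weighted_rrf(rankings, weights)
print(fused)
```

Note that this only biases the ordering toward the higher-weighted field. It does not guarantee that every title match outranks every description match; if you need a strict priority, you would have to sort by field first and by score second.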

Regarding the score of about 0.7 on an exact match: this is indeed pretty low. With ada-002, anything around 0.7 is pretty much useless. I would investigate what you're actually embedding and comparing; it's possible that you have a logic issue here.

You could also consider using the new text-embedding-3 models (text-embedding-3-small and text-embedding-3-large); their scores are a little easier to interpret.

utsav-d · March 12, 2024, 10:57am

Yes, I do mean similarity, but the results should follow the priority I want: if "abc" is found to be similar in the top-priority field (title in my example), then that document should be at the top of my results.

Prioritizing the fields is my main concern here.

Diet · March 12, 2024, 11:58am

Well, yea, what I’m telling you is if you want to discriminate by field, you’ll need to at least embed the fields separately.

How were you figuring it could work?
