Results returned for irrelevant queries that are not in the data

shriradhakrishnan888 · July 7, 2023, 1:43pm

I used OpenAI embeddings to create the embeddings for my data. When I run a vector search with a question that is irrelevant to the data, it still returns results, even though nothing in the data relates to the question. How can I fix this without using a threshold?

anon22939549 · July 7, 2023, 3:46pm

You do not.

Embedding search uses the cosine similarity between two vectors as a proxy for relevance.

Two entirely unrelated vectors will still have a cosine similarity score. So, when nothing in the data matches the query, the embeddings returned are just the best of only bad options. The only way to ensure the model doesn’t pull in irrelevant data is to establish a threshold for relevance.
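
To make that concrete, here is a minimal sketch of a similarity search with a relevance cutoff. Everything in it is illustrative rather than taken from the original poster's setup: the three-dimensional vectors are toy stand-ins for real embeddings, the `search` helper is hypothetical, and the `THRESHOLD` value of 0.8 is an assumed number that you would need to tune against your own data.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vec, docs, threshold):
    """Return (score, text) pairs scoring at or above threshold, best first."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in docs]
    kept = [pair for pair in scored if pair[0] >= threshold]
    return sorted(kept, reverse=True)

# Toy 3-dimensional vectors standing in for real embeddings (hypothetical data).
docs = [
    ("refund policy", [0.9, 0.1, 0.0]),
    ("shipping times", [0.8, 0.3, 0.1]),
]
related_query = [0.85, 0.2, 0.05]   # points in roughly the same direction as the docs
unrelated_query = [0.0, 0.1, 0.9]   # points somewhere else entirely

THRESHOLD = 0.8  # assumed value, not a recommendation; tune on your own data

# With the cutoff, the related query keeps its matches...
print(search(related_query, docs, THRESHOLD))
# ...and the unrelated query returns an empty list.
print(search(unrelated_query, docs, THRESHOLD))
# With the cutoff effectively disabled, the unrelated query still gets
# "results": every document has some score, so a top match always exists.
print(search(unrelated_query, docs, -1.0))
```

The last call shows the behaviour described in the question: a plain top-k search always has a top result, however weak. An empty list from the thresholded search is the signal to answer "not in the data" instead of passing weak matches along.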
