I have been using 0.79 as the cosine similarity threshold with text-embedding-ada-002: anything that scores below that value is not considered similar enough to be included in the context.
However, after switching to text-embedding-3-large, the same threshold no longer seems effective. Initial tests indicate that a lower threshold is needed.
I’m curious to learn about the rule of thumb for the similarity threshold that people have settled on with text-embedding-3-large.
Yes, using rule-of-thumb values to start one’s own exploration is a good way forward for developers who are mostly interested in practical outcomes.
You can take a look at what other users from the community have found to work well for them, and from there fine-tune the threshold for your own use case.
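
If it helps with the fine-tuning step, here is a minimal sketch of one way to calibrate a threshold on your own data rather than relying on a borrowed number. It assumes you have already embedded some query/passage pairs with your model and hand-labeled each pair as relevant or not. The function names (`cosine_similarity`, `pick_threshold`) and the sample scores are made up for illustration; the scores are not measurements from text-embedding-3-large or any other model. Choosing the cutoff by F1 is also just one option, so weight precision or recall differently if your use case calls for it.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (plain Python lists)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as "not similar" instead of dividing by zero
    return dot / (norm_a * norm_b)


def pick_threshold(scored_pairs):
    """Return (threshold, f1) for the cutoff that maximizes F1.

    scored_pairs: list of (similarity, is_relevant) tuples, where is_relevant
    is your own manual judgment for that query/passage pair.
    A pair counts as "kept" when similarity >= threshold.
    """
    total_relevant = sum(1 for _, relevant in scored_pairs if relevant)
    best_threshold, best_f1 = None, -1.0
    # Only the observed scores need to be tried as candidate cutoffs.
    for candidate in sorted({score for score, _ in scored_pairs}):
        kept = [relevant for score, relevant in scored_pairs if score >= candidate]
        true_positives = sum(1 for relevant in kept if relevant)
        if true_positives == 0:
            continue
        precision = true_positives / len(kept)
        recall = true_positives / total_relevant
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:
            best_threshold, best_f1 = candidate, f1
    return best_threshold, best_f1


# Illustrative, hand-made scores and labels (NOT real model output).
example_pairs = [
    (0.62, True),
    (0.55, True),
    (0.50, False),
    (0.48, True),
    (0.41, False),
    (0.35, False),
]

threshold, f1 = pick_threshold(example_pairs)
print(f"suggested threshold: {threshold}, F1 at that threshold: {f1:.3f}")
```

With a few dozen labeled pairs from your own documents, rerunning this after switching embedding models gives you a threshold grounded in your data rather than one carried over from the previous model.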