I’m currently working on a Retrieval-Augmented Generation (RAG) system for question answering, and I have a question regarding embeddings. Specifically, do the embeddings generated by OpenAI remain consistent every time for the same input?
I’m using the text-embedding-3-small model, and I’m unsure whether the same question might produce different embeddings each time. I’ve noticed that the chunks retrieved for a question change between runs, which affects the response.
That’s why you should use a tolerance range when deciding what counts as similar enough. The range gives you the flexibility to include everything that is similar enough, while also handling the case where identical text comes back with slightly different numbers.
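
To make the tolerance idea concrete, here is a minimal sketch in plain Python. The vectors are made-up 3-dimensional stand-ins for real embeddings, and the function names, the 0.80 threshold, and the 1e-3 tolerance are all assumptions chosen for illustration, so you would need to tune them against your own data.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def is_similar_enough(a, b, threshold=0.80):
    """Treat a chunk as relevant if similarity clears the threshold.
    The 0.80 default is a hypothetical value, not a recommendation."""
    return cosine_similarity(a, b) >= threshold

def is_effectively_identical(a, b, tolerance=1e-3):
    """Treat two embeddings as the same text if their cosine distance
    is within a small tolerance (hypothetical value)."""
    return 1.0 - cosine_similarity(a, b) <= tolerance

# Made-up vectors: two runs of the "same" question that differ slightly
# in the last decimal places, plus an unrelated vector.
run_1 = [0.1234, -0.5678, 0.9012]
run_2 = [0.1235, -0.5677, 0.9011]
unrelated = [-0.9, 0.1, -0.2]

print(run_1 == run_2)                          # exact comparison fails
print(is_effectively_identical(run_1, run_2))  # tolerance comparison passes
print(is_similar_enough(run_1, unrelated))     # unrelated vector is rejected
```

The point is that an exact equality check on the numbers is too strict, whereas comparing a similarity score against a range absorbs small run-to-run differences. If the retrieved chunks still change between runs, it may be that several chunks score very close to each other near your cutoff, in which case a slightly wider range or a larger top-k could help.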