I have a problem with the question-to-answer word ratio in my vector database: when the question contains relatively few words, the score comes back high (0.36). My code's threshold is 0.3: if the score is below 0.3, it answers from the vector database; if the score is above 0.3, it falls back to regular GPT-based question answering.
What I want is to fix the issue where questions with very few words cannot find a match.
Example question: "Is it expensive or not?" Answer: "The premium infant formula “Infant Baby Gold Formula” is priced at 668 per can. There is currently a promotion with a 20% discount."
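For reference, a minimal sketch of the routing described above, assuming a LangChain FAISS store whose similarity_search_with_score() returns (document, distance) pairs where a lower score means a closer match; the stored chunk, model choice, and helper names are illustrative only:

```python
from langchain_community.vectorstores import FAISS
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Hypothetical stored Q+A chunk, like the formula example above.
docs = [
    'Q: How much is the Infant Baby Gold Formula? '
    'A: The premium infant formula "Infant Baby Gold Formula" is priced at '
    '668 per can. There is currently a promotion with a 20% discount.',
]

embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_texts(docs, embeddings)
llm = ChatOpenAI()

SCORE_THRESHOLD = 0.3  # FAISS returns a distance here: lower = closer match

def answer(question: str) -> str:
    doc, score = vectorstore.similarity_search_with_score(question, k=1)[0]
    if score < SCORE_THRESHOLD:
        return doc.page_content             # trust the vector-DB answer
    return llm.invoke(question).content     # fall back to plain GPT answering

print(answer("Is it expensive or not?"))    # short question -> score likely > 0.3
```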
You need conversational context for that input. For example:
user: I want to reduce wrinkle lines.
user: So if the creams are snake oil, is Botox safe?
user: Is it expensive or not?
That gives the embedding more user input to work with. Including the prior AI answers could skew similarity matching toward stale topics, though. You could also pass differing amounts of user context in parallel embeddings calls, and thus run two searches of the data that augment the AI's knowledge in different ways.
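A rough sketch of that two-search idea, reusing the `vectorstore` from the earlier sketch and assuming `history` is a list of the user's prior turns (newest last); the window sizes are arbitrary:

```python
def dual_context_search(history: list[str], k: int = 3):
    """Search once with only the latest turn and once with more user context."""
    last_turn = history[-1]                # e.g. "Is it expensive or not?"
    recent_turns = " ".join(history[-3:])  # latest turn plus two earlier user turns

    narrow = vectorstore.similarity_search_with_score(last_turn, k=k)
    broad = vectorstore.similarity_search_with_score(recent_turns, k=k)

    # Merge the two result sets, keeping the best (lowest) score per chunk.
    merged: dict[str, float] = {}
    for doc, score in narrow + broad:
        if doc.page_content not in merged or score < merged[doc.page_content]:
            merged[doc.page_content] = score
    return sorted(merged.items(), key=lambda kv: kv[1])
```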
Still, retrieval by semantic similarity will not be optimal: you are embedding questions, while the database holds informational text. You could also have the AI generate brief example questions (the kind a user might ask) for each knowledge chunk and send those to embeddings as well.
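A sketch of that question-generation idea at indexing time, so a short user question is matched question-to-question rather than question-to-informational-text; the prompt wording and metadata layout are only assumptions:

```python
from langchain_community.vectorstores import FAISS
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

llm = ChatOpenAI()
embeddings = OpenAIEmbeddings()

def questions_for_chunk(chunk: str, n: int = 3) -> list[str]:
    prompt = (f"Write {n} brief questions a customer might ask that are "
              f"answered by this text, one per line:\n\n{chunk}")
    reply = llm.invoke(prompt).content
    return [q.strip() for q in reply.splitlines() if q.strip()]

def build_question_index(chunks: list[str]) -> FAISS:
    texts, metadatas = [], []
    for chunk in chunks:
        for q in questions_for_chunk(chunk):
            texts.append(q)                       # embed the example question
            metadatas.append({"answer": chunk})   # carry the chunk as its payload
    return FAISS.from_texts(texts, embeddings, metadatas=metadatas)
```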
Thank you for your response; that is essentially how we already do it.
When the user asks "Is it expensive or not?", even though it exactly matches a question stored in the vector database, the score from similarity_search_with_score() is too high. This is because the question makes up too small a proportion of the stored question-plus-answer text.
If the input is context-poor, there is not much that can be done with guaranteed results when the question on its own is not understandable or answerable against your database knowledge.
You could have the language AI take some of the conversation and transform the question into one that is standalone, long, and well thought-out. You could also ask it for more example questions that would lead to the same answer.
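A sketch of that rewriting step, reusing the `llm` from the sketches above; the prompt wording is just an illustration:

```python
def make_standalone(history: list[str], question: str) -> str:
    """Rewrite the latest turn into a self-contained question using prior turns."""
    prompt = (
        "Rewrite the final user question so that it stands alone and names the "
        "product or topic explicitly.\n\n"
        "Conversation:\n" + "\n".join(history) +
        f"\n\nFinal question: {question}\n\nStandalone question:"
    )
    return llm.invoke(prompt).content.strip()

# make_standalone(
#     ["I want to reduce wrinkle lines.",
#      "So if the creams are snake oil, is Botox safe?"],
#     "Is it expensive or not?",
# )
# -> something like "How much does Botox treatment for wrinkle lines cost?"
```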
One technique (which would make the user wait) is to have the AI answer the question with its limited knowledge, and then retrieve the database answer by including that AI answer in the embeddings call used for the final augmentation (you could show internal users the preliminary generation as feedback).
An AI answer will look more like database text than a brief series of questions does, unless it simply says “As an AI model, I don’t know…”.
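A minimal sketch of that answer-first retrieval (close in spirit to HyDE), again assuming the `llm` and `vectorstore` from the earlier sketches:

```python
def answer_first_retrieval(question: str, k: int = 3):
    """Generate a preliminary answer, then use it (plus the question) as the query."""
    preliminary = llm.invoke(
        f"Answer briefly from your general knowledge: {question}"
    ).content
    query = f"{question}\n{preliminary}"   # answer text resembles the stored chunks
    return vectorstore.similarity_search_with_score(query, k=k)
```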
These ideas would have to be applied on every turn, since I haven't described any way to detect inadequate or easily-fulfilled questions in advance.
The crux of the issue doesn't lie in the context. I just want to know how to lower the matching score for short questions. Your response has opened up one line of thought; please take a look and see whether there are any other, simpler approaches available.