ChatGPT-like Search functionality - Reliability of questions and answers

I created a ChatGPT-like search feature: a question-answering system that understands the meaning and context of the questions posed and provides relevant answers. I used text-embedding-ada to create the embeddings.
But if I use the same search query, it returns different results with and without the question mark.

If I ask the same question with different wording, I get different answers, even though the context is the same.

For example,
what is a table?
What is a table

Both are the same question, but each generates a different embedding, selects a different chunk, and returns a different answer. I need consistent answers each time.

How reliable is this? How can we solve this?

The simplest way would be to pre-process the text before creating the embedding, for example by removing stop words, numbers, punctuation marks and the like.

While removing stop words might decrease the performance of GPT, the other two should not affect the output that much.
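
A minimal sketch of that kind of pre-processing, assuming Python. The function name normalize_query is made up for illustration, and lowercasing is my own addition (it is what makes the two example questions match). It only strips ASCII punctuation and leaves stop words and numbers alone, so treat it as a starting point rather than a recommended pipeline. The same function would need to be applied to every query before it is embedded.

```python
import re
import string


def normalize_query(text: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse runs of whitespace.

    Hypothetical helper: stop words and numbers are deliberately kept.
    """
    text = text.lower()
    # Delete every character listed in string.punctuation (ASCII only).
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Collapse repeated whitespace and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


# Both variants from the question now map to the same string,
# so they would be sent to the embedding model as identical input.
print(normalize_query("what is a table?"))
print(normalize_query("What is a table"))
```

Since identical input text is what the embedding model sees in both cases, the two questions should retrieve the same chunk.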

Further to @udm17’s reply, I’d also suggest taking the user’s question and passing it to the model with a prompt like “please take this query ###{user_input}### and rewrite it such that it will produce the best result when used as a vector embedding search term”. That way, you separate the user’s raw query from the ideal search term. You will still get variance, but hopefully less.
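
Roughly, that rewrite step could look like the sketch below (Python). The names REWRITE_TEMPLATE, rewrite_query and complete are placeholders I made up: complete stands for whatever function sends a prompt to your chat model and returns its text reply, so no particular client library is assumed here.

```python
from typing import Callable

# Prompt wording taken from the suggestion above.
REWRITE_TEMPLATE = (
    "Please take this query ###{user_input}### and rewrite it such that it "
    "will produce the best result when used as a vector embedding search term."
)


def rewrite_query(user_input: str, complete: Callable[[str], str]) -> str:
    """Ask the model for a search-friendly version of the user's query.

    `complete` is a hypothetical callable: prompt in, model reply out.
    Falls back to the original query if the model returns nothing.
    """
    prompt = REWRITE_TEMPLATE.format(user_input=user_input)
    rewritten = complete(prompt).strip()
    return rewritten or user_input


if __name__ == "__main__":
    # Stand-in for a real model call, only to show the data flow.
    def fake_complete(prompt: str) -> str:
        return "definition of a table"

    print(rewrite_query("what is a table?", fake_complete))
```

The rewritten string, not the raw user input, is then what you embed and search with. Setting the model's temperature to 0 for this call would probably reduce the variance further, though it will not guarantee identical rewrites.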


“What is a table” - ok, thanks for the information. Could you explain it a little further? Are you using “What” as an abbreviation for something?

“what is a table?” - a table is…

That’s an example of why you should never have two things that are 99% the same. The issue can even occur at 20% similarity if you don’t set up the right call functions. You have to set rules and variables with which you can “fine-tune” the chatbot’s search functions, and there can be a lot to building search functionality. I recommend keeping a log that records “Q&A” pairs, which you can reuse in rapid succession to return the same result. I’ve done this before for another company. It starts out simple but can become complicated, depending on the sophistication of the design. With enough questions, the collected information can start to build itself, because you can train it to ask ChatGPT questions and search in other places.
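
One way to read the “Q&A” log idea, as a rough Python sketch: store each answer under a normalized form of the question and return the stored answer whenever the same question comes back. The class name QALog and the answer_fn parameter are hypothetical, and the normalization (lowercase, no ASCII punctuation) is only an assumption about what should count as “the same question”.

```python
import re
import string
from typing import Callable, Dict


class QALog:
    """Hypothetical in-memory log mapping normalized questions to answers."""

    def __init__(self, answer_fn: Callable[[str], str]) -> None:
        # answer_fn stands for the full pipeline: embed, retrieve, ask the model.
        self._answer_fn = answer_fn
        self._log: Dict[str, str] = {}

    @staticmethod
    def _key(question: str) -> str:
        # Lowercase, strip ASCII punctuation, collapse whitespace.
        text = question.lower()
        text = text.translate(str.maketrans("", "", string.punctuation))
        return re.sub(r"\s+", " ", text).strip()

    def ask(self, question: str) -> str:
        key = self._key(question)
        if key not in self._log:
            # First time this question is seen: compute and record the answer.
            self._log[key] = self._answer_fn(question)
        return self._log[key]


if __name__ == "__main__":
    log = QALog(lambda q: f"(answer generated for: {q})")
    first = log.ask("what is a table?")
    second = log.ask("What is a table")
    print(first == second)
```

This only catches questions that are identical after normalization. Differently worded versions of the same question would still miss the log unless you also match on embedding similarity, which is not shown here.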