We have built a system that combines embeddings with context-augmented completion. When a user asks a question, it runs a similarity search over our Pinecone vectors and then calls ChatGPT for a completion with the retrieved context.
This is working great for most use cases.
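To make the setup concrete, here is a rough sketch of the flow. It is not our actual code: `embed` is a toy bag-of-words stand-in for the real embedding model, the in-memory list of chunks stands in for the Pinecone index, and `retrieve` and `build_prompt` are hypothetical helper names. The final prompt would be sent to the chat completion API, which is left out here.

```python
import math
from collections import Counter

def embed(text):
    """Toy stand-in for a real embedding model (assumption): word counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, chunks, top_k=2):
    """Rank stored chunks by similarity to the question (stand-in for the vector search)."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]

def build_prompt(question, context_chunks):
    """Assemble the prompt that would be sent to the completion model."""
    context = "\n---\n".join(context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

chunks = [
    "The warranty covers parts and labor for two years.",
    "Returns are accepted within 30 days of purchase.",
    "Shipping is free on orders over 50 dollars.",
]
question = "How long does the warranty last?"
top = retrieve(question, chunks, top_k=1)
prompt = build_prompt(question, top)
print(prompt)
```

For a question like this one, whose wording overlaps with the relevant chunk, retrieval picks the warranty chunk and the completion has what it needs.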
However, there is a certain class of queries where users think they are talking to a bot, so they ask questions like “summarize section 5” or “how many patents does X have?”. Such queries are matched against the Pinecone vectors literally, by similarity to those words, which usually returns garbage context.
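A small illustration of the failure mode, under the same assumptions as before: a toy word-count `embed` stands in for the real embedding model, and the three chunks are made-up examples, not our data. The point is only to show how an instruction-style query ends up matching text that mentions its words rather than the content the user means.

```python
import math
from collections import Counter

def embed(text):
    # Toy stand-in for an embedding model (assumption): word counts.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Made-up chunks: a table of contents, the real text of section 5, and an unrelated clause.
toc_chunk = "contents section 4 pricing section 5 liability section 6 termination"
section5_body = "neither party is liable for indirect damages and total liability is capped at fees paid"
unrelated = "either party may terminate with thirty days written notice"

query = embed("summarize section 5")
scores = {c: cosine(query, embed(c)) for c in (toc_chunk, section5_body, unrelated)}
best = max(scores, key=scores.get)
print(best)
```

Here the table-of-contents chunk wins because it repeats the words "section" and "5", while the actual text of section 5 shares no words with the query and scores zero. A real embedding model is less literal than this toy, but it is the same kind of mismatch we see in practice.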
I’m curious what approaches people are taking for these types of queries.