Questions about a Q&A bot and embeddings

Somebody asked me why I'm embedding so many questions in one file rather than embedding a single question by itself – and my brain blew up a little.

The back story is that I had LangChain code working, embedding about 20 questions in chunks, and the bot seemed to answer in natural language given the Q&A file provided as vectors. With that said, what would be the benefit of embedding one question at a time, and not the answer? And when, at what point, would I need to limit the number of questions in one file?

Now, Pinecone can store metadata, and I can see a case for embedding different files with different categories… but one question at a time? Eventually the bot will need all of the information…

Usually you would embed individual questions, individual answers, or both, and the search will aggregate these to form the top-K hits related to the query.
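Roughly, the pattern looks like the sketch below: one vector per question, with the answer carried alongside as metadata (the analogue of one Pinecone upsert per Q&A pair), and cosine similarity picking the top-K matches. The embedding function here is a toy bag-of-words stand-in for a real embedding model, and all the Q&A pairs are made up for illustration.

```python
import math

# Toy stand-in for a real embedding model (e.g. an OpenAI or
# sentence-transformers encoder): a bag-of-words count vector
# over a fixed vocabulary. Illustrative only.
def embed(text, vocab):
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical Q&A pairs standing in for the 20 questions in the file.
qa_pairs = [
    ("How do I reset my password?", "Use the 'Forgot password' link."),
    ("What payment methods are accepted?", "We accept cards and PayPal."),
    ("How do I delete my account?", "Email support to close your account."),
]

vocab = sorted({w for q, _ in qa_pairs for w in q.lower().split()})

# One vector per *question*, each carrying its answer as metadata --
# the analogue of one upsert per Q&A pair in a vector store.
index = [(embed(q, vocab), {"question": q, "answer": a}) for q, a in qa_pairs]

def top_k(query, k=2):
    qv = embed(query, vocab)
    scored = sorted(index, key=lambda item: cosine(qv, item[0]), reverse=True)
    return [meta for _, meta in scored[:k]]

hits = top_k("how do I reset my password?", k=1)
# hits[0]["answer"] -> "Use the 'Forgot password' link."
```

The retrieved metadata (question plus answer) is then what you'd stuff into the LLM prompt as context.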

If you embed all 20 questions and get only one embedding vector, your search precision suffers: you are listing all 20 things under a single vector, so your overall retrieval may suffer.

I say *may* because if the 20 things are all related, then it might be more efficient to group these similar things together and let the LLM sort out the rest.

But if the 20 things aren’t related to a specific query, then you are just adding noise to your search.
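A tiny numeric sketch of that dilution effect, with made-up three-dimensional vectors: a query matches its own question's vector perfectly, but a vector that pools many mixed-topic questions scores noticeably lower for the same query.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

query      = [1.0, 0.0, 0.0]  # query entirely about topic A
individual = [1.0, 0.0, 0.0]  # vector for the one matching question
pooled     = [1.0, 1.0, 1.0]  # many mixed questions squashed into one vector

exact_score   = cosine(query, individual)  # 1.0: exact topical match
diluted_score = cosine(query, pooled)      # ~0.577: match diluted by unrelated content
```

If the pooled questions all shared the query's topic, the pooled vector would stay close to the query instead, which is the "if the 20 things are all related" case above.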
