Understanding the Impact of Special Characters in Queries During Vector Search

Hey everyone! :wave:

I need some help understanding the role of special characters (!, ?, ., ", &) in vector search.

Example 1:

Imagine this: You’ve meticulously curated a vast digital library of books and are now gearing up to transform it into a powerful Question-Answering (QA) system. The goal? Fetch relevant passages based on user queries and then leverage OpenAI to generate accurate answers.

Here are two options I’m contemplating:

  1. “What are the key characters in the book?” or “What are the key characters in the book”
  2. “Who was the villain and how did they get killed?” or “Who was the villain and how did they get killed”

Which variant do you think would be more effective at retrieving relevant passages from the books?
What role do special characters like ?, !, ., and & play in the query?
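
One way I could check this myself is to embed both variants of a query and compare the two vectors. Below is a minimal sketch of that idea, not a definitive test. The helper names are my own, and `toy_embed` is a hypothetical stand-in (simple letter counts plus a slot for "?") so the snippet runs without any API; in practice it would be swapped for whichever real embedding model the vector store uses.

```python
import math
import string
from typing import Callable, List, Sequence


def strip_trailing_punctuation(query: str) -> str:
    """Drop trailing whitespace and ASCII punctuation such as '?', '!' or '.'."""
    return query.rstrip(string.punctuation + string.whitespace)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def compare_variants(query: str, embed: Callable[[str], Sequence[float]]) -> float:
    """Similarity between the query as typed and the same query without trailing punctuation."""
    stripped = strip_trailing_punctuation(query)
    return cosine_similarity(embed(query), embed(stripped))


def toy_embed(text: str) -> List[float]:
    """Hypothetical stand-in for a real embedding model (assumption, NOT a real model).

    Counts each letter a-z and adds one extra slot for question marks.
    """
    lowered = text.lower()
    counts = [float(lowered.count(ch)) for ch in string.ascii_lowercase]
    counts.append(float(lowered.count("?")))
    return counts


if __name__ == "__main__":
    query = "What are the key characters in the book?"
    # A score very close to 1.0 means the punctuation barely moves the vector.
    print(compare_variants(query, toy_embed))
```

With a real model, running the same comparison over a handful of representative queries, and checking whether the top retrieved passages change, would show how much the punctuation matters for a given setup.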

Example 2:
Imagine you have a knowledge base with resumes uploaded, and you want to efficiently locate the documents where candidates possess special skills before sending them through OpenAI for RAG analysis. How should you structure your query?

Here are two options I’m considering:

  1. “What is the name of candidate skilled in Python?”
  2. “What is the name of candidate skills in Python”

Which one do you think would yield more accurate results? Any insights or suggestions would be greatly appreciated! Let’s crack this puzzle together. :jigsaw: #VectorSearch #SkillIdentification #openai #RAGAnalysis