I am working on a chat application that uses data from PDF files. I would appreciate some guidance on the following points:
- Certain queries retrieve the correct data only when the query contains specific words that are present in the actual answer. How can I handle such cases?
- I am using different types of content (e.g., plain text from the PDF files, YouTube video transcripts, and tables present in the PDFs). If the same information is present in two content types, for example in the text data as well as in the YouTube content, the retriever prioritizes the YouTube content over the text data. How can I prioritize the text content in search?
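
To make the second point concrete, here is a minimal sketch of the kind of prioritization I have in mind: reweighting retrieved chunks by their content type before taking the top results. Everything in it is an assumption for illustration, not my actual setup: the `Hit` class, the `source` labels (`pdf_text`, `pdf_table`, `youtube`), the weight values, and the premise that the retriever returns a similarity score where higher means more relevant.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    text: str
    source: str   # hypothetical metadata label stored with each chunk
    score: float  # similarity from the retriever (assumed: higher is better)


# Hypothetical weights: PDF text is trusted most, YouTube transcripts least.
SOURCE_WEIGHTS = {"pdf_text": 1.0, "pdf_table": 0.9, "youtube": 0.7}


def rerank_by_source(hits, weights=SOURCE_WEIGHTS, default=0.5):
    """Sort hits by similarity score scaled by a per-source weight."""
    return sorted(
        hits,
        key=lambda h: h.score * weights.get(h.source, default),
        reverse=True,
    )


hits = [
    Hit("transcript chunk", "youtube", 0.82),
    Hit("pdf paragraph", "pdf_text", 0.78),
]
ranked = rerank_by_source(hits)
print([h.source for h in ranked])
```

With these example weights the PDF text chunk ends up ahead of the YouTube chunk even though its raw score is slightly lower. I am not sure whether this kind of post-retrieval reweighting is the right approach, or whether filtering on metadata at query time would be better.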