RAG Prompt Engineering for better results

Hi everyone,
I have been working on RAG for the last 3-4 months, but I am still unsure about some basic questions. How many documents should be retrieved for generation by the LLM? How should the prompt be written: short and straightforward, or detailed and lengthy? Also, suppose I am building a PDF-matching RAG that returns PDFs similar to the input one. Here only the PDF changes, not the question, so how should I proceed? Any suggestions?

How many documents to retrieve is mostly a value engineering question, I think, and depends on the quality of your retrieval system.

You obviously always want to make sure that you include the relevant document in your generation context. Retrieving more documents mostly just raises your sensitivity, i.e. the chance that the relevant one makes it into the context, at a higher generation cost.

Including more documents typically only affects the output itself if your document chunks are very similar and confusing, or if they contradict the model's training corpus.
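One way to make that trade-off concrete is to sweep the number of retrieved documents over a small labeled set and look at how often the relevant document lands in the context versus how many tokens you pay for. This is only a rough sketch: the evaluation data, the document ids, and the `avg_tokens_per_doc` figure are made up for illustration, and `recall_at_k` and `sweep_k` are hypothetical helper names, not part of any library.

```python
def recall_at_k(ranked_ids, relevant_id, k):
    """Return 1.0 if the known-relevant document is in the top k results, else 0.0."""
    return 1.0 if relevant_id in ranked_ids[:k] else 0.0


def sweep_k(eval_set, ks, avg_tokens_per_doc):
    """For each k, return (k, mean recall, approximate context tokens per query)."""
    rows = []
    for k in ks:
        recall = sum(recall_at_k(ranked, rel, k) for ranked, rel in eval_set) / len(eval_set)
        rows.append((k, recall, k * avg_tokens_per_doc))
    return rows


# Hypothetical evaluation data: (ranking returned by the retriever,
# id of the document a human marked as the relevant one).
eval_set = [
    (["d3", "d7", "d1", "d9", "d2"], "d3"),
    (["d5", "d2", "d8", "d4", "d6"], "d8"),
    (["d1", "d6", "d4", "d2", "d9"], "d9"),
    (["d7", "d3", "d5", "d1", "d8"], "d3"),
]

# avg_tokens_per_doc=400 is an assumed chunk size, not a measured value.
results = sweep_k(eval_set, ks=[1, 3, 5], avg_tokens_per_doc=400)
for k, recall, tokens in results:
    print(f"k={k}  recall={recall:.2f}  approx. context tokens={tokens}")
```

If recall stops improving past some k on your own data, the extra documents are probably only adding cost.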

As for prompt length, I think that’s a matter of taste. It only really affects your output, not the retrieval process.

I’m not sure you need RAG for the PDF matching at all. It depends on what qualities you consider similar, I suppose. Can you give a more concrete example of what you want to do?

Suppose a company has a system where many employees submit their reports. To reduce redundancy, you want to remove similar reports. If the number of reports is huge, I want AI to do this task. For this, I retrieve the 10 most relevant reports and send them to the LLM to find the top 3 most similar reports, which are then considered for removal.

Yeah, unfortunately “relevant” and “similar” are very context dependent. Embeddings will give you a “similarity” score (cosine similarity). You can try that, but it’s important to keep in mind that what the model considers similar is not necessarily what you consider similar.

you know about this stuff, right? https://platform.openai.com/docs/guides/embeddings

It’s quite possible that this is enough for what you’re trying to do.
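To illustrate the embeddings-only approach, here is a minimal sketch that ranks reports by cosine similarity to one input report and keeps the top few, with no LLM call involved. The vectors below are tiny toy values; in practice they would come from an embedding model such as the ones described in the guide linked above. The report ids and the `top_k_similar` helper are hypothetical names chosen for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k_similar(query_id, embeddings, k=3):
    """Return the k (report_id, score) pairs most similar to the query report."""
    query_vec = embeddings[query_id]
    scored = [
        (other_id, cosine_similarity(query_vec, vec))
        for other_id, vec in embeddings.items()
        if other_id != query_id  # do not compare the report with itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional vectors standing in for real embeddings (assumption:
# real ones would have hundreds or thousands of dimensions).
report_embeddings = {
    "report_a": [1.0, 0.0, 0.0],
    "report_b": [0.9, 0.1, 0.0],
    "report_c": [0.1, 1.0, 0.0],
    "report_d": [0.0, 0.9, 0.2],
}

candidates = top_k_similar("report_a", report_embeddings, k=2)
for report_id, score in candidates:
    print(f"{report_id}: {score:.3f}")
```

Whether a high score really means "redundant" still depends on your own definition of similar, so it is worth spot-checking a few of the top matches by hand before deleting anything.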