@lucid.dev Thanks a lot for the detailed explanation — it really helps to clarify the challenge and the possible paths forward.
Given that my goal is to build a production-grade system capable of answering legal queries based strictly on large document collections, I’m now leaning toward Option #2 (pre-processing with vector storage + retrieval + LLM for reasoning).
Do you (or anyone in the forum) have suggestions on the most scalable architecture for this setup? For example:
- Should I use OpenAI’s file search tools or go with an external vector store (like FAISS, Weaviate, Pinecone, etc.)?
- What’s the best way to ensure traceable responses from the LLM, with citations down to the specific document and passage?
- Are there any open-source RAG frameworks that handle this multi-step flow well?
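
To make the question concrete, here is a rough sketch of the flow I have in mind for Option #2 (chunk with source metadata, retrieve, then build a prompt that forces citations). It is only an illustration under loose assumptions: the bag-of-words cosine scoring is a stand-in for a real embedding model and vector store, and the names `Chunk`, `chunk_document`, `retrieve`, and `build_prompt`, along with the sample text, are made up for this example.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_id: str    # hypothetical document identifier
    chunk_no: int  # position of the chunk within its document
    text: str


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_document(doc_id, text, size=40):
    """Split a document into fixed-size word windows, keeping source metadata."""
    words = text.split()
    return [
        Chunk(doc_id, i // size, " ".join(words[i:i + size]))
        for i in range(0, len(words), size)
    ]


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return up to k chunks with the highest nonzero similarity to the query.

    Stand-in for a vector store lookup (assumption: a real system would
    use embeddings rather than raw token counts).
    """
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(c.text))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for score, c in scored[:k] if score > 0]


def build_prompt(query, hits):
    """Build a prompt that restricts the LLM to the retrieved, labeled sources."""
    sources = "\n".join(f"[{c.doc_id}#{c.chunk_no}] {c.text}" for c in hits)
    return (
        "Answer using ONLY the sources below. Cite each claim as [doc#chunk]. "
        "If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}"
    )


# Toy placeholder text for illustration only, not real legal content.
corpus = {
    "lease_act": "A tenant must receive thirty days written notice "
                 "before termination of a residential lease.",
    "contract_code": "A contract requires offer, acceptance, and "
                     "consideration to be enforceable.",
}
chunks = [c for doc_id, text in corpus.items() for c in chunk_document(doc_id, text)]

question = "How much notice is required before terminating a lease?"
hits = retrieve(question, chunks, k=1)
prompt = build_prompt(question, hits)
print(prompt)
```

The idea is that every chunk keeps its document ID and position, so anything the LLM cites can be checked against the stored source text before the answer is shown to a user.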
Any tips or direction would be much appreciated — especially as I’m trying to avoid hallucinations and maintain legal accuracy.
Thanks again!