Hello everyone,
I’m encountering an issue with my assistant when querying specific legal articles from a large document stored in a vector store. The document is quite extensive: 2,600 articles and around 500 pages (a .txt file of about 2 MB), split into chunks of 800 tokens with a 400-token overlap.
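For concreteness, here is a simplified sketch of the upload step using the OpenAI Python SDK. The filename and store name are placeholders, and in older SDK versions the vector store endpoints live under `client.beta.vector_stores` instead:

```python
from openai import OpenAI

client = OpenAI()

# Create the vector store (client.beta.vector_stores in older SDK versions)
vector_store = client.vector_stores.create(name="legal-articles")

# Upload the source document
with open("legal_document.txt", "rb") as f:  # placeholder filename
    uploaded = client.files.create(file=f, purpose="assistants")

# Attach it with an explicit static chunking strategy (800 / 400 as above;
# note the overlap may not exceed half of max_chunk_size_tokens)
client.vector_stores.files.create(
    vector_store_id=vector_store.id,
    file_id=uploaded.id,
    chunking_strategy={
        "type": "static",
        "static": {
            "max_chunk_size_tokens": 800,
            "chunk_overlap_tokens": 400,
        },
    },
)
```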
The problem arises when I ask the assistant about a specific article (usually 400-700 characters long): instead of answering from the exact article, it either returns a completely incorrect response or pulls in details from a different article.
Here’s a summary of my setup:
- I’ve tried using different models (GPT-3.5, GPT-4, etc.).
- I’ve experimented with various temperature and Top P settings (currently temperature 0.45 / Top P 0.78).
- The document is stored in a vector store, and the assistant’s instructions are written to force precise answers based on the vector store content (a simplified sketch of this configuration follows this list).
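Roughly, the assistant and run are set up like this (simplified; the instructions text and the example question are placeholders, and `vector_store` comes from the upload sketch above):

```python
# Assistant restricted to the vector store via the file_search tool
assistant = client.beta.assistants.create(
    model="gpt-4",
    instructions=(
        "Answer only from the attached legal document. "
        "Quote the requested article verbatim and cite its number."
    ),
    tools=[{"type": "file_search"}],
    tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}},
)

thread = client.beta.threads.create(
    messages=[{"role": "user", "content": "What does Article 1234 say?"}]
)

run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id,
    assistant_id=assistant.id,
    temperature=0.45,  # current settings; lower values are more deterministic
    top_p=0.78,
)
```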
Despite these attempts, I’m still getting incorrect results.
I’m wondering if there’s something I’m missing in my configuration or if there’s an issue with how the model handles this type of data. Could anyone help me optimize this setup or suggest further steps to improve the accuracy of responses?
Any guidance or a list of best practices for this type of implementation would be greatly appreciated!
Thanks in advance!