Objective
Build a legal document assistant that retrieves and returns EXACT copies of EU legal document templates without any modification. When a user requests a specific document by ID, the system should return the precise template with its original formatting preserved.
Current Setup
Files in Vector Store
- breach_report_829.md
- breach_report_probation_947.md
- certificate_custodial_sentence_909.md
- certificate_probation_measures_947.md
- certificate_supervision_829.md
- notification_sentenced_person_909.md
Example of a document format:
# [EU-DOC-829] Certificate: Supervision Measures
Type: Certificate
Framework: 2009/829/JHA
Purpose: Recognition of supervision measures
*Referred to in Article 10 of Council Framework Decision 2009/829/JHA on the application, between Member States of the European Union, of the principle of mutual recognition to decisions on supervision measures as an alternative to provisional detention.*
## (a) Issuing State
- **Issuing State:**
- **Executing State:**
## (b) Authority which issued the decision on supervision measures
- **Official Name:**
- **Additional Information Source:**
- [ ] The authority specified above
- [ ] Central authority:
- [ ] Other competent authority:
### Contact Details
- **Address:**
- **Telephone:**
- **Fax:**
- **Person to Contact:**
- **Surname:**
- **Forename(s):**
- **Position:**
- **E-mail:**
- **Languages for Communication:**
## (c) Authority for additional information
- [ ] The authority referred to in (b)
- [ ] Other authority:
### Contact Details
- **Address:**
- **Telephone:**
- **Fax:**
- **Person to Contact:**
- **Surname:**
- **Forename(s):**
- **Position:**
- **E-mail:**
- **Languages for Communication:**
## (d) Information regarding the person
- **Surname:**
- **Forename(s):**
- **Maiden Name (if applicable):**
- **Aliases (if applicable):**
- **Sex:**
- **Nationality:**
- **Identity Number or Social Security Number (if any):**
- **Date of Birth:**
- **Place of Birth:**
- **Address(es):**
- **In the Issuing State:**
- **In the Executing State:**
- **Elsewhere:**
- **Languages Understood:**
- **Identity Document(s):**
- **Residence Permit in Executing State:**
## (e) Member State where supervision measures are forwarded
- **Reason for forwarding:**
- [ ] Lawful residence in the executing State
- [ ] Requested transfer due to:
## (f) Decision on supervision measures
- **Decision Issued On:**
- **Decision Became Enforceable On:**
- **Legal Remedy Pending:** [ ] Yes [ ] No
- **File Reference:**
- **Period of Provisional Detention:**
### Offences Covered
1. **Total Alleged Offences:**
2. **Summary of Facts & Circumstances:**
3. **Legal Classification & Statutory Provisions:**
### Offence Categories (Tick if relevant)
- [ ] Participation in a criminal organisation
- [ ] Terrorism
- [ ] Fraud
- [ ] Other:
## (g) Supervision Measures
- **Duration:**
- **Possible Renewal:** [ ] Yes [ ] No
- **Nature of Measures:**
- [ ] Residence Reporting
- [ ] Restricted Areas
- [ ] Curfew Obligations
- [ ] Travel Limitations
- [ ] Other:
## (h) Other Relevant Information
- **Reasons for Measures:**
## (i) Signature
- **Name:** _______________ *(This field must remain blank for manual signing.)*
- **Position:**
- **Date:**
- **File Reference (if any):**
- **Official Stamp (if applicable):**
Current configuration:
{
  "model": "gpt-4o",
  "temperature": 0.31,
  "top_p": 0.85,
  "tools": [{
    "type": "file_search",
    "file_search": {
      "max_num_results": 19,
      "ranking_options": {
        "ranker": "default_2024_08_21",
        "score_threshold": 0
      }
    }
  }]
}
The Problem
When users request specific documents, the assistant:
- Sometimes returns a generated/modified version instead of the exact document
- Changes the formatting of sections
- Ignores the original document structure
- Creates new fields that don’t exist in the original template
Example of incorrect behavior:
User: “I want document [EU-DOC-829]”
Assistant: returns a modified version with different sections and formatting instead of the exact template from the vector store.
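To make the expected behavior concrete, here is a minimal sketch of the deterministic lookup I would like the assistant to approximate. The helper names (`extractDocId`, `buildIndex`) are hypothetical, and the ID pattern is only inferred from the one example heading above, so it may not cover every template.

```typescript
// Hypothetical pattern: matches IDs such as "EU-DOC-829", with or without
// the surrounding square brackets. Inferred from a single example heading.
const DOC_ID_PATTERN = /\[?(EU-DOC-\d+)\]?/i;

// Pull a document ID out of free text (a user message or a heading line).
function extractDocId(text: string): string | null {
  const match = DOC_ID_PATTERN.exec(text);
  return match ? match[1].toUpperCase() : null;
}

// Build an ID -> raw text index from template files whose first line is a
// heading like "# [EU-DOC-829] Certificate: Supervision Measures".
function buildIndex(templates: string[]): Map<string, string> {
  const index = new Map<string, string>();
  for (const raw of templates) {
    const firstLine = raw.split("\n", 1)[0];
    const id = extractDocId(firstLine);
    if (id !== null) {
      index.set(id, raw); // store the untouched string, no normalization
    }
  }
  return index;
}
```

With an index like this, a request for `[EU-DOC-829]` resolves to exactly one stored string, which is the behavior I am trying to get from the assistant.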
What I’ve Tried
- Adjusted temperature (currently 0.31)
- Added strict formatting rules in the prompt
- Included document validation steps
- Set up the file_search tool
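For illustration, a validation step could be as simple as a strict comparison between the stored template and what the assistant returned. This is a sketch under that assumption, not necessarily the check currently in place; `firstMismatch` and the `Mismatch` type are hypothetical names.

```typescript
// Describes the first line at which the returned text departs from the template.
interface Mismatch {
  line: number; // 1-based line number
  expected: string | undefined; // undefined if the template has no such line
  actual: string | undefined; // undefined if the response has no such line
}

// Returns null when the two strings are identical, otherwise the first
// differing line. No whitespace or case normalization is applied.
function firstMismatch(expected: string, actual: string): Mismatch | null {
  if (expected === actual) {
    return null;
  }
  const exp = expected.split("\n");
  const act = actual.split("\n");
  const n = Math.max(exp.length, act.length);
  for (let i = 0; i < n; i++) {
    if (exp[i] !== act[i]) {
      return { line: i + 1, expected: exp[i], actual: act[i] };
    }
  }
  return null; // not reached for differing strings; keeps the return type total
}
```

A non-null result would mean the response should be rejected or replaced with the stored text.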
Questions
- How can I force the assistant to retrieve and return EXACT documents without modification?
- What’s the optimal configuration for the vector store and file_search tool for exact matches?
- Should I restructure how documents are stored in the vector store?
- Are there better ways to handle document templates that I should consider?
- Would implementing Function Calling help with this use case?
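To clarify what I mean by the Function Calling option: a rough sketch of a function tool the model could call with a document ID, while my own code returns the stored text. The tool name `get_template`, the `doc_id` parameter, and the in-memory `templateStore` are all hypothetical; the stored template is truncated to its first lines for brevity.

```typescript
// Tool definition in the function-tool shape (type + function + JSON Schema).
const getTemplateTool = {
  type: "function",
  function: {
    name: "get_template", // hypothetical name
    description: "Return the stored template text for a document ID, unmodified.",
    parameters: {
      type: "object",
      properties: {
        doc_id: { type: "string", description: "Document ID, e.g. EU-DOC-829" },
      },
      required: ["doc_id"],
    },
  },
} as const;

// Hypothetical store; in practice this would be loaded from the .md files.
const templateStore = new Map<string, string>([
  [
    "EU-DOC-829",
    "# [EU-DOC-829] Certificate: Supervision Measures\nType: Certificate\nFramework: 2009/829/JHA\n",
  ],
]);

// Handler for the tool call. Arguments arrive as a JSON string.
function handleGetTemplate(rawArguments: string): string {
  const args = JSON.parse(rawArguments) as { doc_id?: unknown };
  if (typeof args.doc_id !== "string") {
    return JSON.stringify({ error: "doc_id must be a string" });
  }
  const text = templateStore.get(args.doc_id);
  if (text === undefined) {
    return JSON.stringify({ error: `unknown document: ${args.doc_id}` });
  }
  return text; // returned verbatim, never regenerated
}
```

The idea would be that the frontend displays the handler's return value directly, so the template never has to be re-typed by the model.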
Any help or guidance would be greatly appreciated. I need the assistant to behave more like a document retrieval system and less like a creative AI.
Additional Context
- Using Next.js for the frontend
- Documents are legal templates that must maintain exact formatting
- Using the OpenAI Assistants API with a vector store
- Current implementation uses the file_search tool
- Need to support multiple languages for explanations but keep documents in original format