I’m working on a project where I need to match two sets of structured data using the GPT-4o model. Since the data is large, I process it in chunks and send the chunks sequentially to avoid token rate limits.
The agent takes two input files.
The issue I’m facing:
- The first few chunks (usually only the first one or two) give accurate results, correctly classifying transactions into three categories (exact match, possible match, unmatched), although even here some rows are occasionally missed.
- But as the processing continues, the quality drops: some matches are inconsistent, and the responses become less structured.
How I’m Handling Chunking & API Calls:
- Splitting data into chunks of ~50 records each.
- Processing each chunk separately by sending it to GPT-4o via `ChatCompletion.acreate()` (a simplified version of the loop is shown after this list).
- Introducing a delay (2s) between API calls to avoid rate limits.
- Final pass: Any unmatched records from the first pass are processed again.
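For reference, here is a trimmed-down sketch of what this loop looks like (assuming pandas DataFrames, the pre-1.0 `openai` SDK, and that the two files are sliced into aligned chunk pairs; helper names and the shortened prompt are placeholders, not my exact code):

```python
import asyncio
import json

import openai
import pandas as pd

CHUNK_SIZE = 50  # ~50 records per chunk
SYSTEM_PROMPT = "Match the records and reply with JSON only."  # shortened placeholder


def chunk_records(df: pd.DataFrame, size: int = CHUNK_SIZE):
    """Yield the DataFrame in fixed-size slices."""
    for start in range(0, len(df), size):
        yield df.iloc[start:start + size]


async def match_chunk(source_chunk: pd.DataFrame, target_chunk: pd.DataFrame) -> str:
    """Send one pair of chunks to GPT-4o and return the raw response text."""
    response = await openai.ChatCompletion.acreate(
        model="gpt-4o",
        temperature=0.2,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {
                "role": "user",
                # default=str keeps dates/Decimals JSON-serialisable
                "content": json.dumps(
                    {
                        "source": source_chunk.to_dict(orient="records"),
                        "target": target_chunk.to_dict(orient="records"),
                    },
                    default=str,
                ),
            },
        ],
    )
    return response.choices[0].message.content


async def run_all(source_df: pd.DataFrame, target_df: pd.DataFrame) -> list:
    """Process chunk pairs sequentially with a 2s pause between calls."""
    results = []
    for src, tgt in zip(chunk_records(source_df), chunk_records(target_df)):
        results.append(await match_chunk(src, tgt))
        await asyncio.sleep(2)  # 2s delay between API calls to stay under rate limits
    return results
```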
What I’ve Tried:
- Adjusting chunk size (tried from 30 to 100 records).
- Sending only the columns needed for matching to reduce input size.
- Lowering temperature to 0.2 for consistency.
- Including a system prompt to enforce structured JSON output.
- Ensuring consistent formatting across chunks.
- Adding a retry mechanism for API failures (a simplified sketch of the prompt, column trimming, and retry wrapper follows this list).
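To make those points concrete, here is a simplified sketch of the JSON-enforcing system prompt, the column trimming, and the retry wrapper (the exact prompt wording, backoff values, and column names below are illustrative placeholders, not my exact code):

```python
import asyncio

import openai
import pandas as pd

# Placeholder column names: keep only what is needed for matching to cut input tokens.
MATCH_COLUMNS = ["transaction_id", "date", "amount", "description"]

SYSTEM_PROMPT = (
    "You are a reconciliation assistant. Compare the source and target records and "
    "return ONLY valid JSON with three keys: exact_match, possible_match, unmatched. "
    "Every input row must appear in exactly one of the three lists."
)


def trim_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Drop every column except the ones used for matching."""
    return df[[c for c in MATCH_COLUMNS if c in df.columns]]


async def call_with_retry(user_content: str, retries: int = 3, backoff: float = 5.0) -> str:
    """Call GPT-4o, retrying transient failures with a simple linear backoff."""
    for attempt in range(retries):
        try:
            response = await openai.ChatCompletion.acreate(
                model="gpt-4o",
                temperature=0.2,  # low temperature for more consistent output
                messages=[
                    {"role": "system", "content": SYSTEM_PROMPT},
                    {"role": "user", "content": user_content},
                ],
            )
            return response.choices[0].message.content
        except (openai.error.RateLimitError, openai.error.APIError, openai.error.Timeout):
            if attempt == retries - 1:
                raise
            await asyncio.sleep(backoff * (attempt + 1))
```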
Questions:
- Why might the model’s response degrade as more chunks are processed?
- Is there a better approach to ensure consistency across multiple API calls?
- Is there an alternative where I can upload the full data instead of chunking it?
- Do you know of any other method that is fast, efficient, and trustworthy, given that the records contain financial data?
Any insights or best practices for handling LLM-based matching of two Excel files with large structured datasets would be greatly appreciated!