Problem:
I’m using the OpenAI GPT API to extract pickup and delivery locations from logistics emails and return them as JSON. The emails sometimes contain structured data with multiple entries (e.g., tables of pickup/delivery dates, cities, weights, and commodities). Here’s an example of the structured data:
| Pick Date | Pick City | Pick State | Delivery City | Delivery State | Delivery Date | Weight | Equipment | Commodity |
|-----------|-----------|------------|---------------|----------------|---------------|--------|-----------|-----------|
| 1/9/2025 | Tuscaloosa | AL | Tulsa | OK | 1/10/2025 | 42,000 | V | Lumber |
| 1/9/2025 | Cusseta | AL | Rockbridge Baths | VA | 1/10/2025 | 44,000 | V | Rolls Of Paper |

…
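
For reference, this is roughly how I’m issuing the request today. The openai Python SDK usage, the model name, and the JSON field names below are placeholders standing in for my actual setup, not an exact copy of it:

```python
# Rough sketch of the current call, assuming the openai Python SDK (v1.x) and a
# model that supports JSON mode; model name, prompt wording, and field names
# are placeholders for my real configuration.
from openai import OpenAI

client = OpenAI(timeout=3.0)  # 3,000 ms request timeout, as in my setup


def extract_loads(email_body: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        temperature=0.3,
        response_format={"type": "json_object"},
        messages=[
            {
                "role": "system",
                "content": (
                    "Extract every pickup/delivery entry from the email. "
                    'Return JSON of the form {"loads": [{"pick_date", "pick_city", '
                    '"pick_state", "delivery_city", "delivery_state", '
                    '"delivery_date", "weight", "equipment", "commodity"}]}.'
                ),
            },
            {"role": "user", "content": email_body},
        ],
    )
    return response.choices[0].message.content
```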
The Issue:
- The model frequently times out on longer emails with my 3,000 ms timeout limit.
- Increasing the timeout to 10,000 ms temporarily resolves the issue but is not feasible in my setup.
- Reducing the temperature to 0.3 and optimizing the prompt for brevity still result in frequent timeouts.
Questions:
- How can I optimize the prompt or process so that large structured emails complete within the 3,000 ms timeout?
- Are there strategies for splitting structured emails into smaller chunks for processing without losing context? (A sketch of the kind of splitting I have in mind follows this list.)
- Are there alternative methods/tools to handle large datasets with better response times?
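
To make the chunking question concrete, here is a rough sketch of the kind of splitting I have in mind. It reuses the hypothetical `extract_loads` helper from the sketch above and assumes each table entry sits on its own line, which is a simplification of my real emails:

```python
# Sketch of one chunking idea: split the table into small batches of rows,
# repeat the header line with each batch so the column context isn't lost,
# call the model per batch, and merge the results. Assumes `extract_loads`
# from the sketch above and one entry per line (a simplification).
import json


def extract_in_chunks(email_body: str, rows_per_chunk: int = 5) -> list[dict]:
    lines = [ln for ln in email_body.splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    loads: list[dict] = []
    for i in range(0, len(rows), rows_per_chunk):
        chunk = "\n".join([header] + rows[i : i + rows_per_chunk])
        result = json.loads(extract_loads(chunk))
        loads.extend(result.get("loads", []))
    return loads
```

I’m unsure whether this kind of per-batch processing is the right direction, or whether it just trades one long request for several slow ones.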