Hi everyone,
We’re working to improve GPT’s accuracy and consistency when it answers queries over structured tabular data. The objective is for GPT to process these files reliably so that records are imported into our database correctly.
We would appreciate any suggestions, best practices, or insights to optimize GPT’s performance in this area. Your expertise and feedback would be invaluable!
We’ve been working with a dataset that contains fields for events, sessions, and locations. GPT often struggles to generate accurate responses, especially for queries that span multiple rows or involve relationships between these entities. We’ve optimized our prompts and retried many times, but the results remain inconsistent, and getting accurate output takes too many attempts.
Here’s a sample record from our dataset to provide some context:
| charity_id | Events.id | Events.title | Events.brief | Events.description | Events.start_date | Events.end_date | Events.type | Events.createdAt | Events.updatedAt | Events.kind | Events.status | Events.publishState | Events.timezone | Events.visibility | Events.contactor | Sessions.id | Sessions.startTime | Sessions.endTime | Sessions.createdAt | Sessions.updatedAt | Sessions.eventId | Sessions.status | Sessions.hasDate | Sessions.hasWaitList | Sessions.minimumAvailability | Sessions.maximumAvailability | Sessions.registrationDeadline | Sessions.isAvailabilityLimited | Sessions.approxWeekRequired | Sessions.approxHoursOfCommitment | Sessions.name | Sessions.timezone | Cause_Events.id | Cause_Events.eventId | Cause_Events.cause_id | Cause_Events.createdAt | Cause_Events.updatedAt | DevelopmentGoal_Events.id | DevelopmentGoal_Events.eventId | DevelopmentGoal_Events.developmentGoalId | DevelopmentGoal_Events.createdAt | DevelopmentGoal_Events.updatedAt | Locations.id | Locations.fullAddress | Locations.manuallyEnteredCityName | Locations.manuallyEnteredState | Locations.manuallyEnteredCountryName | Locations.isManual | Locations.type | Locations.linkType | Locations.eventId | Locations.createdAt | Locations.updatedAt |
|---------------------------------------|-----------|-------------------|----------------|----------------------------|-----------------------------|----------------|-------------|------------------|------------------|-------------|---------------|--------------------|----------------|------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------|-------------------------|-------------------------|------------------|------------------|----------------|---------------|----------------|-------------------|---------------------------|---------------------------|---------------------------|----------------------------|---------------------------|--------------------------------|--------------|-----------------|----------------|-------------------|-------------------|---------------------|---------------------|--------------------------|--------------------------|--------------------------------|-----------------------------|-----------------------------|-------------|---------------------|-------------------------------------|----------------------------|--------------------------------|----------------|---------------|----------------|----------------|----------------|----------------|
| 58db463f-edc2-4f8a-af5a-d68a733bdd29 | | Test Event Title | Short Summary | Giving Activity Description | 2025-01-08 12:26:49.994+00 | | 0 | | | 0 | 1 | APPROVED | Asia/Manila | 0 | {"name":"Renel Nuestro","email":"renel@catalyser.com","number":"","useDefaultContact":true,"overwriteWithTeamLeaderDetails":true} | | 2025-01-30 17:00:00+00 | 2025-01-30 20:00:00+00 | | | | 1 | TRUE | TRUE | 1 | 10 | 2025-01-30 17:00:00+00 | TRUE | | | Session 1 | Asia/Manila | | | | | | | | | | | | 156 General Luna | Malabon | Metro Manila | PH | TRUE | 0 | -1 | | | |
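To make the shape of the data concrete, here is a minimal Python sketch of how a flat row like the one above could be regrouped by its entity prefix (`Events.`, `Sessions.`, `Locations.`, and so on) before it goes into a prompt. The function names `parse_table_row` and `group_by_entity` and the `_root` key are placeholders we made up for illustration, not part of any library. The sketch assumes that no cell value contains a `|` character and that empty cells can simply be dropped.

```python
import json
from collections import defaultdict


def parse_table_row(line: str) -> list[str]:
    """Split one Markdown table line into stripped cell values.

    Assumption: cell values never contain a literal '|'.
    """
    line = line.strip().removeprefix("|").removesuffix("|")
    return [cell.strip() for cell in line.split("|")]


def group_by_entity(header: list[str], cells: list[str]) -> dict[str, dict[str, str]]:
    """Regroup a flat row into one dict per entity prefix.

    Columns without a prefix (e.g. charity_id) go under the
    hypothetical '_root' key. Empty cells are skipped.
    """
    if len(header) != len(cells):
        # A mismatch means the row is misaligned with the header,
        # so every later value would land in the wrong column.
        raise ValueError(f"expected {len(header)} cells, got {len(cells)}")

    grouped: dict[str, dict[str, str]] = defaultdict(dict)
    for column, value in zip(header, cells):
        if value == "":
            continue
        entity, _, field = column.partition(".")
        if field:
            grouped[entity][field] = value
        else:
            grouped["_root"][column] = value
    return dict(grouped)


# Shortened header and row, using a few of the columns from the sample.
header = parse_table_row(
    "| charity_id | Events.title | Sessions.name | Sessions.timezone | Locations.fullAddress |"
)
cells = parse_table_row(
    "| 58db463f-edc2-4f8a-af5a-d68a733bdd29 | Test Event Title | Session 1 | | 156 General Luna |"
)
record = group_by_entity(header, cells)
print(json.dumps(record, indent=2))
```

The length check is there because a single missing cell in a row this wide shifts every later value into the wrong column, which is hard to spot by eye.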
Our goals:
- Enable GPT to process such data efficiently.
- Generate structured and accurate responses consistently.
Challenges:
- GPT often returns inaccurate output on the first attempt, particularly for multi-row queries.
- Responses require multiple retries, making the process time-consuming.
- Gemini performs better on similar tasks, which suggests either a limitation in GPT or a configuration issue on our side.
Questions for the community:
- Are there best practices or prompt techniques for working with GPT and tabular data?
- Has anyone faced similar challenges, and how did you overcome them?
- Would fine-tuning or specific configuration settings improve GPT’s performance here?
Any guidance, tips, or resources would be highly appreciated.
Thanks in advance!
Catalyser team