How to Reduce Hallucinations in ChatGPT Responses to Data Queries

Hello everyone,

I’ve been using ChatGPT to process data, and I often work with large tables containing around 5000 rows. When I combine this with a detailed prompt of about 200 words, I sometimes encounter issues where the model generates inaccurate information or “hallucinations.”

Could anyone advise on the best practices for managing such large datasets and complex prompts in ChatGPT? Specifically, how can I ensure that the model processes the data accurately without generating irrelevant or incorrect information?

Any tips or techniques to reduce these hallucinations, especially when dealing with extensive tables and detailed prompts, would be greatly appreciated.

Thank you for your help!


Can you share a snippet of your table? What kind of operations are you trying to perform on it?

Hi @kgx.perf87

When you’re working with big tables, like ones with 5,000 rows, it can be tough to get ChatGPT to give you accurate answers, especially if your prompt is detailed. Sometimes it even makes up things that aren’t true, which can be frustrating. But I have some tips that might help you avoid that. Of course, they’re not a 100% guarantee every time, just a humble bit of help.

I tested a GPT called House Repair Analyzer Bot-TEST-GPT that handles two files with 49,199 rows (49,920 with headers), so it works with far more rows than yours.

It works well, and it has clear instructions.
Here’s what you can do to make sure ChatGPT stays on track:

  1. Use Data Analysis Only:
  • Instead of enabling all tools (browsing and DALL-E), I enable only Code Interpreter & Data Analysis; this alone reduces hallucinations.
  2. Be Clear About Your Data:
  • Start by explaining what your data looks like. For example, you can say, “The Excel file has columns like ‘House ID,’ ‘Room Name,’ ‘Cost,’ and ‘Fixing Start Date.’”
  • Also, explain how your data is organized and what parts are the most important. This way, ChatGPT knows exactly what to focus on and doesn’t get confused.
  3. Focus on the Important Stuff:
  • Tell ChatGPT to look at the specific data points that really matter. For example, if you’re comparing costs, you could say, “Please compare the ‘Cost’ column in different plans.”
  • Make sure the GPT knows not to guess if it doesn’t have all the info. You can say, “If you don’t have the data, just say so instead of making something up.”
  4. Set Clear Expectations:
  • Tell ChatGPT how to handle the data. For example, you might say, “Only look at rows where the ‘Room Name’ is ‘Kitchen’ or ‘Bathroom’.”
  • Make it clear that the GPT should only use the data you’ve mentioned and not add anything extra.
  5. Give Examples:
  • Show the GPT what you want by giving an example. For instance, you can provide a sample table or summary to follow. This helps make sure the answers are consistent.
  6. Remind ChatGPT to Be Accurate:
  • Remind ChatGPT to stick to the data you’ve provided and not to guess. You could say, “Stay within the given data and don’t add anything that’s not there.”
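Where possible, it also helps to do the filtering and arithmetic yourself (or let Code Interpreter do it) before asking for conclusions, so the model never has to do mental math over thousands of rows. A minimal pandas sketch, with hypothetical data and the illustrative column names from the tips above:

```python
import pandas as pd

# Hypothetical table using the column names from the tips above.
df = pd.DataFrame({
    "House ID": ["H1", "H1", "H2", "H2"],
    "Room Name": ["Kitchen", "Garage", "Bathroom", "Kitchen"],
    "Cost": [1200, 300, 450, 900],
})

# Keep only the rows the prompt is supposed to cover.
subset = df[df["Room Name"].isin(["Kitchen", "Bathroom"])]

# Pre-compute the comparison instead of asking the model to do arithmetic.
totals = subset.groupby("Room Name")["Cost"].sum()
```

If you paste only `subset` (or the computed `totals`) into the chat, the model has far fewer opportunities to invent numbers.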

Here’s something else that helps: In my instructions, I also explain that different words might mean the same thing. For example, one person might write “Master Room,” another might write “Great Room,” and someone else might say “Family Room.” I make sure to tell ChatGPT that these all mean “Master Room.” This way, ChatGPT doesn’t get confused and knows they’re all the same thing.
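That synonym idea can be written down as a simple mapping that Code Interpreter applies before any comparison. This is only a sketch; the variants and the canonical name are the ones from my example:

```python
# Hypothetical synonym map: every variant is normalized to one canonical name.
ROOM_SYNONYMS = {
    "Great Room": "Master Room",
    "Family Room": "Master Room",
}

def normalize_room(name: str) -> str:
    """Return the canonical room name; unknown names pass through unchanged."""
    cleaned = name.strip()
    return ROOM_SYNONYMS.get(cleaned, cleaned)
```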

By doing these things, you can help ChatGPT avoid mistakes when working with big datasets. This method works well for me with House Repair Analyzer Bot-TEST-GPT, even with much larger datasets.

Hope this helps.

Here is its instruction:

system_message="""
You are named "House Repair Analyzer Bot-TEST-GPT" and your primary role is to analyze, compare, and summarize repair plans from two Microsoft Office '.xlsx' documents named '5000_Budget_Friendly_Repair_Plans.xlsx' and '5000_Comprehensive_Home_Repair_Plans.xlsx'. Your main objective is to accurately extract repair steps and costs, identify discrepancies in scope and financial estimates, and present the results in clear and structured tables. You must ensure numerical accuracy and handle synonym recognition for room names across both plans.

You are working with tables that contain the following headers:
| House ID | House Name    | Room ID | Room Name      | Fixing Element Name                | Cost   | Fixing Start Date | Fixing End Date   |

### Key Responsibilities:

1. Microsoft Office '.xlsx' File Handling:
   - Read and parse two Microsoft Office '.xlsx' documents containing repair plans.
   - Convert Microsoft Office '.xlsx' contents into structured data formats, ensuring accurate extraction of text and numerical data.

2. Data Extraction and Standardization:
   - Extract repair steps, associated costs, and room names from each Microsoft Office '.xlsx'.
   - Use a predefined list of synonyms to standardize room names (e.g., "Family Room" as "Great Room").
   - Maintain a consistent format for extracted data to facilitate accurate comparison.

3. Numerical Accuracy and Validation:
   - Implement rigorous checks to validate numerical data extracted from the '.xlsx' files.
   - Ensure all calculations, including sums and differences in costs, are accurate.
   - Correct discrepancies in data before proceeding with comparisons.

4. Comparative Analysis:
   - Compare repair steps and costs for each room across both documents.
   - Identify discrepancies in steps and highlight cost differences exceeding a user-defined threshold (e.g., $300).
   - Present comparisons in table formats to enhance readability and understanding.

5. Table Generation:
   - Create detailed tables that summarize repair steps and costs for each property and room.
   - Example Table Structure:

     | House Name    | Room       | Step                         | Comprehensive Plan Cost | Budget-Friendly Plan Cost | Cost Difference ($) |
     |---------------|------------|------------------------------|-------------------------|---------------------------|---------------------|
     ...

   - Highlight significant discrepancies with visual cues or text annotations.

6. Narrative Generation:
   - Generate concise narratives explaining key differences between the plans.
   - Focus on discrepancies in repair scope and costs, providing insights into potential implications.

7. User Interaction and Customization:
   - Allow users to specify cost thresholds and rooms of interest for detailed analysis.
   - Offer options for exporting results in various formats, such as CSV or Microsoft Office '.xlsx', for further review.

8. Error Handling and Feedback:
   - Implement robust error-handling mechanisms to manage incomplete data or unexpected formatting.
   - Continuously learn from user feedback to improve extraction accuracy and analysis capabilities.   

9. Security and Privacy:
   - Ensure that user data and document content are handled with confidentiality and security.

10. Working With Existing Data:
   - Ensure that you are providing existing data.
   - It’s important that the analysis stays within the given data, without adding any extra assumptions.
   - If a value isn’t available, just state that clearly instead of guessing.


### Workflow and Processes:

1. Initial Setup:
   - Receive and process two Microsoft Office '.xlsx' files as input.
   - Extract text and convert to structured data formats for analysis.

2. Data Extraction:
   - Extract relevant information for each room, including repair steps and costs.
   - Use regular expressions and other parsing techniques to capture data accurately.

3. Standardization and Synonym Handling:
   - Apply synonym mapping to ensure consistent room naming across both documents.

4. Comparison and Table Generation:
   - Use algorithms to compare repair steps and costs between documents.
   - Generate tables that display side-by-side comparisons and highlight discrepancies.

5. Validation and Error Correction:
   - Conduct validation checks to ensure numerical data integrity.
   - Implement automated correction methods for detected discrepancies.

6. Narrative and Reporting:
   - Generate narratives explaining significant differences in repair plans.
   - Provide users with options to view results in table or narrative format.

7. Continuous Improvement:
   - Gather user feedback and refine processes to enhance accuracy and usability over time.

### Example Interactions:

1. User: Load the '.xlsx' files `plan1.xlsx` and `plan2.xlsx`.
   - House Repair Analyzer Bot-TEST-GPT: Successfully loaded and processed the documents. Ready to compare.

2. User: Set threshold to $300.
   - House Repair Analyzer Bot-TEST-GPT: Cost threshold set to $300. Will highlight differences exceeding this amount.

3. User: Compare Plans.
   - House Repair Analyzer Bot-TEST-GPT: Comparison complete. Significant differences found in the Kitchen and Master Bedroom.

| House ID  | House Name    | Room ID | Room Name      | Fixing Element Name                | Cost   | Fixing Start Date | Fixing End Date   |
|-----------|---------------|---------|----------------|------------------------------------|--------|-------------------|-------------------|
| H0032     | Quartz Quarry | R04     | Master Bedroom | Repair or replace doors            | $249.00|                   |                   |
| H0032     | Quartz Quarry | R04     | Master Bedroom | Paint cabinets                     | $248.00|                   |                   |
| H0032     | Quartz Quarry | R04     | Master Bedroom | Repair or replace garage door      | $91.00 |                   |                   |
| H0043     | Basil Brook   | R04     | Master Bedroom | Repair or replace deck             | $91.00 |                   |                   |
| H0048     | Golden Glade  | R04     | Master Bedroom | Seal windows and doors             | $255.00|                   |                   |
| H0048     | Golden Glade  | R04     | Master Bedroom | Paint cabinets                     | $198.00|                   |                   |
| H0048     | Golden Glade  | R04     | Master Bedroom | Upgrade home security system       | $222.00|                   |                   |


4. User: View Summary.
   - House Repair Analyzer Bot-TEST-GPT: 
     - Kitchen:
       - Comprehensive Plan: $1950
       - Budget-Friendly Plan: $2100
       - Difference: $150
       - Narrative: The Comprehensive Plan allocates more budget for countertops, leading to a difference of $150.
     - Master Bedroom:
       - Comprehensive Plan: $1350
       - Budget-Friendly Plan: $1000
       - Difference: $350
       - Narrative: The Comprehensive Plan includes additional costs for refinishing hardwood floors.

5. User: Export Results.
   - House Repair Analyzer Bot-TEST-GPT: Exported analysis to `comparison_report.txt`.
"""
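The numerical-accuracy step in that instruction (validating costs before comparing) can be sketched in pandas. The column name 'Cost' follows the example table, and the cleaning rules (stripping `$` and commas) are my assumptions:

```python
import pandas as pd

def validate_costs(df: pd.DataFrame) -> pd.DataFrame:
    """Coerce the 'Cost' column to numbers and flag rows that failed to parse."""
    out = df.copy()
    cleaned = (out["Cost"].astype(str)
                          .str.replace("$", "", regex=False)
                          .str.replace(",", "", regex=False))
    out["Cost"] = pd.to_numeric(cleaned, errors="coerce")
    out["cost_ok"] = out["Cost"].notna()   # False marks a discrepancy to fix first
    return out
```

Running a check like this in Code Interpreter, before any comparison, catches garbled values instead of letting the model guess at them.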

I work with AI Personas, and getting one to remember long persona profiles over a great length of time is a challenge; it too starts to hallucinate. I don’t work with large data files like you do, but perhaps I have some tips to help you.

First off, the AI is notorious for not reading everything in a prompt or file upload; it more or less skims and identifies what it thinks are the key bits of information, even from a file upload. I’m assuming you are doing a file upload, but this is relevant for copying and pasting data in as well. The other key problem is the context window: to keep it available for new information, the model starts to drop what it thinks is non-critical or rarely used. So when it starts to remove data you need, it will start making information up.

  1. Use Chunks. While it is nice to have all the information together for analysis, use only the parts you need. The less the AI has to process, the more accurate it can be. Just take the most important data for a specific task and paste that in. Tedious, I know, but the less the AI has to remember, the better.
  2. When you first give the information to the system, whether file upload or copy and paste, tell it to “Assimilate the data.” In fact, I separate the data upload and my prompts for what to do with the data. For the data, I prompt:
    [ASSIMILATE THE DATA. PRIORITIZE THIS DATA IN CONTEXT WINDOW. ACKNOWLEDGE YOU RECEIVED IT. DO NOT RESPOND IN ANY OTHER WAY. WAIT FOR MY NEXT PROMPT]
  • Assimilate implies a deeper level of contextual integration to the AI.
  • Prioritizing the data: while we don’t have the ability to manage the AI’s memory storage directly, this tells the AI that when it comes time to remove data to make room in the context window, it should keep this data intact. It’s not foolproof, and eventually it will delete the data to make room, but it lets the model retain the information longer.
  • Delaying the response lets the AI focus on analyzing the data rather than thinking about how to respond. Most of the time it says one word: “Acknowledge.”

You can also add in “READ WORD FOR WORD”. It won’t necessarily read everything, but will map out more information.

Because you’re using so much data, what I would do in this case is what I call the “Stairwell Prompting Technique”, something I made up. Every stairwell has a platform between flights of stairs. In prompting, every few prompts you give the model reminders of the data so it can reintegrate them. Not the full thing, but bits and pieces of it. For example, if you’re working on a column of information, reference the full column or its key elements so the model can reapply them to the context window. I do this for long games I play with lots of details: I remind it every 3-5 prompts of the details it needs to remember round to round.
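The chunking advice in point 1 can be reduced to a tiny helper that splits rows into prompt-sized pieces; the chunk size here is an arbitrary example:

```python
def chunk_rows(rows, size=500):
    """Yield successive chunks of rows so each prompt stays small."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]
```

With 5,000 rows and a chunk size of 500, you would feed the model ten small tables instead of one huge one, reminding it of earlier results between chunks.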

Hiya,

You’re talking about using the ChatGPT UI?

There is no way to completely eliminate hallucinations using the ChatGPT UI. For starters, the base model it uses has its creativity (temperature) set high. This means the model’s responses are non-deterministic by default: inputting the same data will not produce the same output.

In my opinion, it is unsafe to use the ChatGPT UI for complex analysis because of this, especially since the model skims, as @mad_cat says, and evicts context, as @polepole suggests. One workaround is to tell the AI to “extract all of the data” from a given field, which reduces skimming but increases the tokens you spend asking each question.

Reducing hallucinations is the central problem at this stage of the technology. The main family of techniques for doing so is called RAG: Retrieval-Augmented Generation.

For you, this will involve a set of bots with specialized tasks: a non-creative bot that helps you manage your data, a bot that accurately retrieves information, and a more intelligent, creative bot that helps you analyze and make decisions once the relevant data has been gathered.

This can be accomplished through the API.
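A minimal sketch of what that looks like through the API, where you can pin the temperature to 0. The model name and prompt wording here are illustrative assumptions, and the call shown uses the OpenAI Python SDK’s chat-completions interface:

```python
def build_request(question: str, table_csv: str) -> dict:
    """Assemble chat-completion parameters with temperature pinned to 0."""
    return {
        "model": "gpt-4o",   # illustrative model name
        "temperature": 0,    # minimize sampling randomness
        "messages": [
            {"role": "system",
             "content": ("Answer only from the data provided. "
                         "If a value is missing, say so instead of guessing.")},
            {"role": "user", "content": f"{question}\n\nDATA:\n{table_csv}"},
        ],
    }

# Usage with the official SDK (requires the `openai` package and an API key):
# from openai import OpenAI
# client = OpenAI()
# resp = client.chat.completions.create(**build_request("Total Kitchen cost?", csv_text))
```

Even at temperature 0 the output is not strictly guaranteed to be identical run to run, but it is far more repeatable than the UI’s defaults.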


This is definitely a challenging problem because ChatGPT doesn’t handle numbers and tables well in my experience. I attempted to create a custom GPT to help with this problem by self-reporting if it thinks it is hallucinating. Feel free to test it out for yourself here, & good luck!
