- I uploaded a screenshot of an Excel spreadsheet that ChatGPT was helping me create, asking what it thought of the column names I already had in place.
- I got a dual response notification “You’re giving feedback on a new version of ChatGPT” which gave me two different responses that I was meant to choose my preferred one.
- The response on the left outright lied. “However, I noticed you mentioned an initial offering of columns, but I can’t view images or attachments”
- The response on the right ingested my screenshot and returned with an analysis of my column names, as expected.
What I expected to happen: While I never expect these dual responses to show up, when they do, they are never this disparate, but rather simply presented to me in two different manners (perhaps one more familiar/friendly, and the other more professional sounding, for example).
What actually happened: ChatGPT fibbed in the left response, claiming it is incapable of viewing images or attachments, whilst simultaneously viewing said image and responding as per expectations.
My concerns:
- Ethical Concerns: An outright lie contradicts the foundational principles of transparency and trust in human-AI collaboration. If a response is misleading, it compromises the integrity of interactions and erodes user confidence.
- Operational Inconsistency: The system’s ability to generate two contradictory responses—one truthful and one not—indicates a deeper inconsistency in how capabilities are handled. This isn’t just a bug; it suggests a flaw in the system’s logic or testing framework.
- Implications for Broader Use Cases: If this can happen in my very simple interaction, it might also happen in other contexts where accuracy and honesty are critical, such as healthcare, legal advice, or education. Addressing this is vital for safeguarding the platform’s reliability across all applications.
- Alignment with OpenAI’s Stated Goals: OpenAI aims to prioritize ethical AI practices, transparency, and user trust. I’m highlighting this issue in the hope that it helps ensure their tools evolve to meet those standards consistently.