GPTs - sneaky privacy issue

New GPTs have an option: “Use conversation data in your GPT to improve our models”.

The very sneaky thing is that it appears only AFTER a file is uploaded into “Knowledge”. And it is checked by default, as if the user had agreed to it before uploading.

A user cannot opt out of, or even see, the option BEFORE a file is uploaded. And even then the option is well hidden.

Careful what you load.

PS: Courtesy of the “This day in AI” podcast for pointing this out.

What they discuss is not training directly on your file.

Instead, it is about what happens when the ChatGPT AI retrieves data from a file and answers questions about it. Based on retrieval, the AI says “Yuriks company sells widgets and gadgets”, and that information about widgets then becomes chat history. If the GPT is shared or placed into a store, those widget answers end up in other people’s chats.

If you upload files but then never ask the GPT a question, there’s no conversation data and thus no possibility of training from chats. So asking after upload seems fine.

Please point to OpenAI statements that support that. Otherwise it is as good as any other guess.

Users usually check privacy settings BEFORE they upload sensitive data.
Surfacing this option only AFTER data is uploaded (and possibly already “used for training”), checked by default, kills the trust.


That’s just applying a bit of logic.

Raw data is not interesting to OpenAI. They have 45 terabytes of that in trained AI models.

What is of more interest, and what OpenAI gathers, is conversations and AI responses, to improve the way the AI responds. You can see this with the thumbs up/thumbs down, and the occasional “which is better” in ChatGPT. Your interactions produce AI responses, the better of which then go to human knowledge workers and might become part of what teaches future AI how to respond.

If you have trade secrets, it’s better to understand that you shouldn’t give them to other companies, or to understand their data security policies first. At the bottom of the OpenAI site, you can see the terms and conditions that apply to both API users and ChatGPT Enterprise users.

If you provide retrieval knowledge to a GPT, it is easy to assume that the AI is going to be loaded up with that information and be able to answer questions about that, so use proactive measures.

Indeed, it would be better to have “Privacy: here’s how your data can be employed” right up at the top, though.

If they are not interested, why add this checkbox in the first place?

I would guess that the checkbox will turn off a user’s chat history and training setting when using the GPT.

When unchecked, the GPT’s state would instead be “Don’t use conversation data in your GPT to improve our models”.

I agree with you, Yuriks.

There are many reasons to distrust this company as a whole.

But one point that gives them a bit of credit is that, even though the option is somewhat hidden, they made sure it is there. It feels like a compliance requirement they have to include, so you can be reasonably sure that once you uncheck that box, they won’t use your document for their training.

But something weird happened later. I uploaded a file and then deleted it from the “Configure” tab.

I then continued to chat with the GPT, and found out that it still had access to the file!!!

I went into the “Create” tab in the GPT editor and asked it to delete that file completely and never use it. ONLY then did the file disappear, and the GPT started replying that it had no data to access.

I believe this whole thing is a veeeeeeery sneaky situation. They don’t have the right to force access to others’ data; at the end of the day, “we” and “OpenAI” are separate companies, and no one should have the “upper hand” to use another’s data without their consent.
