I see where you’re coming from with your concerns about data ownership changes, especially for users of AI APIs. It’s definitely important for users to feel in control of their data, especially if a platform’s ownership changes and new management decides to alter data use policies.
With OpenAI’s API, I can confirm that inputs and outputs are not used for training unless users explicitly opt in. Since March 2023, stricter data privacy rules have been in place to protect user data. Inputs and outputs are stored only for 30 days, mainly for abuse monitoring and investigating issues, and are then deleted unless legally required otherwise (sources: Maginative, OpenAI).
For APIs like Whisper and embeddings, the same rules apply: no data is used for model training unless you opt in. So for now, developers do have a good level of control. However, I understand your concern: in the event of a change in ownership, these policies could shift, potentially allowing previously protected data to be used differently under new management. This is where advance notifications and user options to manage or delete data make a lot of sense.
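For concreteness, here’s a minimal sketch using the official OpenAI Python SDK of the kind of embeddings and Whisper calls we’re talking about. The model names and the audio file path are just placeholder assumptions; the point is that, under the current API policy described above, these request and response payloads are not used for training unless you explicitly opt in.

```python
# Minimal sketch using the official OpenAI Python SDK (pip install openai).
# Model names and the audio file path are illustrative placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Embeddings call: per the current API data-usage policy, this input is
# not used for training unless you explicitly opt in.
embedding = client.embeddings.create(
    model="text-embedding-3-small",
    input="Customer note that should stay private",
)
print(len(embedding.data[0].embedding))

# Whisper transcription: the same policy applies to the uploaded audio.
with open("meeting.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )
print(transcript.text)
```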
I built Kruel.ai to avoid these kinds of concerns by keeping everything local, meaning I never have to worry about how external policy changes might impact my data. But for those using cloud-based solutions, it’s crucial that platforms provide options like bulk data export and better opt-out policies in case of ownership changes.
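To illustrate the “keep it local” idea (this is not how Kruel.ai is actually built, just a sketch assuming a locally hosted, OpenAI-compatible server such as Ollama on its default port), the same SDK can simply be pointed at a local endpoint so prompts and outputs never leave your machine:

```python
# Illustrative only: point the same SDK at a locally hosted,
# OpenAI-compatible server (assumed here to be Ollama on its default port),
# so prompts and outputs never leave your machine.
from openai import OpenAI

local_client = OpenAI(
    base_url="http://localhost:11434/v1",  # local server, not api.openai.com
    api_key="not-needed-locally",          # placeholder; no cloud key involved
)

response = local_client.chat.completions.create(
    model="llama3",  # whichever model you have pulled locally
    messages=[{"role": "user", "content": "Summarize this private note."}],
)
print(response.choices[0].message.content)
```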
By the way, just to clarify: are you referring to ChatGPT itself (the UI/chat interface) or primarily the API side of things? I ask because while the API gives developers fairly strong control over their data (API traffic is excluded from training unless you opt in), the regular ChatGPT interface can have different implications for data retention and usage.
The policies around ChatGPT sessions are more focused on individual user conversations, and depending on your settings, conversations may be stored in ways relevant to your concerns. Things get more complex if you’re using a custom GPT built by another user or a third-party agent: if that GPT interacts with external systems or depends on external services, there’s always a risk that data is pulled into environments outside OpenAI’s direct control. In that case, if the creator pulls the GPT or has it connected to something external, your data could be at risk without clear transparency about what’s happening with it.
It would be helpful to know whether your concerns are mainly about the API, ChatGPT itself, or custom GPTs interacting with external systems, so we can narrow down the data control and privacy implications you’re referring to.