How to prevent devs from accessing messages?

Hi there!
I am building a service on the Assistants API. I am very privacy-conscious, so I would like to ensure that nobody, not even myself, can look at the messages my users send to my assistant. Right now, if I understand it correctly, I can request any thread from the API if I have its ID and then read all the messages inside. I want this to be impossible.

Is there a way to prevent me and other devs working on the project from being able to read messages? I would like to be able to give that reassurance to my users. Obviously I don’t keep message logs on my own servers either.

Basically, I need threads to be completely inaccessible to anything other than my server creating and executing a run object on them.
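
For reference, the kind of call I want to make impossible looks roughly like this (a sketch, assuming the official openai Python package; the thread ID is a placeholder):

```python
# pip install openai
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Anyone holding a valid API key and a thread ID can list every message:
messages = client.beta.threads.messages.list(thread_id="thread_abc123")
for message in messages:
    print(message.role, message.content)
```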

2 Likes

Hey there and welcome to the community!

This is like asking whether you can protect an admin account from having admin access. It’s a bit of a catch-22 for cybersecurity.

There are a couple of solutions I can think of to the problem you describe; however, this is more of an architectural design question, and your code still needs to be able to access messages to a certain extent.

You, as the developer, and your team need to build the architecture for privacy first. You have to build the structure to be anonymous in order for it to be anonymous.

Have you considered working with encrypted packets? Basically, when the API sends an output, your code can encrypt that output and then pass it around. You can write code that decrypts it so it can be added to a thread or used in whatever way, and you can develop your own functions/classes/whatever that allow developers to work with threads and assistants in certain ways, while preventing them from calling the functions in the base API that let them read the contents.
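
A minimal sketch of that idea, assuming Python and the cryptography package; in practice the key would come from a secrets manager, and deciding who holds that key is what actually enforces the privacy:

```python
# pip install cryptography
from cryptography.fernet import Fernet

key = Fernet.generate_key()  # placeholder; load from a secrets manager instead
fernet = Fernet(key)

def seal(model_output: str) -> bytes:
    """Encrypt API output before it is passed around internally."""
    return fernet.encrypt(model_output.encode("utf-8"))

def unseal(token: bytes) -> str:
    """Decrypt only at the point where the plaintext is genuinely needed."""
    return fernet.decrypt(token).decode("utf-8")
```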

2 Likes

Thank you so much for replying. Part of me was hoping there would be an “enhanced data privacy” mode or something where OpenAI handles this and all I have to do is make sure not to save messages myself. But alas…

You’re right, it’s tricky when not babied by OpenAI the way I wah-wah-want them to. Thank you for the ideas on what to do. I think there needs to be some sort of gateway service that handles all communication with OpenAI and only releases encrypted content, checks request origins, and so on. That still leaves me personally able to circumvent all of that, but at least nobody else can.
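
Something like this is what I have in mind, as a rough sketch (assuming Flask, the openai package, and Fernet for the encryption; the endpoint name and model are placeholders):

```python
# pip install flask cryptography openai
from cryptography.fernet import Fernet
from flask import Flask, jsonify, request
from openai import OpenAI

app = Flask(__name__)
client = OpenAI()                       # reads OPENAI_API_KEY from the env
fernet = Fernet(Fernet.generate_key())  # placeholder; use a managed key

@app.post("/chat")  # hypothetical endpoint
def chat():
    completion = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[{"role": "user", "content": request.json["message"]}],
    )
    reply = completion.choices[0].message.content
    # Only ciphertext leaves the gateway; decryption happens wherever
    # the key legitimately lives (e.g. on the user's device).
    return jsonify({"ciphertext": fernet.encrypt(reply.encode()).decode()})
```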

But since you seem to know a lot about this stuff, do you happen to know whether OpenAI is in compliance with the GDPR in terms of how long they persist threads? If not, I would have to build a solution that keeps track of each thread’s age and deletes it after a while.
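
If it comes to that, I imagine the retention job would look roughly like this (a sketch, assuming the official openai Python package; the registry and the 30-day window are placeholders):

```python
# pip install openai
import time
from openai import OpenAI

RETENTION_SECONDS = 30 * 24 * 3600  # placeholder retention window
client = OpenAI()

# In practice this mapping of thread_id -> creation timestamp would live
# in my own database, written whenever a thread is created.
thread_registry = {"thread_abc123": 1700000000}  # placeholder entry

def purge_expired_threads() -> None:
    """Delete every registered thread older than the retention window."""
    now = time.time()
    for thread_id, created_at in list(thread_registry.items()):
        if now - created_at > RETENTION_SECONDS:
            client.beta.threads.delete(thread_id)
            del thread_registry[thread_id]
```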

Thanks again, you’re awesome!

1 Like

Well, OpenAI is GDPR-compliant; otherwise they would not be allowed to operate in Europe. This is why companies like Anthropic don’t operate in Europe yet (at least from what I’ve seen).

Keep in mind, though, that you as the developer are responsible for maintaining compliance with your own tool. I am not European, so I don’t know the explicit details of GDPR compliance.

1 Like

First of all, the question is: where do you store the thread ID?

If you’d

  • put it into a database,
  • encrypt the data at rest and in transit,
  • use a special system that provides access to prod data,
  • restrict access to that tool so that at least two people have to authenticate,
  • have the tool record sessions automatically,
  • hash the recordings and send the hash to a notary (a hashing sketch follows below),
  • tag the deployed version of the database-access tool,
  • have the code reviewed by independent developers who are paid well per security concern they find and who don’t know each other,
  • block all access to that server except from the deployment pipeline,
  • send a hash of the software together with that tag to a notary whenever you deploy a new version,
  • put a couple of machine guns and grenade launchers in the houses of the key holders,
  • hire security companies to make sure their loved ones are safe,
  • hire private investigators to apply for jobs inside those security companies to double-check that everything goes well there,

still, employees at OpenAI or Azure might have access (and speaking of Azure, we should remind ourselves of the loss of the master signing key).
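
For the two hashing steps above, a minimal sketch (the filename is a placeholder; the resulting digest is what you’d submit to the notary together with the version tag):

```python
import hashlib

def artifact_digest(path: str) -> str:
    """SHA-256 of a build artifact or session recording, streamed in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

# e.g. artifact_digest("release-v1.2.3.tar.gz")  # placeholder filename
```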

So if you really want to go fully secure, you will have to do it yourself, on your own machines. I wouldn’t recommend OpenAI models for cases where you hold data from, let’s say, government organisations, data that, if made public, could cause a world war, the destruction of the planet, or an “unwanted/unplanned” financial crisis. Anything less is OK though, e.g. if it is just, say, medical data or other stuff of lower relevance, because it only involves individual persons’ concerns.

1 Like

And if I can convince the other person to be evil with me, we can unlock the database and leak everything, if we feel like self-destructing or something.

I guess my takeaway is that this kind of privacy is nigh impossible to achieve with OAI, at least for me at this time. I will do my best and pivot to warning users not to send sensitive data, and I’ll be transparent about where the data lives and for how long, and I’ll allow users to delete it at any time, since we can delete threads.

Thanks to both of you for the perspective!

2 Likes

You can unlock the database, but it will be recorded. Combine that with an external anonymous whistle-blower service and an external audit, and you are good to go.

Hmm, ah, OK, self-destructing… then why would you even collect the ID of the request in the first place?

I feel like ChatGPT itself solves this problem. I think you’re suggesting that there is a chat, that the chat happens between the user and OpenAI ONLY, and that you, the domain holder, are not part of that interaction; you only get the result of it. In that case, the move would be to send the user to ChatGPT. Perhaps create a method for the user to be redirected to chatgpt.com, and then have the result of whatever they are doing sent back to you? That could be done. Another use case is that you’re hiding the PROMPT from the user. But cutting you, the domain originator, out of the chat should probably be done on the chatgpt.com domain itself.

I guess if you use the Chat Completions API instead of Assistants, there is no server-side storage of threads/messages to begin with.
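
A rough sketch of the difference, assuming the official openai Python package: with Chat Completions, the conversation history lives only in the request you build yourself, so there is no thread object to retrieve by ID later.

```python
# pip install openai
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The whole conversation is just this list; nothing is stored server-side
# as a retrievable thread, so there is nothing to fetch by ID afterwards.
history = [{"role": "user", "content": "Hello!"}]

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=history,
)
print(response.choices[0].message.content)
```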

1 Like