OpenAI - We need to talk about org verification

Today, I’m going to demonstrate that OpenAI’s org verification is ineffective and an intrusion on our privacy, despite privacy being at the “core” of their products. Hopefully, OpenAI will make the right decision and put this ID checking practice in the past.

Background

For those unaware, OpenAI is terrified of their models being distilled off-platform. Distillation is the process of harvesting output data from an LLM and then using that data to train a different LLM. When DeepSeek came about, OpenAI was furious. Their spokesperson stated, “DeepSeek may have inappropriately distilled our models,” and, “[w]e take aggressive, proactive countermeasures to protect our technology and will continue working closely with the U.S. government to protect the most capable models being built here.”

Yes, as if they had completely forgotten about their own pre-training process, OpenAI stated publicly that they are calling up the US government because DeepSeek trained a model off GPT-4 outputs.

OpenAI fights distillation

Frightened by the news that closed-weights models aren’t as special as they thought, OpenAI sprang into action. They partnered with a company called Persona, an ID-checking firm far less well known, and decidedly shadier, than other industry leaders.

So why is Persona better than reputable industry leaders like Stripe? Let’s look at their privacy policy!

Uncovering the ugly truth

The images obtained from government identification document and photos of your face that you upload, and data from scans of facial geometry extracted from the government identification document and photos of your face that you upload, are collected, used and stored directly by Persona on behalf of Customer as Customer’s service provider through Customer’s website or app that you accessed.

Persona will permanently destroy data from scans of facial geometry extracted from the photos of your face that you upload upon completion of Verification or within three years of your last interaction with Persona, consistent with the Customer’s instructions unless Persona is otherwise required by law or legal process to retain the data.

When you upload your ID and scan your face, all of that is stored by Persona. Your face scans are deleted after three years at the most, but your ID is presumably saved indefinitely. Because that’s definitely necessary.

But at least all they’re doing is confirming that you aren’t lying about who you are, and they wouldn’t share this sensitive information with their customer OpenAI… right?

[Retrieve a Government Id Verification]

Welp. Turns out they can query your government ID through the API just as easily as you can query an answer from gpt-4.1 on OpenAI’s API.
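
To make that concrete, here’s a rough sketch of what retrieving a completed verification through Persona’s REST API might look like. The endpoint path, auth header, and response fields below are assumptions extrapolated from the doc page linked above, not verified details of Persona’s API:

```python
import requests

# Rough sketch of the "Retrieve a Government ID Verification" call linked
# above. The URL path, auth header, and response fields are assumptions,
# not confirmed details of Persona's API.
API_KEY = "persona_live_..."   # the customer's (i.e. OpenAI's) API key
VERIFICATION_ID = "ver_..."    # assigned when you completed verification

resp = requests.get(
    f"https://withpersona.com/api/v1/verification/government-ids/{VERIFICATION_ID}",
    headers={"Authorization": f"Bearer {API_KEY}"},
)
resp.raise_for_status()

# If this sketch is right, the payload would include the fields extracted
# from your ID (name, birthdate, document number) and the captured images.
print(resp.json()["data"]["attributes"])
```

The exact shape of the call hardly matters; the point is that a single API key is all that stands between your government ID and whoever holds that key.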

So, what are the issues with Persona here?

OpenAI’s partnership with Persona is undoubtedly shady. You would expect that when you verify your identity, the information you provide is used only to confirm who you are, and is neither saved nor shared around. But this plainly isn’t the case. When you perform verification in your OpenAI account, here’s what happens:

  1. Persona saves your documents indefinitely.
  2. OpenAI has the ability to programmatically retrieve your documents from Persona using a well-documented API.
  3. And this is all supposedly so OpenAI can be assured you aren’t storing outputs to train with them later. Yikes.

OpenAI’s org verification: malicious, or hysterics?

Obviously, OpenAI granting itself the same investigative privileges as the government, because o3 is apparently just that good (it is not), is extremely invasive.

Now, it’s impossible to tell whether OpenAI has some secret, more malicious motive, or whether they seriously, truly think that ID checks are enough to stop an armed and dangerous distiller. However, as an exercise to show that org verification really doesn’t do anything, we can brainstorm ways DeepSeek, or “the P.R.C.” as OpenAI calls them, could easily achieve distillation despite these checks, rendering them completely useless:

  • An adversary could simply complete the ID checks, or pay others to do them.
  • An adversary could use stolen accounts or API keys (and give OpenAI an actual reason to call the cops this time).
  • An adversary could type “site:chat.openai.com/share” into Google. Or, more realistically, automate something similar (see the sketch after this list).
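
To show how little effort “automate something similar” would take, here’s a minimal sketch. The search endpoint, its parameters, and the response shape are placeholders for whatever index or SERP service an adversary might use, not a real API:

```python
import requests

# Hypothetical sketch: harvesting publicly shared ChatGPT conversations via
# a search engine's API. SEARCH_API_URL, its parameters, and the response
# shape are placeholders, not a real service's interface.
SEARCH_API_URL = "https://search-api.example.com/search"
QUERY = "site:chat.openai.com/share"

shared_links = []
for page in range(1, 6):  # walk the first five result pages
    resp = requests.get(SEARCH_API_URL, params={"q": QUERY, "page": page})
    resp.raise_for_status()
    shared_links += [result["url"] for result in resp.json()["results"]]

# Each link is a full conversation (prompts and outputs) that anyone can
# scrape: no account, no API key, and no ID check required.
print(f"Collected {len(shared_links)} shared conversations")
```

None of this touches OpenAI’s API at all, so org verification never even enters the picture.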

Admittedly, these aren’t the easiest things to do, so you and I with our $10 budgets certainly won’t be distilling OpenAI’s models anytime soon. But what about a nation-state adversary that appears to have already distilled OpenAI’s reasoning models? Let me know what you think.

OpenAI hiding limitations

This is a bit of a bonus rant, so I’ll divide it into sections in case you’re interested.

Org verification requirement is now omitted from announcements and emails.

When OpenAI released org verification, their announcements and emails were upfront about it being required for certain offerings. But in recent announcements and emails, disappointingly, OpenAI routinely fails to mention the ID-check requirement, despite managing to disclose pricing. I guess invasive and unnecessary ID checking is a bad look?

GPT-4 was originally falsely advertised as having vision capabilities.

This is unfortunately not a new problem. When OpenAI touted that their work-in-progress GPT-4 could see images, users flocked to upgrade to Plus once GPT-4 was added to the subscription. However, OpenAI failed to mention anywhere that GPT-4 in ChatGPT could not receive image inputs. Coupled with constant uptime issues and a literally non-existent support team (at the time), users filed chargebacks with their banks en masse. The Stripe payment gateway on OpenAI’s platform mysteriously shut down for a day or two during this. I won’t speculate… but I believe the ban hammer spoketh that day.

ChatGPT is no longer imperfect and can’t make mistakes.

There was also a time when ChatGPT had a helpful popup making you acknowledge that AI could produce harmful or inaccurate responses. Then it turned into a little disclaimer underneath the text input. Now it’s gone entirely; test it yourself in incognito mode. The result? Average users who don’t know what AI is or how it works are in for a surprise when it produces seemingly malicious translations or claims to be alive. OpenAI didn’t remove this text because they couldn’t afford to maintain it. They removed it because it’s bad for business.

An OpenLetter to OpenAI

I think OpenAI has some amazing technology with an amazing interface, and it makes good on its mission to provide AI that benefits humanity. Google’s interface and documentation are messy and confusing; OpenAI’s is clean and tidy. Google follows the trend in AI; OpenAI defines it.

There is no doubt that OpenAI is making incredible history here. So when I see OpenAI begin an announcement with an awesome quote from the COO, like…

Trust and privacy are at the core of our products.

… I really would like to think that OpenAI truly means it. But asking customers to hand their government ID over to them - to be stored indefinitely and retrievable at will - just to access o3, reasoning summaries, and the new image generation, all announced like any of OpenAI’s other offerings, seems to me like a sizeable contradiction, especially since DeepSeek clearly wasn’t slowed down by it.

So, please retire org verification and pursue other ways to beat the competition. It isn’t effective and it undermines the privacy of all who submit to it.

16 Likes

This likely even violates GDPR. Under Art. 9, biometric data (like face scans from government IDs) is special-category data that requires explicit, freely given consent. If OpenAI makes verification mandatory to access certain models without offering a genuinely equivalent non-biometric alternative, that consent isn’t “freely given” (see Recital 43 GDPR).

The processing would be unlawful unless they can justify it under another Art. 9(2) exception—which seems unlikely here since biometric verification isn’t strictly necessary for API access (PINs, 2FA, or manual review would work).

Not a lawyer, but this looks problematic for EU users.

5 Likes

I’ve had to go through these invasive government ID inspection, biometric scanning, video recording, and personal data gathering exercises only twice in the past. Once was to access sensitive tax information from the US government. The other time, it was needed to make a large bank wire transfer while I was out of the country.

In both of those cases, as much as I hated going through that process, it was clearly necessary to protect my money and my sensitive financial data. I have never, and will never, be subjected to that invasive process for any reason that doesn’t reach those levels of criticality. As more people agree to be analyzed and tracked by companies like Persona just to use a basic service, more companies will force their users to be subjected to these privacy violations to access basic services.

For those who hadn’t noticed it in the fine print of Persona’s privacy policy: you’re also required to give OpenAI and Persona (and their unnamed service providers) continuous access to your mobile phone account and usage information for as long as you’re an OpenAI customer. I’ll just wait until GPT-5 reaches general availability, if we’re still interested at that point.

3 Likes

I see issues with GDPR, too. I had a similar discussion during the audit process for code-signing certificates. The organization would say, “yes, of course, we respect privacy and protect your data”. But when you ask how exactly, the answer is always some generic “trust us, we really do”. IMHO, the move to collect and process sensitive personal data, not business data, without full disclosure of which organisation is doing what with this data is not at all compliant with GDPR.

Just to illustrate, I toyed with Persona’s chatbot. Let’s say I have at least the same legitimate concerns about both OpenAI’s and Persona’s identities - after all, they are just a website to me, and they want me to trust them with handling personal data. So, fine, “tit for tat” - you go first and show me your CEO’s biometric ID, and I promise I will be really committed to ensuring privacy and to destroying the file after validation is complete. Or, OK, not the CEO. Anyone who actually works there. Of course, the bot was not convinced - see the picture.

Took a while to realize I couldn’t use gpt-5 because it was locked behind something I don’t feel comfortable doing, but gpt-4.1 isn’t. So now I am left with 4 options:

  1. Just don’t use anything from OpenAI, as they are pushing us toward an Orwellian future. I don’t want to live in that place and don’t want to pay to build it. There are other people doing AI who don’t ask for this.
  2. Just don’t use anything that requires age verification, e.g. gpt-5.
  3. Look for somebody else I can pay to get access to gpt-5, and use them.
  4. Bend over and give my photo to 2 parties, probably more.

Option 1 is the one that sits best with me; option 4 is not being considered. Kind of annoying, as I created an account to get access to gpt-5, but it didn’t go very far.

3 Likes

Since the bot wasn’t insightful, I informed myself through the website:

Privacy Shield
“Persona has certified with Privacy Shield Framework”.

That’s great! Sounds like some external organization approved your procedures. Let’s look up details:

www.dataprivacyframework.gov/list

Persona Identities, Inc.
Active Participant
Non-HR Data
Verification Method: Self-Assessment

Wait… “Self-Assessment”??? OK, so for the most central claim of your business:

Security and privacy at our core
Trust is built on security and privacy. That’s why Persona adheres to the highest industry standards, maintaining compliance and certifications to safeguard you and your customers.

You didn’t even once have an external auditor validate your claims??

I’m starting to believe withpersona.com really IS a cat with web skills…

1 Like

Nobody knows what animal you really are on the internet, I guess.

Or what special interest is invested in what profile’s pixels we see.

Thank you for digging into the root of the company… and shining a light on the deception in there.