I am working on an application that uses the user’s own identity documents to help them enter information faster and more accurately. While testing a GPT I am developing in user preview, I tried to upload the bio page of my passport and it was rejected as “unreadable”. I tried again in a ChatGPT session and was told it was rejected for security reasons because it contained personally identifiable information (PII).
I’d like to understand how we can build an app that needs to accept PII. There must be a way this can be worked out. Perhaps after login/auth of the user? I don’t know, but I am hoping to start the conversation here. Would an enterprise account allow this usage?
PS: I would have put a tag “PII” on this post if it was available.
Honestly, I think you’d get there quicker using open-source models and/or OCR. The thing is, privacy is no joke, and meeting the GDPR/COPPA privacy bar requires a lot of manpower. On top of that, if PII somehow leaked into training data, it would be there forever. So I think they’re playing it safe by not allowing any PII whatsoever to show up.
Agreed on all points.
Common OCR tools are already great at extracting text from images. Google Cloud Vision offers (I believe) the first 1,000 units per month for free.
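For what it’s worth, once an OCR step has returned the two machine-readable lines at the bottom of a passport bio page, the fields can be parsed and sanity-checked locally without sending anything to a language model. Below is a minimal sketch, assuming the standard 44-character, two-line passport layout (ICAO Doc 9303, TD3) and assuming the OCR output has already been cleaned into two strings. The function names `mrz_check_digit` and `parse_td3` are made up for illustration, and the sample data is the fictional ICAO specimen, not a real document.

```python
def mrz_check_digit(field: str) -> int:
    """Check digit per ICAO 9303: weights 7, 3, 1 repeating, sum modulo 10."""
    weights = (7, 3, 1)
    total = 0
    for i, ch in enumerate(field):
        if ch in "0123456789":
            value = int(ch)
        elif "A" <= ch <= "Z":
            value = ord(ch) - ord("A") + 10  # A=10 ... Z=35
        elif ch == "<":
            value = 0  # filler character
        else:
            raise ValueError(f"unexpected MRZ character: {ch!r}")
        total += value * weights[i % 3]
    return total % 10


def parse_td3(line1: str, line2: str) -> dict:
    """Split a two-line passport MRZ into fields and verify three check digits."""
    if len(line1) != 44 or len(line2) != 44:
        raise ValueError("each TD3 line must be exactly 44 characters")

    # Line 1: document code, issuing state, then SURNAME<<GIVEN<NAMES
    surname, _, given = line1[5:44].partition("<<")

    number, number_cd = line2[0:9], line2[9]
    birth, birth_cd = line2[13:19], line2[19]
    expiry, expiry_cd = line2[21:27], line2[27]

    def ok(field: str, check: str) -> bool:
        # An OCR misread often shows up here as a mismatch.
        return str(mrz_check_digit(field)) == check

    return {
        "issuing_state": line1[2:5],
        "surname": surname.replace("<", " ").strip(),
        "given_names": given.replace("<", " ").strip(),
        "passport_number": number.replace("<", ""),
        "nationality": line2[10:13],
        "birth_date_yymmdd": birth,
        "sex": line2[20],
        "expiry_date_yymmdd": expiry,
        "checks_passed": ok(number, number_cd)
        and ok(birth, birth_cd)
        and ok(expiry, expiry_cd),
    }


# Fictional specimen data from the ICAO standard (country code UTO = "Utopia").
SPECIMEN_LINE1 = "P<UTOERIKSSON<<ANNA<MARIA".ljust(44, "<")
SPECIMEN_LINE2 = "L898902C36UTO7408122F1204159ZE184226B<<<<<10"

if __name__ == "__main__":
    print(parse_td3(SPECIMEN_LINE1, SPECIMEN_LINE2))
```

If `checks_passed` comes back false, the likeliest cause is an OCR misread (for example `O` vs `0`), so the app could ask the user to rescan or confirm the field by hand rather than trusting the value.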
Thanks very much for the scanning ideas. Indeed, Amazon Textract is an option too.
But I still hope this thread keeps going on the question of how to build on OpenAI’s platforms with PII. Regardless of how the data gets converted to text, it’s still PII, and I still think the APIs need to account for this and allow developers to safely build apps that take PII.