"Use of Copyrighted Books in 'My GPT' Knowledge and General ChatGPT Conversations: Uploads, Copyright, and Privacy Issues"

Dear OpenAI team,

I’ve been trying to answer these questions for weeks, but I haven’t been able to find the information in the help documentation. I imagine I’m not the only one with these questions, and I would like to make good use of GPT while respecting copyright, so I am posting them here.

I’m planning to use the “Knowledge” section in the “My GPT” feature on your platform and have some specific queries about its use, particularly around file uploads, copyright and data processing. I would appreciate your thoughts on the following points:

If you can only answer one question, let it be this one.

Using ChatGPT for summaries.
We know that people are using it this way but I am not sure if it is legal.
Ultimately, I would like to know the best way to use ChatGPT to summarize books or PDF files, ensuring compliance with copyright laws and keeping the content private. What are best practices for summarizing such content without infringing on authors’ rights?

More details here:

  1. Copyright and content in ‘Knowledge’:

    • Can I upload copyrighted books or documents to the “Knowledge” section?
    • What are the guidelines to ensure compliance with copyright laws?
  2. **Clarification: copyrighted PDF files uploaded to any ChatGPT conversation.**
    I found this recently:
    https://help.openai.com/en/articles/8555545-file-uploads-faq

That page says that if I upload a copyrighted book to a chat conversation (not to Knowledge), you delete it in 3 hours, but further down it says that you can use everything from ChatGPT to train the model.

Are the files safe or not in that case? Are they kept private or not?

  1. Privacy and data security:

    • What measures exist for the privacy and security of files uploaded to “Knowledge”?
    • Does OpenAI have access to these files or their content?
  2. Use of files for model training:

    • Does OpenAI use files uploaded to Knowledge to improve overall model performance? And those uploaded in any chat?
  3. Data retention and deletion:

    • How long are uploaded files kept on your servers?
    • What is the process to permanently delete “Knowledge” files?

6. **“Chat History & training”.**
What effects does the request “Do not train with my content” in the privacy center or turning off “Chat History & training” have on the use and privacy of uploaded copyrighted PDFs?


Thank you for your time and help in addressing these queries. Your detailed answers will be instrumental in my decision-making process regarding the use of your platform. I have read other threads, so I would ask that we avoid assumptions and instead rely on answers backed by references to the documentation or to statements from the OpenAI team.

Thank you.

Hey there and welcome to the community!

So, keep in mind, we are not OpenAI staff, mostly just other enthusiasts building and working with their tools just like you.

The root of your question is still a bit tricky and up in the air (GPT Store is still very new), but I think a good rule of thumb moving forward is “If you don’t own the IP, don’t upload it into a custom GPT”. The store is still getting its footing, but because there have been allusions to a revenue share model where people make money off of GPTs, this could put you in legal trouble in the future if you end up making money off the custom GPT.

One thing to note, though, is that you might be conflating file uploads with knowledge uploads. If you are simply asking GPT to summarize text or to handle a personal query that requires copyrighted information as context, uploading that material is perfectly okay. Knowledge files are different; those are files that GPT can specifically retrieve as context in any conversation at any point in time with the custom GPT. That’s where you have to be careful.

To answer the rest of your questions:

  1. Knowledge files are stored in OpenAI’s backend, so the security of the files is as good as OpenAI’s own security. However, that doesn’t stop users of the GPT from retrieving the information stored in the knowledge files, which is very easy to elicit. It is unclear whether OpenAI staff have retrievable access to custom GPT knowledge files. My guess is that they don’t, but we also don’t know what mechanisms they have in place for preventing or removing uploads that violate their content policies. I would not worry about it though; OpenAI is very hands-off compared to other companies.

  2. No, they do not use Knowledge files to improve their model. However, it does seem that they use conversations to help improve their model. That means that if a user has a conversation with a custom GPT, it is likely that they use that conversation to some degree to enhance their models. They are very tight-lipped about any details on that though (which is normal for tech companies).

  3. I am not aware of OpenAI keeping any file after it has been deleted. So, if a file is deleted or removed, it should be gone from their system. They make it clear you own your data, and so far, they have held true to their word. They do not keep your data without your consent like Google does. Knowledge files count as your data.

It’s good to hear you are trying to adhere to copyright. I would ask whether what I am doing with the work is transformative (see “Transformative use” on Wikipedia).

Besides that, I would do my best to adhere to OpenAI’s policies: Usage policies (openai.com) and Brand guidelines (openai.com).

Ultimately, this is a problem for OpenAI to solve and enforce. The outcome of the NY Times lawsuit will show how OpenAI adapts to these things.

P.S. I am in no way affiliated with OpenAI :slight_smile:

I came here looking for the same answer :joy: I want to upload copyrighted business startup advice books to a GPT, but it’s not clear at all whether that’s against the terms or not.