Some questions on copyrighted material

Foxalabs · September 25, 2023, 1:42pm

The forum requires replies to be longer than short one-liners. It can get a little frustrating, but it is what it is.

stricker · September 25, 2023, 2:05pm

I think it’s a rather arbitrary and haphazard rule - but as you say: it is what it is.

stricker · September 25, 2023, 4:10pm

ChatGPT refuses to cite from copyrighted material:
https://chat.openai.com/share/d331255e-730a-4d45-a167-e2cd53826dff

ChatGPT generates content in the style and with the characters of a copyrighted book:
https://chat.openai.com/share/3cfcfc7f-29d4-4e7c-9737-54341ce292ec

_j · September 25, 2023, 4:29pm

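“Mr und Mrs Dursley im Ligusterweg Nummer 4 waren stolz darauf, ganz und gar normal zu sein, sehr stolz sogar.”
[German; roughly: “Mr and Mrs Dursley of number 4, Privet Drive were proud of being completely and utterly normal, very proud indeed.”]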

stricker · September 25, 2023, 4:43pm

I see what you mean! Thanks for checking it out!

SomebodySysop · September 26, 2023, 8:03am

Maybe so, but I think we are in a time where lots of people are antsy about this new technology. I just scraped a site of 900 posts consisting of legal articles and case law. Here is what the website posts:

Readers do not have to request permission to reprint items, however all reprinted items must bear one of the two following attributions:

If your reprint is electronic, as follows, keeping the link intact:
Reprinted from blah, blah.

Of course, every post I uploaded has the required citation.

Now, when they wrote this (probably 10 years or more ago), they had no idea a day would come when someone would not only download every article posted, but feed that into a computer to help generate answers to questions.

To be clear, I have no interest whatsoever in reproducing this information for publication. I am not their competitor. I only use it as part of my "Deepening Comprehension through Complementary Content" strategy, which I discussed here: How to Fine-Tune without Fine-Tuning -- Or, How to Make your RAG Implementation Smarter

To that end, whenever a query returns a citation that references their content, the associated link goes to their website, not mine. I don’t know how much more transparent I could be.
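
A minimal sketch of that kind of setup, assuming each scraped post is stored with its original URL and the attribution line the site requires. The `Chunk` type, its fields, and both function names are hypothetical illustrations, not the actual implementation:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """One retrieved passage plus the metadata needed to cite it (hypothetical)."""
    text: str
    source_url: str   # link back to the publisher's own site, kept intact
    attribution: str  # attribution line the publisher asks for


def format_citation(chunk: Chunk) -> str:
    # The link points at the original site, never at a local copy.
    return f"{chunk.attribution} <{chunk.source_url}>"


def answer_with_citations(answer: str, chunks: list[Chunk]) -> str:
    # Collect citations in retrieval order, dropping duplicates.
    citations: list[str] = []
    for chunk in chunks:
        citation = format_citation(chunk)
        if citation not in citations:
            citations.append(citation)
    lines = [answer, "", "Sources:"] + [f"- {c}" for c in citations]
    return "\n".join(lines)
```

Calling `answer_with_citations(answer, retrieved_chunks)` would append a "Sources:" list to the generated answer, with each entry carrying the required attribution and the link to the original site.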

But, how much do you want to bet I’m going to be hearing from them when they find out? How do you think they are going to react, even though I have completely complied with their terms of use?

So, yeah, I think we’re going to see all kinds of people coming out of the woodwork – especially lawyers. Nonsense or not. When has that ever stopped them?

vb · October 3, 2023, 7:00am

Came across this interesting series of articles in The Atlantic:

The author went ahead and investigated this issue and makes some valuable, fair points.
What stood out to me are these points:

that the big tech companies admit they used datasets with large amounts of copyrighted books,
that there is consensus in the developer community that these books have high value for LLM training, and
that this type of piracy by large companies differs from earlier piracy, when consumers pirated copyrighted material for personal use rather than for monetary gain.

As a heavy user, developer and full-blown enthusiast of AI, I cannot simply dismiss these arguments.

Here is the link to the author’s profile:
https://twitter.com/_alexreisner
