The future of GPTs and LLMs, copyright, illegality

In recent months, the debate has intensified around the use that artificial intelligence models like GPT make of copyrighted content. In particular, critical voices argue that these AIs have “appropriated” works under copyright, which would constitute a legal infringement.

However, it is essential to draw a critical distinction, one that holds from a legal and ethical standpoint and from the standpoint of how human knowledge evolves:

Studying is not stealing. Learning and generating new content is the foundation of humanity.
Since the beginning of knowledge, humanity has studied, absorbed, reinterpreted, and created based on the previous work of others. If studying a theory, understanding it, and then sharing our own conclusions were illegal, human knowledge would be stagnant.

A model like GPT does not store or literally reproduce protected content (except in exceptional cases that can be controlled), but generates new content based on learned patterns. This practice is no different from a human studying books and articles and then writing something new.

There is no appropriation without improper commercial use. GPT does not “sell” others’ content.
AIs do not present themselves as the authors of the original works, nor do they sell them as their own. Moreover, OpenAI and other providers offer free access to millions of people. Even if there were a “benefit” from processing protected data, there is also a massive return of value to society.

“Fair use” and transformative use are key.
In many legal frameworks, fair use in the U.S. and comparable copyright exceptions in Europe allow studying, processing, and transforming content for educational, informational, or research purposes. AI falls into this field when it does not directly copy or commercialize content, but rather transforms it and generates new value.

Social and scientific benefits cannot be ignored.
GPT and other models have helped millions: students, researchers, people with disabilities, entrepreneurs. No one is forced to use GPT, but those who do receive value. This social reciprocity partly justifies its existence and its usage model.

The real problem: the superhuman power accumulated by LLM owners
The current debate is mistakenly focused on whether AIs “steal” content to train. That is not the real threat. The real risk is that the owners of LLMs (Large Language Models) accumulate superhuman power, economic and social, due to their exclusive control over advanced technologies that process, synthesize, and manipulate information on a global scale.

This power creates disproportionate advantages over individuals, small businesses, even governments. AI is not dangerous in itself. The concentration of its control is.

What should we really be discussing?

Not whether AI can study content, but how we balance the power its dominance creates. How can we ensure there are no monopolies of thought, information, or decisions based on AI? How do we prevent access to powerful models from being restricted to elites while others are left at a disadvantage?

The future of AI is not decided by copyright. It is decided by who controls the power that AI generates.


I agree with this in principle.

It was always the case with humans that some people were able to take other people’s work and monetize it much more effectively.

Instagram was built originally by a team of only a dozen or so people using a lot of open source libraries. Now it’s worth billions. Nothing was owed to the many contributors to those open source libraries.

Star Wars is arguably Tolkien in space, or at least heavily influenced by it, by George Lucas’ own admission. Without Lord of the Rings, would there be any Star Wars? I wonder who made more money, George Lucas or Tolkien?

Well, Star Wars has made tens of billions of dollars and Tolkien’s estate is worth half a billion.

Is the similarity between the two fair use?

And so LLMs could easily spew out guided me-toos of some other original concept, and it would be fair game?

But there is one distinction, and that is how efficiently they can copy, mimic, and transform. It is now much easier to borrow multiple concepts and transform them into something that would not ordinarily infringe copyright.

I would say the current reach of copyright is highly debatable anyway.

It’s amazing to me that jazz songs of the 1930s and ’40s are still considered within copyright, and here we are nearly 100 years later!


Free at last! :wink:

I think Conan the Barbarian is supposed to be free game now, but comics and others have some sort of rights?


But what if you ask DALL-E for a colour version??
