What would happen if AI were trained on AI-generated content itself?

Carita · December 4, 2022, 4:35pm

AI is currently trained on content created by humans or facts narrated by humans. But with AI being so efficient, it’s very likely that online content will be overwhelmed by AI-generated content (AIGC). If that happens, the content that AI has access to and is trained on would itself be AIGC. It would be AIGC made from AIGC.
Would it cause problems?
Would AIGC become better or worse in that scenario?

I personally love OpenAI, and I am so grateful that the team has built such amazing products and has invited people from different backgrounds, even those without programming skills, to get involved in creating in some way, for example through the Playground.
I just have the concerns above, and wish to discuss them openly with everyone.

petter.tesdal · December 4, 2022, 9:38pm

Even though the AI might use AI-generated content to learn, that content is still curated by actual humans thinking, “This is cool, and what I was looking for!”, so the AI would still improve. And although AI can do these things really well now, it can never truly replace the freedom of traditional art or writing, so running dry of resources won’t be a problem. A learning plateau might still be a thing, though.

So in conclusion, I don’t think it will be a problem, and it won’t make it worse, but I also don’t think it will make it better.

lmccallum · December 5, 2022, 1:39am

I don’t think it’s wise - yet - to use AI-generated text as training data. There is a risk of reinforcing errors and exacerbating the problem of hallucination. We need a way to give feedback to the first iteration of generated text. Feedback regarding truth and falsehood. If we build a reliable way to do that (perhaps with Stack Overflow-style upvoting and downvoting of the first iteration of text, combined with source-checking and other methods) then the second iteration of text could be superior and so on. Would love to hear what others think.

Carita · December 6, 2022, 1:50pm

Thanks for replying. I agree with your concern, and in addition to the risks you mentioned, I am also concerned about the diversity of the content.

Euwa · December 16, 2024, 7:59pm

Hi, Carita. Your train of thought is the basis for predicting the risks of AI development. However, it is important that fear does not stop humanity as a whole, or any individual person in private. That is why we need an Ethics of Interaction, authored by every user of generative models. Then any inevitable and unpredictable AIGC will not have harmful generalizations and roots. Ethics is a method, an approach, a choice. We must inevitably come to this. This will bring us closer to AGI ⭐
