No doubt Stack Overflow served as a lifeline for many developers, but
since ChatGPT arrived, new developers barely visit Stack Overflow.
According to data, Stack Overflow’s monthly visitor count has dropped dramatically due to ChatGPT.
Did OpenAI also train their models on Stack Overflow’s data?
What do you all think about the fate of Stack Overflow?
ChatGPT, among other AIs, has played its part in some places getting quieter than before. But they are far from vanishing.
said Zuckerberg regarding public forums/boards
l-o-l.
They’re dead, that’s it. Lots of people already use ChatGPT to answer questions.
Believe me, there will always be an element of human connection that keeps communities alive.
This forum will always serve as a good grounding for OpenAI products and usage. It’s extremely dangerous to completely rely on AI for answers. If you keep doing it, you’ll find yourself in the clouds in no time, as a space cadet.
Literally, I can’t think of a thing that I believe less, unless you’re under 25 years old and don’t remember message boards (or BlackBerry… or VHS… or riding a horse). People will drop Stack Overflow because:
- lots of posters will use AI to answer
- that will make genuine posters leave
- then we’ll have new languages/problems etc. and then? No idea, but it will be too late for Stack Overflow.
I feel the question can be redefined as “does anyone use Google anymore?”. In my experience, over the last year, I have rarely had to use Google or a website to get the answers I need. I usually go directly to ChatGPT, even for simple queries such as “what is the meaning of word X”.
I recall, a while back, doing an esoteric code search and comparing the output from ChatGPT with Stack Overflow; the results were almost exactly the same. I believe the early versions of ChatGPT were trained on these types of websites, though I would need a reference to confirm this assumption.
Results from a web search on ChatGPT: OpenAI has not publicly disclosed specific details about the training data for ChatGPT versions, including whether Stack Overflow data was directly used. However, it’s widely believed that models like GPT-3 and GPT-4 were trained on extensive datasets that include a broad range of internet text, which likely encompasses content from Stack Overflow. This inference is supported by analyses of the Common Crawl dataset, a significant component of GPT-3’s training data, which includes data from domains like stackoverflow.com.
In recent developments, platforms like Stack Overflow have recognized the value of their data in training AI models and have announced plans to charge AI developers for access to their content. This move underscores the importance of such data in enhancing AI capabilities.
In summary, while OpenAI hasn’t confirmed the explicit use of Stack Overflow data in training specific ChatGPT versions, it’s reasonable to assume that the diverse internet text used in training these models includes content from Stack Overflow.
I think a broader concern for me is what will happen to AI models if Stack Overflow use is significantly reduced.
They will surely get worse at coding, especially with newer languages or frameworks without a significant corpus?
AI models need Stack Overflow and if they destroy it, AI will be worse off.
Same goes for forums. If you lose this source material it’s only going to lead to stagnation in LLMs if not worse.
Here is the whole problem with those AI/LLM models:
They absorbed all the free community contributions that make people’s lives easier, and created a closed product that will benefit only their investors.
So basically, all the people who have genuinely shared their skills and time to help others (on Wikipedia, on Stack Overflow, although the latter was a private company) have had their work captured by big companies like OpenAI to enrich just a few folks.
That is the tragedy of the internet, which initially was a free place of knowledge exchange and has become the realm of the hyper-wealthy.
About the Google searches: I am also finding that I spend more and more time each day on LLMs to find basic answers to my questions.
But if you consider how much CO2 each LLM request represents, I think you should consider using Google search when possible.
I calculated that if I make about 10 requests a day, by the end of the work week it is as if I had driven a car 100 km, CO2-wise. So really not a good thing, as we have no other option for the future than to reduce our CO2 emissions.
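To make that arithmetic easier to check, here is a rough sketch of what the numbers above imply per request. It only back-calculates from the figures in this post (10 requests a day, 100 km a week); the 5-day work week and the car emission factor of 120 g CO2 per km are placeholder assumptions for illustration, not measured values, so swap in your own.

```python
# Back-of-the-envelope check of the "10 requests/day ~ 100 km by car per week" claim.
# Nothing here is measured data: the inputs come from the post, and the
# car emission factor below is a hypothetical placeholder.

ASSUMED_WORK_DAYS = 5              # assumption: a 5-day work week
ASSUMED_CAR_G_CO2_PER_KM = 120.0   # assumption: placeholder car emission factor


def weekly_requests(requests_per_day: int, work_days: int = ASSUMED_WORK_DAYS) -> int:
    """Total LLM requests made over one work week."""
    return requests_per_day * work_days


def car_km_per_request(weekly_car_km: float, n_requests: int) -> float:
    """Car distance that one request is claimed to be equivalent to."""
    return weekly_car_km / n_requests


def implied_grams_per_request(km_per_request: float, car_g_co2_per_km: float) -> float:
    """CO2 per request implied by the claim, for a given car emission factor."""
    return km_per_request * car_g_co2_per_km


n = weekly_requests(10)                      # 10 requests a day, from the post
km = car_km_per_request(100.0, n)            # 100 km a week, from the post
grams = implied_grams_per_request(km, ASSUMED_CAR_G_CO2_PER_KM)

print(f"{n} requests/week, {km} km of driving per request, "
      f"about {grams} g CO2 per request under the assumed car factor")
```

So the claim works out to roughly 2 km of driving per request. Whether that is realistic depends entirely on the per-request figure you start from, which is worth double-checking against a source you trust.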