What is the impact of DeepSeek on the AI sector? 🔥

That surprises me, have you a source?

Blackwell has looked a big “meh” to me, especially for gaming (only ~10% gains gen on gen).

Surely, there’s nothing “only” about nearly 600W; that’s a LOT of power draw for a consumer device. You are looking at a PC pulling ~1.2 kW from the socket? That’s crazy! The 5090 doesn’t feel at all “efficient”. That’s up 33% from last gen (4090), with only a ~30% improvement in performance on AI tasks - so this just looks like it has scaled with power, suggesting zero efficiency gains (at least for the consumer card). But you gotta respect the cooler design, lol.
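To put a number on that, here is a rough back-of-the-envelope sketch. It assumes the two figures quoted above (+33% power, +30% AI performance) are accurate and comparable; the function name is just for illustration.

```python
def perf_per_watt_change(perf_gain: float, power_gain: float) -> float:
    """Relative change in performance per watt, given fractional gains.

    perf_gain=0.30 means +30% performance; power_gain=0.33 means +33% power.
    """
    return (1.0 + perf_gain) / (1.0 + power_gain) - 1.0


# Figures quoted in the post (assumed, not measured here).
change = perf_per_watt_change(perf_gain=0.30, power_gain=0.33)
print(f"perf/W change: {change:+.1%}")  # perf/W change: -2.3%
```

So on those numbers, performance per watt is flat to slightly negative, which matches the "scaled with power" reading.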

1 Like

Yes :laughing:

It’s a fair point! That is a lot of power, but it’s still not a lot compared to what a rack would draw. In terms of FPS, it’s just more frames for more power. The extra efficiency is really only relevant for the types of computation that AI tasks require. According to the article, there are even more efficiency gains to be had if you’re willing to go down to FP4. :laughing:

2 Likes

I look forward to the 5090 benches to see how this pans out, but tbh it’s kind of irrelevant - it’s the only consumer discrete card with 32GB - your only other choice is currently an Apple Mac device like a mini? I don’t think speed matters that much for AI at home …
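As a rough illustration of why the 32GB matters more than raw speed, here is a minimal sketch that estimates the memory needed for model weights alone at different precisions. The 24B parameter count is only an illustrative example, and the estimate ignores KV cache, activations, and framework overhead, so real requirements will be higher.

```python
VRAM_GB = 32  # card capacity mentioned in the post


def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in decimal GB."""
    return params_billions * bits_per_param / 8


def fits(params_billions: float, bits_per_param: int, vram_gb: float = VRAM_GB) -> bool:
    """True if the weights alone fit in the given VRAM (no headroom counted)."""
    return weight_memory_gb(params_billions, bits_per_param) <= vram_gb


# Hypothetical 24B-parameter model at 16-, 8- and 4-bit precision.
for bits in (16, 8, 4):
    size = weight_memory_gb(24, bits)
    print(f"24B @ {bits}-bit: {size:.0f} GB, fits in {VRAM_GB} GB: {fits(24, bits)}")
```

On this estimate the 16-bit weights do not fit in 32GB while the 8-bit and 4-bit versions do, which is why capacity and quantization tend to decide what you can run at home.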

2 Likes

Yep, I couldn’t agree more. It’s honestly a shame that they won’t add ~$20 worth of VRAM to the lower-end cards because I’m not buying several thousand dollars worth of consumer GPUs. That’s just not within my budget :sweat_smile:

3 Likes

What value does OpenAI have at all now?

Everything changes quickly, but so far, technically and for what it offers, I think its main value is having been the global driver of AI, just as Xerox was the driver of personal computers in the 70s.

It is a hasty and subjective opinion, and perhaps tomorrow it will be irrelevant.

One advantage of OpenAI - I think - is that it gives less censored information.

For instance, ask DeepSeek what happened in a ‘specific’ country in 1989 and it will not answer.

Then ask it to THINK about why it cannot answer that, and you may see some interesting thoughts on why it thinks it is prevented from sharing what it actually knows.

PS. I did the above on the local model.

That’s true, but I think it also applies fully to OpenAI.

On “sensitive” topics, or those they consider sensitive, such as sexuality, gender, politics and so on, much the same happens. On some topics they may refuse to answer a question about one candidate because it is “sensitive”, yet answer the same question about another candidate, to give an example. In other words, both censor according to their own convenience.

That’s why I think it’s very important what happens with projects like open-r1 (Open-R1: a fully open reproduction of DeepSeek-R1) and others.

Edit: In fact, testing with Alibaba’s Qwen 2.5 chat or DeepSeek itself, I find that their responses “censor” even less than what I see with ChatGPT, or at least it is possible to address more “controversial” topics in depth and without a paternalistic layer. There is surely some kind of subtle, or not so subtle, influence of Chinese ideology on the model anyway.

3 Likes

The trend in financial markets continues, so far

This isn’t an issue with an open source model. The censorship can be easily tuned out. This is in contrast to OpenAI solutions, where you risk permanently losing access for attempting such a thing.


OpenAI models are still superior IMO, but DeepSeek isn’t far off. Let’s see these o3 models. I’m very excited.

Also, the information that the latter has provided to the world has been much more beneficial for humanity. It was a sad day when OpenAI decided to hide their reasoning tokens under the guise of safety.

2 Likes

Andrew Ng’s comments about this:

Key ideas

  • The release of DeepSeek-R1, an open-weight AI model from China, has highlighted key trends in generative AI. First, China is rapidly closing the gap with the U.S. in AI development, particularly with open-source models.

  • The availability of open-weight models is driving down costs and commoditizing the foundation-model layer, enabling more opportunities for application developers.

  • AI progress is not solely dependent on scaling up computational power; algorithmic innovations can significantly reduce training costs.

  • DeepSeek’s success underscores the strategic importance of open AI models in the global AI supply chain and raises geopolitical and business implications.

I get a bit skeptical when these guys talk about RL. I mean, super obvious idea, why are they talking about it now?

either, a) they’ve been deceptively not talking about it, or b) they are out of the loop relative to where they should be

Either way, reduces my trust in them quite a bit

If they said something like “wow, I really flaked on this” I’d probably trust them more

That said, I like this quote:

If the U.S. continues to stymie open source, China will come to dominate this part of the supply chain and many businesses will end up using models that reflect China’s values much more than America’s.

Though I can see OS getting banned.

TBH, I’m getting a bit skeptical of this RL stuff. I mean, super obvious idea. Pretty sure everyone tried it. I remember calling it ā€œRLCFā€ for reinforcement learning compiler feedback like way back.

I just did a search, I wasn’t the only one that came up with that term - [2305.18341] Coarse-Tuning Models of Code with Reinforcement Learning Feedback

Like I say, pretty trivial idea.

If RL post training is a big deal, we should see some quick jumps in the huggingface leaderboard very soon.

Fwiw, I also did some tests here - GRPO Llama-1B · GitHub

All I saw was some improvement in formatting, admittedly it was just about 2 hours of training on an H100

Mistral is back in action with Mistral Small 3, a 24B model - if the graphs are to be believed, it offers quite a punch at lower latency. It doesn’t use synth data, and there is no RL, just fewer layers and higher quality data. It looks ripe for some additional tuning on top. Mistral Small 3 | Mistral AI | Frontier AI in your hands

NOTE: Apache 2.0 license

3 Likes

It is very interesting to see how quickly this is advancing, and how there are more and more economical options for using R1 via API outside of China and its official API, from US companies, with good context lengths and prices that keep going down.

1 Like

Wow, anyone read Dario Amodei’s post?

…Instead, I’ll focus on whether DeepSeek’s releases undermine the case for those export control policies on chips. I don’t think they do. In fact, I think they make export control policies even more existentially important than they were a week ago.

2 Likes

In my opinion, this would be a really terrible way for the USA to proceed and address what is happening, not only from a “technological” perspective but even from an ideological one. It runs the risk of being seen as trying to slow down progress / innovation / democratic access to technology and advances, etc., and of making China look like a global benefactor (exaggerating a bit).

I suppose a period of misinformation, paranoia, fake news and so on will come from the industry or Silicon Valley. That is perhaps understandable, and part of the public will go along with it, but it seems to me that in reality things are moving along a parallel channel, opposite to what the CEO of Anthropic proposes or suggests. They are going to turn this into a kind of conceptual war between open source and US big tech, which makes no sense. All this has nothing to do with China anymore.

Furthermore, if restrictions are theoretically what drove China to these advances in resource optimization, why propose even more restrictions as a valid path? Perhaps they will get Huawei to release cheaper and more powerful cards than Nvidia.

1 Like

I thought that was a very interesting essay. I don’t know who the guy is, but I strongly agree with his thinking. I don’t think we would have seen this type of market action if DeepSeek weren’t a Chinese company.

“DeepSeek produced a model close to the performance of US models 7-10 months older, for a good deal less cost (but not anywhere near the ratios people have suggested).”

This is basically what we’ve been telling people for a long time: start building now, even if it’s not profitable, because in 10 months you’ll be able to do it at a fraction of the cost.

3 Likes

If China can’t get millions of chips, we’ll (at least temporarily) live in a unipolar world, where only the US and its allies have these models. It’s unclear whether the unipolar world will last, but there’s at least the possibility that, because AI systems can eventually help make even smarter AI systems, a temporary lead could be parlayed into a durable advantage. Thus, in this world, the US and its allies might take a commanding and long-lasting lead on the global stage.

“Hi, let’s ban chips to China so we can permanently rule the world” … sounds like a great thing to say out loud.

What people don’t realize is that this sort of might-makes-right philosophy only results in end times. The notion that well-armed adversaries will simply roll over is naive and gullible.

At this point, people need to accept that China will get there as well. They are certainly proceeding aggressively on chip manufacturing and at this point very likely recognize the strategic importance.

Fostering a friendlier, more cooperative relationship now will certainly make things much better for everyone when they do.

1 Like

Apparently OpenAI is reaffirming this idea that has been talked about: the only announcements they have made since the DeepSeek “episode” have been related to agreements with the government or similar.

  • Accelerating the basic science that underpins U.S. global technological leadership

4 Likes

yeh, so basically kyc. these 15k scientists get access to the real ai (much less safety, no doubt) while the rest of us get o3-mini or whatever.

an example: if you run any of the cyberseceval stuff on o1, you get flagged and no response.

  • Enhancing cybersecurity and protecting the American power grid

Welcome to the future of AI haves and AI have nots.

it’s beginning, for real

funny the timing of this so close to deepseek. i wonder if they appreciate how much this is going to make people want deepseek so much more.

it’s not just about cost, but now it’s about capability

ofc, don’t be surprised if deepseek gets banned in all of its evolved forms