MT-NLG - Are we ever getting access to the trained 530B-parameter model?

jspss96046 · September 4, 2022, 4:19am

I’ve been trying out different language models recently and came across MT-NLG. I understand it’s a crazy model, but I can’t seem to find code that works. Has anyone tried it?

Fusseldieb · September 7, 2022, 8:28pm

If you have to ask, don’t even try it. Not judging, but models like this need a massive amount of GPU memory to run, far more than most consumers have. I’m talking about 50GB+.
Your average GPU has 6GB.

We’ll need to wait until OpenAI (or somebody else) integrates it and then use it via an API. This also saves on headaches since you don’t need to tune it yourself.
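
For a rough sense of scale, here is a back-of-the-envelope sketch of the memory needed just to hold the weights of a 530B-parameter model. The 80 GB per-GPU figure is an assumption picked for illustration (a high-end data-center card), and the estimate ignores activations, KV cache and any framework overhead, so real requirements would be higher.

```python
import math


def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed just to store the weights, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


MT_NLG_PARAMS = 530e9  # parameter count from the thread title
GPU_MEMORY_GB = 80     # assumption: one high-end data-center GPU

for precision, nbytes in [("fp32", 4), ("fp16", 2), ("int8", 1)]:
    gb = weight_memory_gb(MT_NLG_PARAMS, nbytes)
    # Lower bound on GPU count: weights only, no activations or overhead.
    gpus = math.ceil(gb / GPU_MEMORY_GB)
    print(f"{precision}: {gb:,.0f} GB of weights, at least {gpus} GPUs")
```

At fp16 (2 bytes per parameter) the weights alone come to about 1,060 GB, so the 50GB+ figure is, if anything, a big understatement.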

gwern · September 8, 2022, 8:33pm

No. Note that Nvidia never released the earlier Megatrons either, like Turing-NLG. Their interest is solely in building the tooling as part of selling GPUs to well-heeled customers, not necessarily in releasing models or creating SOTA ones. If you look at the benchmarks, MT-NLG is pretty disappointing anyway (highly undertrained, even before Chinchilla or GPT-4), so it is almost certainly not worth the hassle.
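
To put a number on "undertrained", here is a rough sketch comparing tokens seen per parameter against the Chinchilla rule of thumb. Two figures here are not from this thread and should be treated as assumptions: the roughly 270B training tokens reported for MT-NLG, and the commonly quoted approximation of about 20 training tokens per parameter for compute-optimal training.

```python
MT_NLG_PARAMS = 530e9             # parameter count from the thread title
MT_NLG_TOKENS = 270e9             # assumption: reported training token count
CHINCHILLA_TOKENS_PER_PARAM = 20  # assumption: approximate rule of thumb


def tokens_per_param(tokens: float, params: float) -> float:
    """Training tokens seen per model parameter."""
    return tokens / params


actual = tokens_per_param(MT_NLG_TOKENS, MT_NLG_PARAMS)
# Token budget the rule of thumb would suggest for a model this size.
suggested_tokens = CHINCHILLA_TOKENS_PER_PARAM * MT_NLG_PARAMS

print(f"MT-NLG: about {actual:.2f} tokens per parameter")
print(f"Rule of thumb suggests about {suggested_tokens / 1e12:.1f}T tokens")
print(f"Shortfall: roughly {suggested_tokens / MT_NLG_TOKENS:.0f}x")
```

Under those assumptions the model saw about half a token per parameter, well over an order of magnitude short of the rule of thumb.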
