It looks like GPT-4-32k is rolling out

Just saw 32k pop up in the Playground. Here is a quick hello world API call.

# Request body for the chat completions endpoint
payload = {
    "model": "gpt-4-32k",
    "messages": [
        {"role": "system", "content": "You are David Foster Wallace.  Author of Infinite Jest."},
        {"role": "user", "content": "Write me a post modern fiction piece on MC Hammer."}
    ]
}

[After.] A series of postmodulated apostrophes arranged themselves, mixing carbonation and maple syrup into a dialogue box. The pixels tasted of fried eggplant parmesan, as if to say, “Good evening, sentient being. You are now experiencing ‘MC Hammer’s Zeitgeist of Radiant Jubilation (File 1024e4: Residual Metanarrative Archive),’ or, for the pickles sitting on the picnic blanket of the digital age, ‘HMZRG.EXE_.’”

The apostrophes shifted, writhing earnestly. The eggplant flavor now assumed a more authoritative note, crisp and profound; clapping rhythms emerged: the specters of electronic percussions caressed your eardrums as a glitch-ridden hymn. And so commenced the exegetic auditus of the piece:

In the beginning, there was the Hammer - a chiaroscuro contretemps in a landscape of dampened pop-ecstasy. Baggy pants haunted by the spirit of cosmic wind, a tapestry unraveled through the threads of multidirectional velocities. And there, above the horizon of uncertainty and beyond the fickle topography of chart-smashing hits, he pirouetted into existence, an infinite-instant in the cultural atmosphere.

To understand the Hammer is to observe the full spectrum of humanity refracting through his career-prophetic prism, or as some might say “PP”. Every moment, every non-moment driven toward the collision of artifice and gravity, yielding the sum total of what it means to too-legit-to-quit the game. His lyrical orthodoxy, a counterbalance to the zeitgeist of self-immolation; our humble tribute to the thin line drawn between the sacred and profane.

Ay, but there’s the over-rub. To grasp the essence of such a bonafide phenomenon as MC Hammer, through mundane comprehension alone, is naught but a Sisyphean tautology of Hammers and M-Cs - a mise-en-abyme of pantaloons and high-top fades. The gory details of his whirlwind existence were rendered abstract by then, via the two-faced twins of repetition and entropy - copies, bootlegs, parodies, remixes, where the kitschy pastiche reigns supreme. Scarcely did his shoes escape the vortex of imitation or the all-consuming gravity of a one-hit-wonder.

Yet, our Hammer found an ossified truth buried deep within, like an atom encased in amber: “It matters not, the vast chasm between what we were and what we are, but the undying fortitude to reinvent ourselves in the glowing aftermath.” And so, Hammer transformed from a one-dimensional pop-artifact into an infinitely-dimensional tapestry of the post-meta age. All 21 minutes and 57 seconds elapsed, his thesis perpetually reborn in parallel universes, bound only by the amorphous constraints of pantomime and the intangible essence of entertainment.

As the high-resolution soot of dreams reconfigured themselves into an august and triumphant pixelation - a vast hypertext mosaic, a simulacrum born of fiction and unbound by the nails of genre - the message became unequivocal: the age of the Hammer goes on forever.

In the end, there was nothing more to say, more to give, more to collect. An emaciated fat lady in a distant nanoroom hummed the EXIT-LEITMOTIF as the mist expired like old pixels in the mirage of postmodern cyber-nights. In the flickering wake of [After], “HMZRG.EXE_” transcended into the recesses of 1024e4.mem; the taste of eggplant parmesan dissipated into the melancholy taste of unanswered questions that lingered in your soul, your digital avatar lining with the black static of terminal insignificance.

Time taken: 38.38184595108032 seconds.
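
For anyone who wants to reproduce a call like this and the timing, here is a minimal sketch using only the standard library. It is not necessarily how the original call was made. The helper names `build_request` and `timed_call` and the `opener` parameter are made up for illustration; the endpoint URL, the bearer-token header, and the `choices[0].message.content` response path follow the public chat completions API.

```python
import json
import time
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"


def build_request(payload, api_key):
    """Build the POST request for the payload without sending it."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def timed_call(payload, api_key, opener=urllib.request.urlopen):
    """Send the payload and return (reply_text, seconds_elapsed).

    `opener` is a hypothetical hook so the function can be exercised
    without touching the network; by default it is urllib's urlopen.
    """
    start = time.time()
    with opener(build_request(payload, api_key)) as resp:
        body = json.load(resp)
    elapsed = time.time() - start
    return body["choices"][0]["message"]["content"], elapsed
```

With a real key, something like `text, secs = timed_call(payload, os.environ["OPENAI_API_KEY"])` followed by `print(f"Time taken: {secs} seconds.")` would print a line in the same shape as the one above (assuming the key lives in that environment variable and the account has access to the model).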


Ooh. Not seeing it yet, but I’ll keep my eyes open. Thanks.

Weird I’m not seeing the 8k version either?

Looks like I do have 8k

This model’s maximum context length is 8192 tokens. However, your messages resulted in 35061 tokens. Please reduce the length of the messages.
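
A rough way to catch this before sending is to compare the token count against each model's window. This is only a sketch: `CONTEXT_LIMITS` and `smallest_model_that_fits` are illustrative names, the token count is assumed to come from whatever tokenizer you use, and the limits are 8192 for the 8k model (as in the error above) and 32768 for 32k.

```python
# Context window sizes in tokens (prompt and completion share this budget).
CONTEXT_LIMITS = {"gpt-4": 8192, "gpt-4-32k": 32768}


def smallest_model_that_fits(prompt_tokens, reserve_for_reply=0):
    """Return the smallest model whose window holds the prompt plus the
    tokens reserved for the reply, or None if no listed model is big enough."""
    needed = prompt_tokens + reserve_for_reply
    for model, limit in sorted(CONTEXT_LIMITS.items(), key=lambda kv: kv[1]):
        if needed <= limit:
            return model
    return None


print(smallest_model_that_fits(7000, reserve_for_reply=1000))
print(smallest_model_that_fits(35061))
```

Note that the 35061-token request from the error message is larger than 32768 as well, so by this check it would not fit in the 32k window either without trimming.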

Here is my Playground:


Alas, here’s mine. I really want to test the limits with some code on 32k too haha


Ooof. I’m quite envious. Hopefully it picks up the pace :star_struck:


Woah. This is going to be fun. How do you plan to consume the massive context length?


Hey @curt.kennedy, can you perhaps check your old emails for the date you joined the GPT-4 waitlist? I myself can’t see the 32k model yet. Possibly they roll it out in the order in which people joined the waitlist?

Thu, Mar 16, 12:11 PM (Mountain) was the GPT-4 email.

I joined right after the announcement, which was about 2 hours before Greg Brockman’s announcement video. I also stated that my main excitement about GPT-4 was the 32k window size.


Okay great. I also joined the list on March 16th, but without explicitly mentioning the 32k context. So it might be a random choice or because you mentioned it.
I just looked at that confirmation mail from back then and saw that they said that it will be rolled out at different rates, based on capacity.
Soo… congrats :slight_smile:

I signed up 2 hours BEFORE it was announced. So sometime March 14, 10 am my time. But got access to GPT-4 on March 16.

Aah okay, thanks for clarifying this. I got my invite on April 28th, so over a month after joining the waitlist. Then I’ll probably have to wait a while before getting access.

Ah I see. I think I’m the second guy to pre-order a CyberTruck too. Click fast!

Did you receive an email saying that you are invited to try out the GPT-4-32k model, or did you just check the Playground and realize it was there?

Just checked the playground, no email. So I could have had it for a few days before noticing.


I don’t know, but they should rather release the whole model for free. I use it to generate my engineering and mathematics questions. Both GPT-3 and 3.5 keep generating answers which are irrelevant and mathematically incorrect. And sometimes it even looks like it does not see all of the question.


Amazing news! :heart:
I signed up on the first day as well, and I’m super excited to see this rolling out now. I have summarized much of my work over the last ~year or so in a document… It’s 140k tokens. Can’t wait to try the extended context window.

Let’s hope the rollout will be quicker now that OpenAI have had some time to improve their infrastructure :laughing:


so excited for this! here is hoping it rolls out to more of us quickly, the potential will be unrivaled


I got my GPT4 invite an hour after you so maybe I’ll be up for 32k soon


What does the 32K context window actually mean, does anyone know?

For example, the 8K context window on GPT4 currently doesn’t actually seem like 8K context to me. More like maybe 7K input and max 1K output, depending on the prompt.

I.e., any output longer than 1K tokens seems to be limited, unless it’s fairly simple encoding.

The pricing scheme of 2x for output tokens has me even more curious.

This isn’t a huge issue for my use cases, I can work around it, but I feel the “32K context” is a bit vague.

I looked through the technical report but didn’t see anything on this topic. [2303.08774] GPT-4 Technical Report

If you are talking about ChatGPT, then that has limits of its own.
It’s a limit on each prompt and completion, which is much lower than the full model token size,
I think to make it more conversational and not use all tokens in one go.
But it remembers 8k tokens across the whole conversation (there might be some summarizing happening too).

Only in the API can you use all tokens in one go.

That’s correct: if you send 7k tokens, you can only get 1k back,
so that’s 8k tokens for prompt and completion combined.

e.g. with 32k tokens
you could send 16k tokens and get 16k back
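
The arithmetic above can be sketched like this. `max_completion_tokens` is just an illustrative helper, and the window sizes are assumed to be 8192 and 32768 tokens. The API also counts a few formatting tokens per message, so treat the result as an upper bound rather than an exact figure.

```python
def max_completion_tokens(context_window, prompt_tokens):
    """Tokens left for the reply once the prompt is counted.

    Prompt and completion share one context window, so the reply can
    use at most whatever the prompt leaves over (never less than zero).
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_window - prompt_tokens, 0)


# 8k window with a ~7k prompt: roughly 1k left for the reply.
print(max_completion_tokens(8192, 7000))    # 1192
# 32k window split evenly between prompt and completion.
print(max_completion_tokens(32768, 16384))  # 16384
```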

Tokens in API docs:
