It looks like GPT-4-32k is rolling out

Please keep us updated and let us know when you get access.
I wonder if this happens silently or if this is announced via email too…

It took 15 seconds.

I’m not complaining here, just curious. Your point about the playground being faster is a good explanation.

I’m very happy about the arrival of GPT-4.
It might help me work while also learning about new things that I may not even know exist.

I think I had gpt-4-32k a while ago, and then it disappeared.


I’m fairly sure that video is edited, not fake: just filmed at a lower framerate and played back at normal speed. If you look at the way GPT is typing, you’ll see it’s a bit too fast :laughing:

That said, I actually think the response times are pretty amazing. I don’t know any human who can process ~40 pages of text, answer questions, and produce a detailed summary in just 10 minutes.


Hey Champ and welcome to the forum!
You may have been part of a test group, or it may just have been a bug; either way, it will likely return eventually.

Did you find the extended context window useful?


Yeah, you might be right. Of course, if he was willing to fudge that, you have to wonder what else he might have fudged.

I mean look what I got access to!

[screenshot]

(faked, obviously, by inspect → edit)

Crazy what people do for twitter followers.


Thanks for the laugh, mate, that’s a good one :rofl:

I’ve also had some fun with editing: 💸 Sk1d_M4rkZ & C0d3 M0nk3yZ: KVM Pwn4g3 LoL

But thank you for reminding me to increase my usage limit; I’ll burn through that in one 1024k-token prompt.


OK, here’s an interesting update. I fed gpt-4-32k around 19k input tokens, it spat out a small answer (216 tokens), and it took only 16 seconds! I ran it again and saw the same behavior at 17 seconds.

So, with this data, it looks like a massive amount of input tokens is not the driver, which at least validates my assumption above that the delay is primarily a function of output tokens, not input tokens.

Oh, and on average, this was $1.19 per API call. So kids, don’t try this at home!
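If anyone wants to reproduce this kind of measurement, here’s a minimal sketch using the pre-1.0 openai Python library (the file name is a placeholder, and the prices are the GPT-4-32k rates at the time of writing, so double-check the pricing page):

```python
import time
import openai  # pre-1.0 SDK; reads OPENAI_API_KEY from the environment

# GPT-4-32k prices per token at the time of writing -- verify before trusting.
PROMPT_PRICE = 0.06 / 1000      # $ per prompt token
COMPLETION_PRICE = 0.12 / 1000  # $ per completion token

big_prompt = open("big_document.txt").read()  # placeholder file, ~19k tokens

start = time.time()
response = openai.ChatCompletion.create(
    model="gpt-4-32k",
    messages=[{"role": "user", "content": big_prompt}],
    max_tokens=300,  # keep the completion short, as in the test above
)
elapsed = time.time() - start

usage = response["usage"]
cost = (usage["prompt_tokens"] * PROMPT_PRICE
        + usage["completion_tokens"] * COMPLETION_PRICE)
print(f"{elapsed:.1f}s, {usage['completion_tokens']} completion tokens, ~${cost:.2f}")
```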


Did you have to do anything special to get ‘upgraded’?

I’m still on 8k over here. Not that I’m complaining, but 32k would be incredibly fun to experiment with :smiley:


One issue I’ve found with large documents is that cGPT tends to skip over sometimes very important data. This may be solely a cGPT issue stemming from its token management, though. Have you tried using it to analyze any large documents, such as a GitHub repo or maybe a large analytics report? I’m also interested in knowing how it manages lots of numbers in different places.

Side Note: A huge, easy industry will be converting the old, unmanageable spaghetti code found in still-used, massive, archaic systems into newer, faster, and cleaner languages.

Also, you’re right about the generation tokens. Here is a snippet from the documentation:

Prompt tokens add very little latency to completion calls. Time to generate completion tokens is much longer, as tokens are generated one at a time. Longer generation lengths will accumulate latency due to generation required for each token.

https://platform.openai.com/docs/guides/production-best-practices/improving-latencies
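You can see that effect directly by holding the prompt fixed and only varying the completion-length cap. A rough sketch (same pre-1.0 openai library; note that max_tokens is a cap, not a target, so pick a prompt that actually produces long output):

```python
import time
import openai  # pre-1.0 SDK; reads OPENAI_API_KEY from the environment

# Same prompt every time; only the completion-length cap changes.
prompt = "Explain how tokenization works, in as much detail as you can."

for max_tokens in (16, 128, 512):
    start = time.time()
    openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,  # a cap -- the model may stop early
    )
    print(f"max_tokens={max_tokens}: {time.time() - start:.1f}s")
```

Latency should grow roughly linearly with the number of tokens actually generated, while the fixed prompt contributes almost nothing.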


Totally agree, this is going to be a huge thing.

I’ve already been taking code I wrote years ago, with zero comments and obscure naming like “norm_qq” and “my_pp_max”, and asking GPT to “rename my variables, functions and improve readability”.
The results are amazing for more compact code, but it doesn’t work on larger sections like my 3.6k lines of “quest-system”.
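One workaround for files that size: split the source into function-sized chunks and refactor each one separately. A naive sketch (it assumes top-level def/class blocks in a Python file; real code would want the ast module for robust boundaries):

```python
import re

def chunk_source(source: str, max_chars: int = 8000) -> list[str]:
    # Split just before each top-level def/class, keeping the keyword.
    blocks = re.split(r"\n(?=def |class )", source)
    chunks, current = [], ""
    for block in blocks:
        # Start a new chunk when adding this block would exceed the budget.
        if current and len(current) + len(block) > max_chars:
            chunks.append(current)
            current = ""
        current += block + "\n"
    if current:
        chunks.append(current)
    return chunks
```

The catch is that renames need to stay consistent across chunks, so you’d have to carry a symbol map from one chunk’s answer into the next prompt.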


:face_with_open_eyes_and_hand_over_mouth:

I’ve noticed the same. It’s almost like it becomes overwhelmed by the amount of information and gets tunnel vision, and/or starts hallucinating instead of using the actual, visible data.

Really interested in applying this theory with the 32k context window.
Hopefully we can both explore this soon and compare notes!


I have no idea what your text is about.
Try using phrases like “new AI feature” or “how I used the API”.
As for “our Hammer found an ossified truth buried deep within, like an atom encased in amber”: does that mean you are using old chat bots to write your replies?

So is it actually rolling out, or was it a troll? I can’t really tell now lol. Is 32k out?

I’m so glad someone finally caught that joke. I’ve been using it for demos, and I was starting to think my humor was bad :rofl:

It’s just the norm square of a quantum wave function, and the my_pp_max function is just the maximum of the position vectors squared. I’m not the guy who writes ads for dongle extenders :laughing:
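For the curious, in numpy terms that’s roughly this (a toy example, just my reading of the names):

```python
import numpy as np

# Toy 1-D wave function sampled on a grid
x = np.linspace(-5, 5, 1000)
psi = np.exp(-x**2 / 2) * np.exp(1j * x)  # complex-valued psi(x)

norm_qq = np.abs(psi) ** 2   # |psi|^2, the probability density
my_pp_max = np.max(x ** 2)   # maximum of the position values squared
```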

Did you do anything special to get access to the 32k version? I want to shove my novella in and get ideas for the sequels. :slight_smile:


The only thing I did was sign up as soon as the waitlist webpage went live, which was a few hours before the YouTube demo of GPT-4. I also stated on the waitlist questionnaire that I was most excited about the 32k window size.

I don’t think it’s related to my status as a Regular here in the forum either, since other Regulars don’t seem to have access. It’s more of a first-come, first-served thing, combined with the fact that I said on the waitlist questionnaire that I was excited to check out 32k.


Do you have big plans for utilizing 32k?

I think everyone will start using 32k, especially once the pricing starts to go down.

But right now, I will probably be fat-dumb-and-happy with GPT-4-8k and embeddings for relevancy. I don’t have a strong requirement (yet) for larger context, but I see this as the future, especially as LLMs are starting to push this ceiling up and sub-quadratic-time architectures (Hyena, for example) are rolling out that can operate with large context windows at low latency.

If anything, larger windows provide the LLM with more context without going crazy with embeddings. But I still love embeddings too!
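For anyone who hasn’t tried the embeddings-for-relevancy approach, the core of it is tiny. A sketch with the pre-1.0 openai library (the chunk texts and the question are placeholders):

```python
import numpy as np
import openai  # pre-1.0 SDK; reads OPENAI_API_KEY from the environment

def embed(texts):
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=texts)
    return np.array([d["embedding"] for d in resp["data"]])

chunks = ["chunk one ...", "chunk two ...", "chunk three ..."]  # pre-split document
chunk_vecs = embed(chunks)
query_vec = embed(["What does the report say about latency?"])[0]

# ada-002 embeddings are unit length, so a dot product is cosine similarity.
scores = chunk_vecs @ query_vec
top = [chunks[i] for i in np.argsort(scores)[::-1][:3]]  # 3 most relevant chunks
# `top` is what goes into the 8k prompt instead of the whole document.
```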
