Could GPT-4 Turbo be a quantized version?

It’s obviously a huge leap forward in speed and context size, but the slight degradation in accuracy compared to regular GPT-4 makes me wonder: could OpenAI have taken notes from the open-source community and quantized the model for speed and scalability?
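For anyone unfamiliar with the idea: quantization stores model weights at lower numeric precision (e.g. int8 instead of float32), which shrinks memory footprint and speeds up inference at the cost of a small, bounded reconstruction error. Here's a minimal NumPy sketch of symmetric per-tensor int8 quantization, purely illustrative of the accuracy/speed trade-off; it says nothing about what OpenAI actually does internally:

```python
import numpy as np

def quantize_int8(w: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric per-tensor quantization: float32 -> int8 plus a scale."""
    scale = np.abs(w).max() / 127.0          # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)  # toy "weight" tensor
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# Reconstruction is close but not exact: rounding error is at most
# about scale/2 per weight -- the same flavor of small accuracy loss
# people report anecdotally with quantized open-source models.
print(np.max(np.abs(w - w_hat)))
```

The int8 tensor is a quarter the size of the float32 original, and integer matmuls are much faster on most hardware, which is exactly why the open-source community reaches for this when serving large models.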

I know it’s a bit of a random topic, but I’m curious what you all think. This is all just speculation, of course.