GPT-5.3-Codex-Spark Research Preview with 1000 Tokens per Second

An ultra-fast model for real-time coding rolling out today to ChatGPT Pro users in the Codex app, CLI, and IDE extension.

From the announcement:

Codex-Spark marks the first milestone in our partnership with Cerebras, which we announced in January. Codex-Spark is optimized to feel near-instant when served on ultra-low latency hardware—delivering more than 1000 tokens per second while remaining highly capable for real-world coding tasks.

At launch, Codex-Spark has a 128k context window and is text-only. During the research preview, Codex-Spark will have its own rate limits and usage will not count towards standard rate limits. However, when demand is high, you may see limited access or temporary queuing as we balance reliability across users.

You can just build things!

Apparently, while working on Spark, something was implemented that many more developers will benefit from:

This looks fantastic, but having started several projects with the base Codex, I'd be wary of switching to this model mid-project.

I look forward to my next inspiration and I will give this a go from the get-go! :rocket:

Real-time? :exploding_head: :sweat_smile:

Keen to hear people’s impressions, especially compared to Codex medium and Claude.

Yes, the Spark model’s context window is 128,000 tokens, while 5.2-Codex’s is 400,000. Switching to the faster model mid-session therefore needs some additional management.
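
The window mismatch can be sketched as a simple budget check. This is a hypothetical helper, not part of the Codex CLI; only the 128,000 and 400,000 token figures come from the thread, and the reserve size is an illustrative assumption:

```python
# Hypothetical helper for deciding whether a running session can be
# moved to the smaller-context Spark model. Window sizes are from the
# thread: Spark = 128k tokens, 5.2-Codex = 400k tokens.

SPARK_WINDOW = 128_000
FULL_WINDOW = 400_000

def fits_spark(session_tokens: int, reserve: int = 16_000) -> bool:
    """Return True if the session history plus a reserve for the next
    response still fits in Spark's 128k context window."""
    return session_tokens + reserve <= SPARK_WINDOW

# A fresh session downgrades fine; a long one only fits full Codex.
print(fits_spark(40_000))   # True
print(fits_spark(300_000))  # False: within 400k but not 128k
```

Going the other direction (Spark → full Codex) never hits this check, which is why “upgrading” mid-session is the easier move.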

Makes sense - perhaps “upgrading” is less of an issue? i.e. Spark → normal.

I plan to use it for single, clearly defined tasks.
With some adjustments to the plan, full Codex should supply these to sub-agents to achieve both speed and support for complex implementations.
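
That split could be sketched as a crude router. Everything here is an illustrative assumption, not a real API: the model names, the word-count threshold, and the multi-step heuristic are all hypothetical:

```python
# Hypothetical sketch of the split described above: route single,
# clearly defined tasks to the fast Spark model, and keep complex,
# multi-step work on full Codex. Names and heuristics are assumptions.

def pick_model(task: str, max_quick_words: int = 30) -> str:
    """Crude router: short, single-step task descriptions go to Spark;
    anything longer or multi-step stays on full Codex."""
    multi_step = any(sep in task for sep in (";", " then ", "\n"))
    if multi_step or len(task.split()) > max_quick_words:
        return "full-codex"
    return "codex-spark"

print(pick_model("Rename the config flag in cli.py"))
print(pick_model("Refactor the auth module, then migrate the tests"))
```

In practice the interesting part is the dispatcher: full Codex plans, and hands the small, self-contained pieces to Spark sub-agents for speed.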

The persistent WebSocket connection and targeted optimizations to the Responses API have landed.

As of today, Codex Spark is 30% faster!

Curious whether anyone finds the speed benefit worth any “quality” drop?

With one project running, I find full Codex limits feature throughput; whereas with two or three concurrent “projects”, it is fast enough, since you switch between projects to review and contribute.

“Real time” is definitely very attractive if you are just doing “one project” at a time.