When I ask ChatGPT 5 (with auto reasoning) a question like "what happened in the news today? look at washingtonpost.com", it returns very quickly. When I ask the same question in the Playground using gpt-5, with no system prompt and with reasoning and verbosity set to low, it takes much longer. What is ChatGPT doing that makes it so fast, and how can we build systems that respond as quickly?
If OpenAI runs a front-line router that decides between an immediate generation model and a reasoning model, it might also detect that a request is a plain web search needing no reasoning at all. Such a request could be serviced by gpt-4o, like the Chat Completions gpt-4o-search models, which answer directly with AI-styled web results from injected search context.
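The routing idea above can be sketched as a cheap classifier in front of two models: requests that look like plain web lookups go to a fast search model, everything else to the reasoning model. The keyword heuristic below is purely illustrative (a real router would likely use a small classifier model), and the model names are assumptions:

```python
# Hypothetical front-line router: pick a fast search model for lookup-style
# questions, a reasoning model for everything else. Cues and model names
# are illustrative assumptions, not OpenAI's actual routing logic.

SEARCH_CUES = ("news", "today", "latest", "look at", ".com")

def route(user_message: str) -> str:
    text = user_message.lower()
    if any(cue in text for cue in SEARCH_CUES):
        return "gpt-4o-search-preview"  # immediate answer, no reasoning pass
    return "gpt-5"                      # full reasoning model

print(route("what happened in the news today? look at washingtonpost.com"))
# → gpt-4o-search-preview
print(route("prove that sqrt(2) is irrational"))
# → gpt-5
```

In practice the router's own latency must be near zero, which is why a keyword filter or a tiny classifier makes sense at this layer.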
If you want generations to go faster, set the reasoning effort to minimal and set the Responses API parameter for maximum tool calls to a low figure (and also tell the AI "you only get 3 uses of tools per response, make them count"). Otherwise, gpt-5 does lots of reasoning before generation and can call search tools with many queries, going off to read pages in pieces, turn by turn, and you pay for each round as new input.
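A minimal sketch of those settings, assuming the OpenAI Python SDK's `client.responses.create()` (parameter names may differ across SDK versions; the actual call is commented out so the snippet runs without an API key):

```python
# Hedged sketch: cap reasoning and tool use to keep a Responses API call fast.
# Parameter names assume the OpenAI Python SDK; verify against your version.

request_kwargs = {
    "model": "gpt-5",
    "reasoning": {"effort": "minimal"},   # skip long pre-generation thinking
    "max_tool_calls": 3,                  # cap the web-search round trips
    "tools": [{"type": "web_search"}],
    "instructions": "You only get 3 uses of tools per response; make them count.",
    "input": "What happened in the news today? Look at washingtonpost.com",
}

# from openai import OpenAI
# client = OpenAI()
# response = client.responses.create(**request_kwargs)
# print(response.output_text)
print(request_kwargs["max_tool_calls"])
```

The instruction string duplicates the `max_tool_calls` cap on purpose: the hard limit stops runaway tool loops, while the prompt nudges the model to spend its budget on useful queries.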
ChatGPT responses feel faster because the app adds system prompts, uses optimized settings, and streams partial output right away. In the Playground you're hitting the raw model with none of those optimizations, so it can be slower. To get similar speed, use streaming plus a light system prompt and keep reasoning and verbosity low.
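The streaming part of that advice is mostly about perceived latency: render each text delta as it arrives instead of waiting for the full response. The event names below mirror the OpenAI Python SDK's Responses streaming events (an assumption; exact names vary by SDK version), but the consumer is fed a simulated stream here so the snippet runs without an API key:

```python
# Hedged sketch: a streaming consumer that prints text deltas immediately.
# Event tuples imitate Responses streaming events; a real stream would come
# from something like:
#   with client.responses.stream(model="gpt-5", input=...) as stream: ...

def render_stream(events):
    """Print and collect text deltas as they arrive."""
    chunks = []
    for event_type, delta in events:
        if event_type == "response.output_text.delta":
            print(delta, end="", flush=True)  # user sees text right away
            chunks.append(delta)
    return "".join(chunks)

# Simulated events standing in for a live API stream.
fake_events = [
    ("response.output_text.delta", "Top story: "),
    ("response.output_text.delta", "markets rallied today."),
    ("response.completed", None),
]

result = render_stream(fake_events)
print()
# result == "Top story: markets rallied today."
```

Even when total generation time is unchanged, time-to-first-token drops to near zero, which is most of what makes ChatGPT feel fast.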