Temperature in GPT-5 models

Would you say include_obfuscationto false could be an option in Responses API under include array sometime in the future?

Or they only use obfuscation with the Chat Completions API? I see they use a lot of bandwidth with json stuff, IMHO json responses could be more minimal without so much repetition specially concerning streamed responses..

PS: Sorry I didn’t test thoroughly as I haven’t got a lot of API credit so I am speculating with seniors who have seen lots of json reponses

include_obfuscation is a new parameter that lets you shut off an additional key during streaming that has a string of randomization and an adaptive lengthening of the streaming chunk. It is offered on Chat Completions.

The include_obfuscation parameter is also on responses API, but only in the “get model response by id” as a URL query parameter, not in normal response streaming. That’s the endpoint for using an existing response_id in the URL path to get a previous or background AI output, when also using stream to get it. But: it’s not an accepted parameter when you POST a request and SSE stream like normal, and Responses will give you this key regardless.

It looks like it comes along with text content events without shutoff capability now, but your tool call doesn’t get the treatment of this obfuscation method against a side-channel attack on data security:

ResponseTextDeltaEvent(content_index=0, delta='Yes', item_id='msg_1234', logprobs=[], output_index=0, sequence_number=4, type='response.output_text.delta', obfuscation='vhJAhujwn4Urs')

ResponseTextDeltaEvent(content_index=0, delta=',', item_id='msg_1234', logprobs=[], output_index=0, sequence_number=5, type='response.output_text.delta', obfuscation='rL5vR1oXv0ehQNX')

If you’re using Responses, that’s a sign that you care little about bandwidth and want multiple copies of the full response sent to you.

What is peculiar is they say “normalize payload length” as what it does. If it was a completed previous response you are streaming by calling GET with ID, it should not be in token sized chunks, but can be delivered to you in evenly-sized bytes or glyphs. If it was normalizing the length token strings, the length that it extends a responses doesn’t have coverage of those 100 hyphen tokens…

Thanks for the comments and clarification. I wanted to be sure, but yeah, I read the docs as you said but there are many moving parts to heed.. Thanks a lot!

In my experience, gpt-5-mini and nano supported temperature but with default value “1” which is useless if you want only deterministic answers and no-hallucination.

Does “GPT-5-chat” support temperature values of “0” and “logprobs” parameter?

I made this by trying.

gpt-5-chat-latest allowed parameters

feature supported
store yes
frequency_penalty yes
presence_penalty yes
max_tokens yes
temperature yes
top_p yes
logit_bias yes
stream yes
stream_options.include_obfuscation yes
stream_options.include_usage yes
stop yes
response_format:json_object yes
response_format:schema no
service_tier no
reasoning_effort no
logprobs no
tools no
modalities no

i am using gpt-5-nano and its supported. only that only one value (1) if not integers are supported. below i tried to use a temperature of 0.1.

File “C:\Python313x64\Lib\site-packages\openai_base_client.py”, line 1594, in request
raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {‘error’: {‘message’: “Unsupported value: ‘temperature’ does not support 0.1 with this model. Only the default (1) value is supported.”, ‘type’: ‘invalid_request_error’, ‘param’: ‘temperature’, ‘code’: ‘unsupported_value’}}

Is there any update to temperature and frequency_penality not working?

I am currently using 4o-mini and 5-nano was recommewnded as a replacement. That clearly isn’t the case.

Thank you in advance.

Thanks for your interest in setting sampling parameters and keeping this topic alive, parameters which, on normal models, can be useful in getting top token choices in classifications and judgements, instead of random alternates, along with error-free code output.

gpt-5 is a reasoning model, like o1, o3 and o4-mini (that serve a parallel purpose and are quite useful too).

OpenAI doesn’t allow modifying the sampling parameters on these reasoning models with internal dialogs. There is no justification to be found, though.

One can speculate that part of the routing and internal classification of tasks is also in adjusting the sampling of results automatically. For models that produce and propose hypothetical solutions to themselves, novel and unexpected prediction branches can be inspected and retried or indeed may be a “best”. Thus, by self-observation, the model also may be made more confident in its final production.


Regarding a recommendation you found: I would not use gpt-5-nano for much of anything. Its quality tops out quickly while producing tons of tokens of reasoning deliberation, and gpt-5-mini is just better for not much more cost, and o4-mini will likely meet the mini-need even better.

Yeah, this is something I noticed too. GPT-5 models are specifically designed for reasoning, so some parameters like temperature don’t really apply the same way they do in “traditional” completion models. If your request gets handed off to a non-reasoning model under the hood (like GPT-4.1), that’s when parameters like temperature might actually take effect.

Basically, with GPT-5, the idea is that the model is already trying to provide coherent and thoughtful responses, so tweaking randomness through temperature isn’t as meaningful. For anything where you really need that old-school sampling control, you’d want to explicitly target a model that supports it (like 4.1).

Hope that clarifies a bit! Just my two cents from playing around with it.

With due respect, 5-nano is being touted and pushed as a drop in replacement for 4o-mini in cost and performance. The cost portion is certainly true, the the performance aspect is clearly false. The fact that the api just doesn’t “eat” the unused parameters is a clear example of that. It breaks code instead.

I received an email that 4o-mini was going to be retired soon… That is a problem is there isn’t a reasonably comparable replacement that DOESN’T break code.

Thank you.

gpt-4o-mini-2024-07-08 is not on the deprecations list. It has no immediate snapshot replacement in the same family that could put it on a fast track deprecation.

https://platform.openai.com/docs/deprecations#overview


Compare the pricing, where I at least show gpt-5-mini here to be on-topic (where you will have higher expense than hinted because of reasoning billed as output pricing also.)

Yes, they do. Just migrated back to 4.1 because we couldn’t get reliable behavior with GPT 5.

Eliminating the temperature setting was annoying as well, as it broke our code, quite unexpectedly.

Ok, now I’m getting nervous…

We just migrated back to 4.1 with our usecase, because of insufficient control over GPT-5-nano.

Is there a 4.1 snapshot that will be there for the foreseeable future? (Is there a difference to the models provided by MS via azure? As that’s what we’re actually using.)

gpt-5-nano is just bad in general. You’ll wait and pay more anyway while it makes 5x as many reasoning tokens for poor answers; gpt-5-mini is the minimum to be considered.

Deprecation: the earliest gpt-4o is still around, along with gpt-4 from March 2023 or gpt-4-turbo from Nov 2023, so there’s no signs the newest non-reasoning instruction-following model from 2025 would go away.

Edit: almost like they were reading this post: gpt-4-0314 (2023) and the “turbo” gpt-4-1106 (2023) plus gpt-4-0125` (2024) are getting shut off in under six months.