Stochastic nature of GPT-4 Turbo

Hi Community,

You may have witnessed this already in your own trials, so I’m hoping to gain some insight. I’m using GPT-4 Turbo on an Azure PTU for a RAG system, and in a recent trial I noticed that when I ask the same question with the same retrieved context on the same endpoint, I get different responses even with temperature set to 0.

Any idea on why this may be happening and how to make the responses more consistent?

Make sure you also set these params to be sure:

    top_p: 1,
    frequency_penalty: 0,
    presence_penalty: 0,

Other than that, I’d suggest giving it more granular data or a more granular task. You may be giving it data that is too broad, or asking a question that is too broad, for it to give a consistent outcome. I have managed to get pretty consistent outcomes about 90–98% of the time. If you’re looking for it to respond the same way 100% of the time, you should feed it data and a task where it can narrow its answers down to a small subset, e.g. truthy values or numeric values.

If you want it to summarize something the exact same way every time I haven’t figured out how to do that.
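To make the advice above concrete, here is a minimal sketch of a request payload with every sampling parameter pinned, so that only model-side nondeterminism remains. The deployment name, system prompt, and question here are hypothetical placeholders, not taken from the thread:

```python
# Sketch: pin all sampling parameters for maximum consistency.
# The model/deployment name below is a hypothetical placeholder.

def build_chat_request(system_prompt: str, user_question: str, context: str) -> dict:
    """Assemble a chat-completions payload with sampling fully pinned."""
    return {
        "model": "gpt-4-turbo",   # hypothetical deployment name
        "temperature": 0,          # greedy decoding
        "top_p": 1,                # no nucleus truncation
        "frequency_penalty": 0,
        "presence_penalty": 0,
        "messages": [
            {"role": "system", "content": system_prompt},
            # Keep retrieved context and question byte-identical between runs;
            # even trailing whitespace changes can alter the completion.
            {"role": "user", "content": f"{context}\n\n{user_question}"},
        ],
    }

payload = build_chat_request(
    "Answer only from the provided context.",
    "What is the refund policy?",
    "<retrieved chunks go here>",
)
```

Newer API versions also expose a best-effort `seed` parameter plus a `system_fingerprint` in the response for detecting backend changes, which is worth trying if your API version supports it.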


Hello, thanks for the quick response. I’ll double-check the code, but those params should already be set to the values you highlighted.

I understand the stochastic nature of LLMs, so there’ll be some variability. However, what’s interesting is that earlier, when we were testing GPT-4 Turbo on multiple PayGo endpoints rather than a single PTU endpoint, the responses with the same implementation (parameters, system prompt, retrieved context, etc.) were more consistent.

In some cases, just adding a period at the end of the user question seemed to make responses on the single PTU endpoint more consistent.