I’m building an LLM-powered application that processes product description files and uses a prompt to summarize the task. I have an OpenAI account with one API key, and I’m running the same application on different machines using this shared API key. When I run the application on a single machine, the responses are accurate and as expected. However, when multiple machines use the same API key in parallel, the results become inconsistent and erratic.
Is there a known issue with the OpenAI API when it’s accessed in parallel by different servers using the same API key?
OpenAI LLMs are not deterministic, even at temperature 0. So if you run the same prompts on multiple machines, it is expected that you will get slightly different results.
If you always want the "best" result for an input, meaning responses that start similarly (but may still diverge), you would set top_p: 0.00001.
A seed parameter with a fixed value, re-supplied on every API call, will re-run the sampler with the same source of randomness (randomness essentially already turned off by the top_p above); but since the underlying computations are not bit-identical from run to run (and they aren't), the seed doesn't carry as much meaning as you'd hope.
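As a minimal sketch of putting those three knobs together (this assumes the official `openai` Python SDK and a placeholder model name; swap in whatever model and prompt your application actually uses):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Request the most repeatable output the API offers: temperature 0,
# a tiny top_p so only the single most likely token survives nucleus
# sampling, and a fixed seed re-supplied on every call.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; use the model your app calls
    messages=[
        {"role": "user", "content": "Summarize this product description: ..."}
    ],
    temperature=0,
    top_p=0.00001,
    seed=12345,
)

print(response.choices[0].message.content)
```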
Thus, inconsistent responses are expected responses. You can resend the same request and possibly get a better answer, or different brainstorming ideas.
You can read a bit more on this elsewhere; OpenAI hasn't come out and directly explained the technical reason for the varying logits and dimensions seen on its language and embedding models since gpt-3.5+.
You can also look at the system_fingerprint returned in an API response to see whether repeatability should not be expected at all, for example because the model was served by a different backend configuration (some models have varied across as many as five fingerprints in large trials).
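As a rough illustration (same `openai` Python SDK and placeholder model as above), you can compare the system_fingerprint across repeated, seeded calls; if it differs, the requests were served by different backend configurations and identical outputs shouldn't be expected even with a fixed seed:

```python
from openai import OpenAI

client = OpenAI()

# Issue the same seeded request twice and collect the backend fingerprints.
fingerprints = set()
for _ in range(2):
    r = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": "Summarize: ..."}],
        temperature=0,
        seed=12345,
    )
    fingerprints.add(r.system_fingerprint)

# More than one fingerprint means the backend configuration changed,
# so byte-for-byte repeatability is not guaranteed even with the seed.
if len(fingerprints) > 1:
    print("Served by different backend configurations:", fingerprints)
```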