Hallucinated Responses in /completions Streaming API (GPT-4o)
Context:
While using the /completions API with streaming enabled for GPT-4o at temperature 0.4, we encountered a problematic behavior in which part of the response degenerated into repetitive, looping phrases. The output also included nonsensical strings, diverging entirely from the provided input context.
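For reference, the request is roughly equivalent to the following minimal sketch (using the openai Python SDK's chat-completions streaming interface; the placeholder prompt stands in for our real payload):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Stream a completion and print the incremental deltas as they arrive.
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "<our actual payload goes here>"}],
    temperature=0.4,
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```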
Key Details:
- No Correlation to Payload: The hallucinated response is unrelated to the context provided in the payload.
- Repetitive Tokens: Portions of the response were generated over and over, and this looping continued for many tokens, rendering the response unusable (see the heuristic sketched after this list).
- Irregular Occurrence: The issue could not be reproduced with the same input prompt and settings (tested at both temperature 0.4 and 1).
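To make "repetitive" concrete, a simple n-gram counter like the one below flags the affected responses; the window size and repeat threshold are arbitrary values we chose, not anything from the API:

```python
from collections import Counter

def looks_looped(text: str, ngram: int = 6, max_repeats: int = 4) -> bool:
    """Heuristic: flag text in which any n-word sequence repeats more than max_repeats times."""
    words = text.split()
    grams = Counter(tuple(words[i:i + ngram]) for i in range(len(words) - ngram + 1))
    return any(count > max_repeats for count in grams.values())
```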
Possible Factors:
- Streaming Context Handling: Could there be an intermittent issue in how streaming requests manage token buffers or state transitions? (See the diagnostic sketch after this list.)
- Low Temperature: Despite the low temperature (0.4), where output is expected to be relatively deterministic, the model generated nonsensical, repetitive tokens. Testing at a higher temperature (e.g., 1) did not reproduce the problem.
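If it helps with diagnosis, we can instrument our consumer roughly as follows to capture the final finish_reason alongside the assembled text, which should show whether a looping response ends with a normal "stop" or is cut off at "length" (field names follow the openai Python SDK):

```python
def collect_stream(stream):
    """Accumulate streamed deltas and record how the stream terminated."""
    parts, finish_reason = [], None
    for chunk in stream:
        choice = chunk.choices[0]
        if choice.delta.content:
            parts.append(choice.delta.content)
        if choice.finish_reason is not None:
            finish_reason = choice.finish_reason  # e.g. "stop" vs. "length"
    return "".join(parts), finish_reason
```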
Steps Taken:
- Verified Input Context: The payload provided to the model did not contain any information resembling the hallucinated output.
- Reproduction Attempts: Retried the same API call with identical parameters; the issue did not recur.
What steps can be taken to prevent such issues, and how can this behavior occur in the first place?
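For now we are considering a client-side stopgap along the following lines: use the API's frequency_penalty parameter to discourage repetition, and retry once when the loop heuristic above still fires (the penalty value and attempt count are our own guesses):

```python
def guarded_completion(client, messages, max_attempts: int = 2) -> str:
    """Request a streamed completion, discouraging repetition and retrying on a detected loop."""
    for _ in range(max_attempts):
        stream = client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            temperature=0.4,
            frequency_penalty=0.5,  # discourage verbatim token repetition
            stream=True,
        )
        text, _reason = collect_stream(stream)  # helper sketched above
        if not looks_looped(text):              # heuristic sketched above
            return text
    raise RuntimeError("response still looping after retries")
```

Is something like this a reasonable stopgap, or is there a recommended mitigation on the API side?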
Environment Details:
- Model: GPT-4o
- Temperature: 0.4 (issue occurred); also tested with 1
- Streaming: Enabled
- Platform: Python FastAPI