GPT-4o and GPT-4 API output has typos/missing words

We are using GPT-4 and GPT-4o for a bot that streams responses. As shown below, the response from the GPT API has numerous typos. We have tried a low temperature of 0.2 and a high temperature of 0.7 without any impact.

I’m sorry, Andy I’m really reluctant to to dealership. I spent a bit of time during my last visit and it’s frustrating that this completed then. Can explain exactly what needs be done and why wasn’t taken care of my previous visit? , are there any alternative that don’t require me come back in?

4 Likes

Is this a direct call to the official OpenAI endpoints, or via a proxy?

I must say that is very poor quality output …

Facing the same issue here. It looks like it doesn't depend on the model (3.5, 4o, 4 Turbo, etc.): the response streaming is bugged.

1 Like

That makes a lot of sense!

@chinmay1 if you turn OFF streaming does the output improve?

It’s a direct API call (no proxy). We need streaming; without it the latency is huge.

Facing the same issue here. Please fix it.

We are having the same issue here, and it is especially critical for use cases (such as ours) that require JSON output, since the model will never be able to format the output properly. For now, we’ve switched to Google’s Gemini API.

I am suspecting that the OpenAI documentation is lacking and that we may not be processing tokens properly. I have asked my engineering team to look into it. My assumption is that the stream arrives as UTF-8 bytes, which we then convert into letters and words. I don’t know for sure, but if that’s the case we may need to set up a buffer.
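One way a missing buffer could corrupt streamed text: a multi-byte UTF-8 character can be split across two stream chunks, and decoding each chunk independently mangles it. A minimal sketch (not the poster's actual code) using the standard `TextDecoder`, whose `{ stream: true }` option buffers the incomplete trailing bytes until the rest arrives:

```javascript
// "é" encodes to 2 bytes, so "café" is 5 bytes in UTF-8.
const bytes = new TextEncoder().encode("café");
const part1 = bytes.slice(0, 4); // ends in the middle of "é"
const part2 = bytes.slice(4);    // the remaining byte of "é"

// Naive: decode each chunk on its own.
// The split character becomes replacement characters: "caf��"
const naive =
  new TextDecoder().decode(part1) + new TextDecoder().decode(part2);

// Buffered: keep ONE decoder alive across chunks and pass
// { stream: true } so it holds incomplete bytes until the next call.
const decoder = new TextDecoder();
const buffered =
  decoder.decode(part1, { stream: true }) + decoder.decode(part2);
// buffered === "café"
```

Note this only explains single-character corruption, not whole missing words, so it may be just part of the picture.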

Please see my response below. Have you checked buffer handling?

That’s not the point of the suggestion. The suggestion is meant to help understand the source of the issue.

1 Like

I also thought this could be the case at first.
But for us it apparently started all of a sudden yesterday; before that we never had this problem.
I’ve personally investigated the issue, and the chunks sent by the stream are poorly formatted, which is effectively an error from the API.

Started for us yesterday too; we first noticed it in the late evening.

How long does it take to switch to Gemini? Is it just a matter of swapping the API keys and calling different endpoints, or is it something more involved?

We already had a “connector” for that, so switching was not a struggle.
But if I’m not mistaken, Google also recently announced an OpenAI-compatible API, meaning that just by changing the base URL and API keys to Google’s, it should work.
We’re not using that, though, so I’m not sure.
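For what it’s worth, with the official `openai` Node library the switch described above would amount to roughly this (a sketch, not something the posters above have verified; the environment-variable name is a placeholder, and you should confirm the base URL against Google’s current docs):

```javascript
import OpenAI from "openai";

// Same client library, same chat.completions API surface;
// only the base URL and API key change to point at Gemini's
// OpenAI-compatible endpoint.
const client = new OpenAI({
  apiKey: process.env.GEMINI_API_KEY, // placeholder: your Google AI key
  baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/",
});
```

Requests would then use a Gemini model name (e.g. `gemini-1.5-flash`) in place of a GPT model name.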

By the way, what is the tech stack you’re using there?

We’re using Deno here…

I am not an engineer, so I don’t know what a tech stack (Deno) means. If you can explain in one or two sentences, I can get you the answer from my engineering team. We are on an Azure/Node application.

Oh, you’ve answered already. Deno is a JavaScript runtime, similar to the Node.js you are using. With the tech-stack question I basically wanted to know the programming language you are using for the application, and whether you are also using any ‘libs’ (short for ‘libraries’) on top of it to interact with the AI APIs. For example, Node.js has an openai lib that facilitates communicating with their API (we’re not using it either, though), and there are other well-known ones, especially LangChain.

1 Like

I fixed it for our PHP app by changing the library from orhanerday/open-ai to openai-php/client.

2 Likes

I think we have an issue at our end too, which is putting the errors from OpenAI on steroids. We’re trying to fix that and will post an update.

Yeah, same here.

It turns out we’ve worked around the issue. In our case, the objects in the stream were being cut in the middle, and the remainder of the object would arrive in the next chunk (even though it was the same object).
We created a buffer for the case where an object from a streamed chunk cannot be properly parsed: we wait for the next chunk and join the partial chunks before handling them.
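The workaround described above can be sketched roughly as follows (a minimal illustration, not the poster’s actual code; the function and variable names are mine):

```javascript
// Accumulate text that failed to parse, prepend it to the next
// chunk, and only emit an object once JSON.parse succeeds.
let buffer = "";

function handleChunk(chunk, onObject) {
  buffer += chunk;
  try {
    onObject(JSON.parse(buffer)); // the whole object has arrived
    buffer = "";                  // reset for the next object
  } catch {
    // Incomplete JSON: keep buffering until the remainder arrives.
  }
}

// Example: one stream object split across two chunks.
const seen = [];
handleChunk('{"choices":[{"delta":{"con', (o) => seen.push(o));
handleChunk('tent":"Hi"}}]}', (o) => seen.push(o));
// seen now holds the single reassembled object.
```

A production version would also need to handle several complete objects arriving inside one chunk, but this shows the buffering idea.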

Up until yesterday we had never had this issue, so something changed on OpenAI’s side.
Our handling is now a bit more complex, and we had never needed it before, either with the OpenAI API or with other providers, but it seems to be working.

2 Likes

Same, although our solution may differ. I will collect info from my engineering team and post it.

1 Like