GPT-4o and GPT-4 API output has typos/missing words

We are using GPT-4 and GPT-4o for a bot that streams responses. As shown below, the response from the GPT API has numerous typos. We have tried a low temperature of 0.2 and a high temperature of 0.7 without any impact.

I’m sorry, Andy I’m really reluctant to to dealership. I spent a bit of time during my last visit and it’s frustrating that this completed then. Can explain exactly what needs be done and why wasn’t taken care of my previous visit? , are there any alternative that don’t require me come back in?

4 Likes

Is this a direct call to the official OpenAI endpoints, or via a proxy?

I must say that is very poor quality output …

Facing the same issue here. It looks like it doesn't depend on the model (3.5, 4o, 4 Turbo, etc.): the response streaming is bugged.

1 Like

That makes a lot of sense!

@chinmay1 if you turn OFF streaming does the output improve?

It’s a direct API call (no proxy). We need streaming; without it the latency is huge.

Facing the same issue here. Please fix it.

We are having the same issue here, and it is especially critical for use cases (such as ours) that require JSON output, since the model will never be able to format the output properly. For now, we’ve switched to Google’s Gemini API.

I am suspecting that the OpenAI documentation is lacking and that we may not be processing tokens properly. I have asked my engineering team to look into it. My assumption is that the stream arrives as UTF-8 bytes, which we then convert into letters and words. I don’t know for sure, but if that’s the case we may need to set up a buffer.
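One way a missing buffer could corrupt streamed text: a multi-byte UTF-8 character can be split across two stream chunks, and decoding each chunk independently mangles it. A minimal sketch (not the poster's actual code) using the standard `TextDecoder`, whose `{ stream: true }` option buffers the incomplete trailing bytes until the rest arrives:

```javascript
// "é" encodes to 2 bytes, so "café" is 5 bytes in UTF-8.
const bytes = new TextEncoder().encode("café");
const part1 = bytes.slice(0, 4); // ends in the middle of "é"
const part2 = bytes.slice(4);    // the remaining byte of "é"

// Naive: decode each chunk on its own.
// The split character becomes replacement characters: "caf��"
const naive =
  new TextDecoder().decode(part1) + new TextDecoder().decode(part2);

// Buffered: keep ONE decoder alive across chunks and pass
// { stream: true } so it holds incomplete bytes until the next call.
const decoder = new TextDecoder();
const buffered =
  decoder.decode(part1, { stream: true }) + decoder.decode(part2);
// buffered === "café"
```

Note this only explains single-character corruption, not whole missing words, so it may be just part of the picture.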

Please see my response below. Have you checked buffer handling?

That’s not the point of the suggestion. The suggestion is meant to help understand the source of the issue.

1 Like

I also thought this could be the case at first.
But for us it apparently started all of a sudden yesterday; before that we never had this problem.
I’ve personally investigated the issue, and the chunks sent by the stream are poorly formatted, which is effectively an error from the API.

Started for us yesterday too; we first noticed it in the late evening.

How long does it take to switch to Gemini? Is it just a matter of swapping the API keys and calling different endpoints, or is it something more involved?

We already had a “connector” for that, so switching was not a struggle.
But if I’m not mistaken, Google also recently announced an OpenAI-compatible API, meaning that just by changing the base URL and API keys to Google’s, it should work.
We’re not using that, though, so I’m not sure.
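For what it’s worth, with the official `openai` Node library the switch described above would amount to roughly this (a sketch, not something the posters above have verified; the environment-variable name is a placeholder, and you should confirm the base URL against Google’s current docs):

```javascript
import OpenAI from "openai";

// Same client library, same chat.completions API surface;
// only the base URL and API key change to point at Gemini's
// OpenAI-compatible endpoint.
const client = new OpenAI({
  apiKey: process.env.GEMINI_API_KEY, // placeholder: your Google AI key
  baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/",
});
```

Requests would then use a Gemini model name (e.g. `gemini-1.5-flash`) in place of a GPT model name.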

By the way, what is the tech stack you’re using there?

We’re using Deno here…

I am not an engineer, so I don’t know what a tech stack (Deno) means. If you can explain in one or two sentences, I can get you the answer from my engineering team. We are on an Azure/Node application.

Oh, you’ve answered already. Deno is a JavaScript runtime, similar to the Node.js you are using. With the tech-stack question I basically wanted to know the programming language you are using for the application, and whether you are also using any ‘libs’ (short for ‘libraries’) on top of it to interact with the AI APIs. For example, Node.js has an openai lib that facilitates communicating with their API (we’re not using it either, though), and there are other well-known ones, especially LangChain.

1 Like

I fixed it for our PHP app by changing the library from orhanerday/open-ai to openai-php/client.

2 Likes

I think we have an issue at our end too, which is putting the errors from OpenAI on steroids. We’re trying to fix that and will post an update.

Yeah, same here.

It turns out we’ve worked around the issue. In our case, the objects in the stream were being cut in the middle, and the remainder of the object would arrive in the next chunk (even though it was the same object).
We created a buffer for the case where an object from a streamed chunk cannot be properly parsed: we wait for the next chunk and join the partial chunks before handling them.
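The workaround described above can be sketched roughly as follows (a minimal illustration, not the poster’s actual code; the function and variable names are mine):

```javascript
// Accumulate text that failed to parse, prepend it to the next
// chunk, and only emit an object once JSON.parse succeeds.
let buffer = "";

function handleChunk(chunk, onObject) {
  buffer += chunk;
  try {
    onObject(JSON.parse(buffer)); // the whole object has arrived
    buffer = "";                  // reset for the next object
  } catch {
    // Incomplete JSON: keep buffering until the remainder arrives.
  }
}

// Example: one stream object split across two chunks.
const seen = [];
handleChunk('{"choices":[{"delta":{"con', (o) => seen.push(o));
handleChunk('tent":"Hi"}}]}', (o) => seen.push(o));
// seen now holds the single reassembled object.
```

A production version would also need to handle several complete objects arriving inside one chunk, but this shows the buffering idea.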

Up until yesterday we had never had this issue, so something changed on OpenAI’s side.
Our handling is now a bit more complex, and we had never needed it before, either with the OpenAI API or with other providers, but it seems to be working.

2 Likes

Same, although our solution may differ. I will collect info from my engineering team and post it.

1 Like