4o and 4 API output has typos/missing words

Just came to the forum to find this post. Also experiencing the same issue. Our partial parsing solution worked until today. Haven’t deployed anything to the system in question since the last time it worked, so the change is definitely on OpenAI’s end.

We are seeing this issue left and right too. The JSON is not coming through properly. We are still trying to fix our code. Will keep you posted.

BTW, I created a ticket on this last night. Got a canned response from the Helpdesk. They have still not acknowledged the problem. Not sure how to bring it to the OpenAI admins’ notice.

I also created a ticket. That’s probably the best we can do to get OpenAI’s attention. If you’re running into this issue, click Help in the dashboard and report an API bug with as much detail as you can.

1 Like

That’s what I did last night and again this morning, as I got no response from them. In the past they were very responsive. Not so much anymore. Also, I am getting a lot of “Error” emails from OpenAI today, but they don’t explain what those errors are. So maybe they know about the issue and are fixing it.

1 Like

Ah, this turned out to be the issue for us. We built our project before the OpenAI Node client lib handled streaming calls, and there turned out to be an error in our partial JSON parsing logic that wasn’t revealed until OpenAI made a change on their end in the last couple of days.

In our case, I wasn’t emptying the buffer object after a successful parse. Once I made that change, I stopped seeing dropped characters. I remember having found some Python code deep in some old docs that I had (obviously incorrectly) rewritten into JS to handle this originally.

So the solution would probably be: inspect your streaming call code, use an official lib, or check out the Vercel AI SDK if you’re using JS/TS. Those options will handle the streaming responses for you.
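To make the buffer fix concrete, here’s roughly the shape of it (a simplified sketch with made-up names, not our actual code): accumulate raw chunks in a buffer, and drop the consumed bytes after each complete event. Forgetting that reset is exactly the kind of bug that stays hidden until the chunk boundaries move.

```javascript
// Hypothetical SSE-style accumulator. The key line is the one that
// trims the buffer after consuming an event -- skip it and you re-read
// or drop characters whenever chunk boundaries shift.
function makeEventParser(onEvent) {
  let buffer = "";
  return function feed(chunk) {
    buffer += chunk;
    let idx;
    while ((idx = buffer.indexOf("\n\n")) !== -1) {
      const payload = buffer.slice(0, idx);
      buffer = buffer.slice(idx + 2); // crucial: drop what we've consumed
      if (payload.startsWith("data: ")) {
        const data = payload.slice(6);
        if (data !== "[DONE]") onEvent(JSON.parse(data));
      }
    }
  };
}

// Usage: the event survives even when one JSON payload is split across chunks.
const parts = [];
const feed = makeEventParser((e) => parts.push(e.delta));
feed('data: {"delta":"Hel');
feed('lo"}\n\ndata: {"delta":" world"}\n\n');
console.log(parts.join("")); // "Hello world"
```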

1 Like

Thanks to everyone coming back to share workarounds.

Trying to find out if anything changed, but we may never know.

Appreciate the attempts to help each other, though. That’s what makes this place so special.

Actually, I spoke too soon! Still seeing this issue, even after fixing the partial JSON parsing. Going to try switching to one of the official client libs to see if that sorts us out.

2 Likes

Here is what my engineering team is trying:

const OpenAI = require("openai");
const openai = new OpenAI({
  apiKey: <OPENAI_KEY>,
});
 
async function init() {
  const stream = await openai.chat.completions.create({
    model: "gpt-4o",
    temperature: 0.8,
    messages: [
      {
        role: "system",
        content: `User will provide a topic and you need to create a sales Pitch`,
      },
      {
        role: "user",
        content: `Provide a sales pitch for google search engine`,
      },
    ],
    stream: true,
  });
  let dataAns = "";
  for await (const part of stream) {
    if (part.choices[0]?.delta?.content) {
      dataAns = `${dataAns}${part.choices[0]?.delta?.content}`;
      const regex = /(.*[\.\?])$/gm;
      let found = dataAns.trim().match(regex);
      if (found) {
        dataAns = "";
        console.log("found", found);
        const textMessage = {
          text: found[0],
        };
      }
    }
    if (part?.choices[0].finish_reason === "stop") {
      console.info("finished the stream now ending the connection");
      console.log("dataAns", dataAns);
    }
  }
}
init();

Seeing how there aren’t thousands of similar issues, I’d wager that there are some edge cases here.

  1. Have you been receiving single tokens before, or batches of variable sizes?

  2. Are you using unofficial or outdated libraries?

1 Like

This code does not look healthy.

It seems like you’re trying to capture full sentences?

I’m on my phone but I believe I see an issue.

You are using regex, for some reason, to capture a sentence ending in a full stop or question mark.

Then you are naively taking all the content up to and including the match, and discarding any additional tokens that came after it, causing words to be lost.
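Roughly, the pattern is this (a toy buffer, not your exact code): after pulling out the complete sentences, the unmatched tail has to be kept, not thrown away.

```javascript
// Toy version of the bug: the delta buffer holds one complete sentence
// plus the start of the next one, which arrived in the same chunk.
let buf = "First sentence. Second sen"; // "tence." hasn't arrived yet

// Greedy match up to the last full stop / question mark.
const m = buf.match(/(.*[.?])/s);
if (m) {
  const complete = m[1];            // "First sentence."
  // Resetting buf to "" here would silently drop " Second sen".
  buf = buf.slice(complete.length); // keep the unmatched tail instead
}
console.log(JSON.stringify(buf)); // " Second sen"
```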

This was WIP code I posted, and it had that error. But the original error is not this. The original error is that we are not getting a proper JSON response. Instead of getting one full JSON response, it’s being split into two.

Here is the revised code:

const response: any = await openai.chat.completions.create({
  model: "gpt-4o",
  temperature,
  messages: messageList,
  stream: true,
});

let finalString = '';
let lastProcessedIndex = 0;
let fullStop = false;
let endCall = false;

for await (const chunk of response) {
  if (chunk.choices[0]?.delta?.content) {
    finalString += chunk.choices[0].delta.content;
    
    const regex = /.*?[.!?](?:\s|$)/g;
    let match;
    
    while ((match = regex.exec(finalString.slice(lastProcessedIndex))) !== null) {
      const sentence = match[0];
      const sentenceEndIndex = lastProcessedIndex + match.index + sentence.length;
      
      console.log('Sentence found:', sentence.trim());
      
      const { text, containsPattern } = this.checkEndCall(sentence);
      
      if (containsPattern) {
        console.log('containsPattern:', containsPattern);
        automotiveCache[chatObject.userId]["data"].push(text.trim());
        endCall = true;
      } else {
        automotiveCache[chatObject.userId]["data"].push(sentence.trim());
      }
      
      lastProcessedIndex = sentenceEndIndex;
      fullStop = true;
    }
  }
  
  if (chunk?.choices[0].finish_reason === "stop") {
    console.info("Finished the stream, ending the connection");
    
    if (!fullStop && lastProcessedIndex < finalString.length) {
      const remainingText = finalString.slice(lastProcessedIndex);
      const { text, containsPattern } = this.checkEndCall(remainingText);
      
      automotiveCache[chatObject.userId]["data"].push(text.trim() + ".");
      
      if (containsPattern) {
        endCall = true;
      }
    }
    
    automotiveCache[chatObject.userId]["endCall"] = endCall;
    automotiveCache[chatObject.userId]["finish"] = true;
    
    const botResponse = { "sender": "bot", "content": finalString };
    
    if (automotiveConversationCache[chatObject.userId]?.convo?.messages?.length) {
      automotiveConversationCache[chatObject.userId].convo.messages.push(botResponse);
    }
    
    if (!automotiveConversationCache[chatObject.userId]) {
      const searchQuery2: any = { "accountId": chatObject.accountId, "userId": chatObject.userId, "status": 0 };
      const check1 = await ConversationAutomotiveModel.findOne(searchQuery2);
      const check2 = await ConversationAutomotiveModel.findOne(searchQuery);
      
      if (!check1 && !check2) {
        return { statusCode: 400, message: "Bad Request" };
      }
    }
    
    setTimeout(async () => {
      console.log('Saving bot response to db:', botResponse);
      await ConversationAutomotiveModel.findOneAndUpdate(searchQuery, { $push: { messages: botResponse }, $set: { modifiedOn: chatObject.modifiedOn } });
    }, 500);
    
    automotiveCache[chatObject.userId]['dataString'] = "";
  }
}

FWIW, it does not appear that anything is wrong from the API standpoint in my context in Python.

import asyncio

from openai import OpenAI

client = OpenAI()

DEPLOYMENT_MODEL = "gpt-4o"

async def async_main():
    messages = [{"role": "system", "content": "User will provide a topic and you need to create a sales Pitch"},
                {"role": "user", "content": "Provide a sales pitch for google search engine"}
            ]

    stream = client.chat.completions.create(messages=messages, model=DEPLOYMENT_MODEL, stream=True)

    full_delta_content = "" # Accumulator for delta content to process later

    # Process the stream response for tool calls and delta content
    for chunk in stream:
        delta = chunk.choices[0].delta if chunk.choices and chunk.choices[0].delta is not None else None

        if delta and delta.content:
            full_delta_content += delta.content
            
    print(full_delta_content)

def main():
    asyncio.run(async_main())

if __name__ == "__main__":
    main()

I would suggest starting with the basics.

2 Likes

After I got into the debugger, it turned out that the issue DID lie in our JSON parsing logic, still. Our apps using the latest official client libs weren’t affected. It was related to the issue @vfssantos posted about above.

For anybody else running into this, I’d suggest either upgrading to the latest version of the client lib, or running through your stream parsing logic in a debugger.

1 Like

But something changed in the output from the API. Earlier, such bifurcation of the JSON used to happen once in a while, but since yesterday it has happened at the start of every response. So when the bot responds with 3 to 5 sentences, the first couple always have this issue and the rest are OK.

I’m new here, so apologies in advance for sounding like a dummy.
I have a web app that uses the gpt-4o API and, like many of you here, I’m experiencing shocking responses where words and letters are missing.

Is someone able to help me understand something like this: will OpenAI fix it, or is it something I’ll need my backend developer to help me with?

import { Configuration, OpenAIApi } from "openai";
import dotenv from "dotenv";
import { IncomingMessage } from "http";
import { io } from "../index";
import { AnyArray } from "mongoose";
import Chat from "../models/chat";

dotenv.config();

/**
 * Takes in a prompt, base prompt, business prompt, and context, and returns a string.
 *
 * @param {string} prompt - The user's input prompt.
 * @param {string} basePrompt - The base prompt for the assistant
 * @param {string} businessPrompt - The business-specific information prompt from the user.
 * @param {any} context - The context for the conversation (chat history).
 * @returns {Promise<string>} - The response text generated by the OpenAI chat completions.
 */
const openAIService = async (
	room: string,
	chatId: string,
	prompt: string,
	basePrompt: string,
	businessPrompt: string,
	context: any,
) => {
	// Get the chat from the database
	const chat = await Chat.findById(chatId);
	if (!chat) {
		return;
	}

	console.log({ room, chatId, prompt, basePrompt, context });

	io.on("connection", (socket) => {
		socket.on("joinRoom", (chatId: string) => {
			socket.join(chatId);
			console.log(`User joined room ${chatId}`);
		});
	});

	const configuration = new Configuration({
		apiKey: process.env.OPENAI_API_KEY,
	});
	const openai = new OpenAIApi(configuration);

	const completion = await openai.createChatCompletion(
		{
			model: "gpt-4",
			messages: [
				{
					role: "system",
					content: `${basePrompt} \n\n Before we begin I would like to give you some background on my business: ${businessPrompt} \n\n Here are the previous chats: ${context}. \n\n Please respond using British English and all dollar references should be Australian Dollars not pounds.`,
				},
				{ role: "user", content: prompt },
			],
			// functions: [{ name: "get_response", parameters: schema }],
			// function_call: { name: "get_response" },
			temperature: 1, // Try configuring this differently.
			max_tokens: null,
			stream: true,
		},
		{ responseType: "stream" },
	);

	// const responseText: string = completion.choices[0].message.content;
	let response = "";

	const stream = completion.data as any;

	stream.on("data", (chunk: Buffer) => {
		const payloads = chunk.toString().split("\n\n");
		for (const payload of payloads) {
			const data = payload.split("data: ")[1];
			try {
				console.log(JSON.parse(data).choices[0].delta.content);
				const content = JSON.parse(data).choices[0].delta.content || " ";
				response += content;
				// console.log("Response:", response);
				console.log(content);
				// Emit each chunk of data as it arrives
				io.to(chatId).emit("openai_response", response);
			} catch (error) {
				console.error(`Error with JSON.parse and ${payload}.\n${error}`);
			}
		}
	});

	stream.on("end", () => {
		setTimeout(() => {
			console.log("Closing Stream");
			// Emit a custom event to signal the end of the stream
			io.to(chatId).emit("stream_end", response); // Send final response if needed
			//! Add the response to the database
			chat.messages.push({
				question: prompt,
				answer: response,
			});
			chat.save();
		}, 2500);
	});
};

export default openAIService;

Welcome to the community.

I would recommend this. Maybe point them to this thread.

Make sure your OpenAI library is up to date, etc.

Basically, the way streams are “chunked” has changed. No word on whether it will change back or what happened… yet. But it sounds like a lot of edge cases, as someone mentioned upthread… usually older libraries or hand-written streaming code…
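To show how hand-written streaming code breaks when chunking changes, here’s a toy demonstration (hypothetical payload, same split-on-"\n\n" approach as the snippet upthread): one SSE event delivered across two chunks produces a parse error on the first half and silently drops the second, which is exactly how letters go missing.

```javascript
// One event, now delivered in two chunks by the server.
const chunks = ['data: {"delta":"Hel', 'lo"}\n\n'];

let parsed = 0, parseErrors = 0, dropped = 0;
for (const chunk of chunks) {
  // Per-chunk splitting assumes every chunk holds complete events.
  for (const payload of chunk.split("\n\n")) {
    if (payload === "") continue;
    const data = payload.split("data: ")[1];
    if (data === undefined) { dropped++; continue; } // no prefix -> silently ignored
    try { JSON.parse(data); parsed++; }
    catch { parseErrors++; }                          // half a JSON object
  }
}
console.log({ parsed, parseErrors, dropped }); // { parsed: 0, parseErrors: 1, dropped: 1 }
```

Buffering across chunks (or letting an up-to-date official lib do it) avoids both failure modes.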

1 Like
