What to do if OpenAI keeps misunderstanding my prompt?

I am working with audio files from phone conversations and using Google Speech-to-Text to convert them into text. Each line of the transcript contains the sentence index, the speaker, the start time, and the spoken text. For example:

1, Speaker1, 0, Uh, yeah, so, I was thinking, um, maybe we could, like, uh, you know, start the project, uh, next week?
2, Speaker2, 6, Um, yeah, I guess, but, uh, do we have, like, all the resources we need, or, um, are we, like, still missing some stuff?

I want to use OpenAI to clean up this content: remove filler words like “uh,” “um,” and “like,” improve the grammar, and correct spelling mistakes. The result I am expecting looks like this:

1, Speaker1, 0, I was thinking maybe we could start the project next week?
2, Speaker2, 6, I guess, but do we have all the resources we need, or are we still missing some stuff?

Because of the output length limit (max tokens), I send the entire conversation to OpenAI but ask it to return only a specific part, for example: “Please return lines 1 to 50.” Initially this worked well, but now it sometimes returns only lines 1 to 20, which causes errors in my program.

When OpenAI returns the wrong range, I have tried resending the request with a correction, such as: “You are returning the wrong results. Please return lines 1 to 50.” However, resending the request is not always effective.
I am using the Chat Completions API (https://platform.openai.com/docs/api-reference/chat/create) with the gpt-4o model.
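
For reference, here is a simplified version of the kind of request I am making (the system prompt below is only an illustration, not my exact wording):

```python
from openai import OpenAI

client = OpenAI()

# Numbered transcript lines from Google Speech-to-Text (shortened here).
transcript = "\n".join([
    "1, Speaker1, 0, Uh, yeah, so, I was thinking, um, maybe we could, like, uh, you know, start the project, uh, next week?",
    "2, Speaker2, 6, Um, yeah, I guess, but, uh, do we have, like, all the resources we need, or, um, are we, like, still missing some stuff?",
    # ... remaining lines of the conversation
])

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": (
                "You clean up call transcripts. Remove filler words such as 'uh', 'um', "
                "and 'like', fix grammar and spelling, and keep the "
                "'index, speaker, time, text' format of every line."
            ),
        },
        {"role": "user", "content": f"{transcript}\n\nPlease return lines 1 to 50."},
    ],
)

print(response.choices[0].message.content)
```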

How can I ensure that OpenAI always returns the exact number of lines requested? Or is there another approach to ensure the text is edited as desired?

Welcome to the community!

That’s kinda tough, because

  1. the models suck at counting (but since your lines already carry numbers, that shouldn’t be the issue here)
  2. the models are generally lazy, and will try to cut the response short somehow if it gets too long

I’d do it like this:

  1. ask for a JSON response. If you only get 20 items back, request the next 30, and so on until you’re done (see the first sketch after this list).
  2. consider generally asking for smaller chunks
  3. if you ask for a JSON array, you can use logit bias (https://platform.openai.com/docs/api-reference/chat/create#chat-create-logit_bias) to suppress the tokens that end the array (“}\n”, “]”) and boost the token that continues it (“},”). This can sometimes overcome the laziness of the models, although if you push it too far you might start seeing other artifacts. The first sketch below includes this.
  4. you could consider using Structured Outputs and requiring indices 1-50 to be part of the schema (second sketch below). I wouldn’t do this, but it could be a desperation option.
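
Here’s a rough sketch of how points 1 to 3 could fit together with the Python SDK. Everything in it (the {"lines": [...]} wrapper, the chunk size, the bias values, the prompt wording) is just an assumption to illustrate the idea, so treat it as a starting point to tune rather than a finished implementation:

```python
import json

import tiktoken
from openai import OpenAI

client = OpenAI()

# Token IDs are tokenizer-specific, so look them up instead of hard-coding them.
# Keep the values mild: a -100 on "]" would make it impossible to ever close the array.
enc = tiktoken.encoding_for_model("gpt-4o")
logit_bias = {}
for text, weight in [("]", -5), ("}\n", -5), ("},", 2)]:
    for token_id in enc.encode(text):
        logit_bias[str(token_id)] = weight


def clean_chunk(transcript: str, start: int, end: int) -> list[dict]:
    """Ask for one chunk of cleaned lines as a JSON object wrapping an array."""
    response = client.chat.completions.create(
        model="gpt-4o",
        response_format={"type": "json_object"},
        logit_bias=logit_bias,
        messages=[
            {
                "role": "system",
                "content": (
                    "Clean the transcript: remove filler words, fix grammar and spelling. "
                    'Reply with JSON like {"lines": [{"index": 1, "speaker": "Speaker1", '
                    '"time": 0, "text": "..."}]}.'
                ),
            },
            {"role": "user", "content": f"{transcript}\n\nReturn lines {start} to {end}."},
        ],
    )
    return json.loads(response.choices[0].message.content)["lines"]


def clean_all(transcript: str, total_lines: int, chunk_size: int = 30) -> list[dict]:
    """Keep requesting the next missing range until every line has come back."""
    cleaned: list[dict] = []
    next_index = 1
    while next_index <= total_lines:
        end = min(next_index + chunk_size - 1, total_lines)
        chunk = clean_chunk(transcript, next_index, end)
        if not chunk:
            raise RuntimeError(f"Model returned nothing starting at line {next_index}")
        cleaned.extend(chunk)
        # Resume after the last line the model actually produced,
        # even if it stopped short of the range we asked for.
        next_index = cleaned[-1]["index"] + 1
    return cleaned
```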
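
And for completeness, here’s what the desperation option (4) could look like with Structured Outputs: a strict schema with one required string property per line index, so the model can’t close the object without emitting all 50 entries. This needs a gpt-4o snapshot that supports Structured Outputs, and the property names and prompt here are made up for the example:

```python
from openai import OpenAI

client = OpenAI()

transcript = "..."  # the numbered conversation, same as above

# One required property per line index; strict mode forces the model
# to emit every key before it can finish the object.
properties = {
    f"line_{i}": {"type": "string", "description": f"Cleaned text of line {i}"}
    for i in range(1, 51)
}

response = client.chat.completions.create(
    model="gpt-4o",
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "cleaned_lines",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": properties,
                "required": list(properties.keys()),
                "additionalProperties": False,
            },
        },
    },
    messages=[
        {
            "role": "system",
            "content": "Clean the transcript: remove filler words, fix grammar and spelling.",
        },
        {"role": "user", "content": f"{transcript}\n\nReturn lines 1 to 50."},
    ],
)

print(response.choices[0].message.content)
```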

Thank you. I will try the methods you shared with me.