I have been looking through similar questions on the forum for some time, but I can’t find a solution to my problem.
We have developed an application for auditing phone calls after transcription. We send the following information to the OpenAI API (gpt-4o-2024-11-20):
The transcript.
30-40 questions that I want the assistant to answer about the transcript.
The assistant evaluates most of the questions correctly and recognizes both simple and complex concepts. My problem is that it sometimes seems to make up content that does not exist in the conversation. For example, it may tell me that the agent confirmed the customer received an SMS when there is nothing like that in the transcript.
Basically, it seems to hallucinate sometimes. I don’t know if it is because of the large input (about 5,000 tokens in, 2,000 tokens out) or because the assistant is in a “hurry” to answer.
The lower I set the assistant’s temperature (currently 0.02), the better it works, but I still can’t avoid some cases of hallucination.
I have tried telling the assistant in its instructions not to rush its answer, to base itself exclusively on the text of the transcript, and so on, but as I say, I can’t fix it.
Sorry for the long text. Can you think of any advice to improve the results? Is there a way to tell OpenAI that I am not in a hurry to get an answer? Do you think the 4o model is the best option?
Thank you very much to those of you who read this, and even more to those who answer.
However, the model does have a sense of “the response is getting too long, better finish”, or “better minimize the length of each of the 40 parts requested”.
So you might need to use multiple runs with fewer tokens being generated per run, under 1,000. This also prevents a long list of questions and a growing answer from distracting from the input document.
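A minimal sketch of what that batching could look like, assuming the official openai Python SDK; the batch size, prompt wording, and the audit_transcript helper are placeholders, not something from your application:

```python
# Sketch: audit the same transcript in several smaller calls,
# each with a limited number of questions and a modest output budget.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def audit_transcript(transcript: str, questions: list[str], batch_size: int = 10) -> list[str]:
    answers = []
    for start in range(0, len(questions), batch_size):
        batch = questions[start:start + batch_size]
        numbered = "\n".join(f"{start + i + 1}. {q}" for i, q in enumerate(batch))
        response = client.chat.completions.create(
            model="gpt-4o-2024-11-20",
            temperature=0,
            max_tokens=1000,  # keep each generation short
            messages=[
                {"role": "system",
                 "content": "Answer only from the transcript. If the transcript "
                            "does not mention something, say there is no evidence."},
                {"role": "user",
                 "content": f"Transcript:\n{transcript}\n\nQuestions:\n{numbered}"},
            ],
        )
        answers.append(response.choices[0].message.content)
    return answers
```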
When the AI chooses a word like “yes” or “no”, it is just predicting the certainty of one of those two tokens (or others). The further the output generation gets from the document and the questions, the more random that choice becomes.
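You can actually inspect that per-token certainty by requesting log probabilities from the chat completions endpoint. A rough sketch, again assuming the openai Python SDK; the prompt contents here are only an illustration:

```python
# Sketch: look at how certain the model is about a single "Yes"/"No" token.
import math
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-2024-11-20",
    temperature=0,
    max_tokens=1,
    logprobs=True,
    top_logprobs=5,
    messages=[
        {"role": "system", "content": "Answer with exactly one word: Yes or No."},
        {"role": "user", "content": "Transcript:\n...\n\nDid the agent confirm that an SMS was received?"},
    ],
)

# Each candidate token comes back with a log probability we can turn into a percentage.
for candidate in response.choices[0].logprobs.content[0].top_logprobs:
    print(f"{candidate.token!r}: {math.exp(candidate.logprob):.1%}")
```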
For an answer more grounded in reality, you can guide the questions: have the AI repeat the question it is currently answering, first produce a statement about where that information comes from, and only then answer. This can be structured in a response format:
[
  {
    "question_index": 1,
    "question": "In document, has customer received an SMS",
    "preliminary_reasoning": "The document doesn't seem to discuss text messages anywhere",
    "best_citation_from_document": null,
    "answer": "No"
  },
  {
    "question_index": 2,
    ...
Using a strict structured output schema, you can make only two enum strings possible for the answer, giving the AI one clear choice, with no distraction from ranking near-duplicates such as whether it should write “yes” or “Yes”.
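As a sketch of how such a strict schema could be sent with gpt-4o’s structured outputs (the field names mirror the example above; the schema name “call_audit” and the prompts are placeholders):

```python
# Sketch: a strict response schema where "answer" can only be "Yes" or "No".
from openai import OpenAI

client = OpenAI()

schema = {
    "type": "object",
    "properties": {
        "answers": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "question_index": {"type": "integer"},
                    "question": {"type": "string"},
                    "preliminary_reasoning": {"type": "string"},
                    "best_citation_from_document": {"type": ["string", "null"]},
                    "answer": {"type": "string", "enum": ["Yes", "No"]},
                },
                "required": ["question_index", "question", "preliminary_reasoning",
                             "best_citation_from_document", "answer"],
                "additionalProperties": False,
            },
        }
    },
    "required": ["answers"],
    "additionalProperties": False,
}

response = client.chat.completions.create(
    model="gpt-4o-2024-11-20",
    temperature=0,
    response_format={
        "type": "json_schema",
        "json_schema": {"name": "call_audit", "strict": True, "schema": schema},
    },
    messages=[
        {"role": "system", "content": "Answer each question strictly from the transcript."},
        {"role": "user", "content": "Transcript:\n...\n\nQuestions:\n1. Has the customer received an SMS?"},
    ],
)
print(response.choices[0].message.content)
```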
Yes. You can tell it to “chill”, at least somewhat.
Andrew Ng on DeepLearning.AI, Anthropic in their prompting guide, and certainly OpenAI in their guide for ChatGPT all tell you:
You may add things to your prompts like:
“Think Step by Step.”
“Take a deep breath and count to 10, then start.”
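For example, only as a sketch of wording to adapt to your own audit prompt:

```python
# Sketch: a system prompt that adds the "slow down" phrasing suggested above.
SYSTEM_PROMPT = (
    "You are auditing a call transcript. Take a deep breath and count to 10, "
    "then start. Think step by step, and base every answer exclusively on the "
    "transcript; if the transcript does not mention something, say so."
)
```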