Error 500 for some prompts in 2025-02-07. Probably a tokenization issue

For the past 2 hours I’m getting the error InternalServerError: 500 Timed out generating response. Please try again with a shorter prompt or with max_tokens set to a lower value in the API for some specific prompts. In my debugging it seems like it’s related to the { or } character. It’s probably a tokenization issue.

The following prompt in the system message fails, for example:

Verifique se é possível responder ao usuário utilizando APENAS as informações fornecidas abaixo.

# Informações

[
 {
  "id": "qa_3661",
  "title": "Como baixar o XML das notas modelo 55 em massa?",
  "content": "Para realizar o download de todos os XML de notas modelo 55 emitidas pelo seu CNPJ é possível pelo seguinte caminho:\r\n\r\n1. Relatórios > Relatórios Fiscais > Baixar/Imprimir NF-e;\r\n2. Preencha os campos: Modelo, Mês e Filial; \r\n3. Após preencher os campos, clique no botão \"Baixar arquivos XML do mês\".\r\n\r\nObservação: Caso não tenha notas geradas não irá constar no arquivo.\r\n\r\nO arquivo leva em questão notas fiscais emitidas pelo CNPJ da filial tanto notas emitidas pelas Vendas ou em Compras, também notas canceladas levando o XML de cancelamento."
 }
]

## Formato de resposta

Sempre responda em formato JSON de acordo com esse schema:

{
 "analise": string, // Análise sobre as informações disponíveis acima e se é possível responder a pergunta do usuário ou não
 "eh_possivel_responder": boolean
}

But the following works:

Verifique se é possível responder ao usuário utilizando APENAS as informações fornecidas abaixo.

# Informações

[

  "id": "qa_3661",
  "title": "Como baixar o XML das notas modelo 55 em massa?",
  "content": "Para realizar o download de todos os XML de notas modelo 55 emitidas pelo seu CNPJ é possível pelo seguinte caminho:\r\n\r\n1. Relatórios > Relatórios Fiscais > Baixar/Imprimir NF-e;\r\n2. Preencha os campos: Modelo, Mês e Filial; \r\n3. Após preencher os campos, clique no botão \"Baixar arquivos XML do mês\".\r\n\r\nObservação: Caso não tenha notas geradas não irá constar no arquivo.\r\n\r\nO arquivo leva em questão notas fiscais emitidas pelo CNPJ da filial tanto notas emitidas pelas Vendas ou em Compras, também notas canceladas levando o XML de cancelamento."

]

## Formato de resposta

Sempre responda em formato JSON de acordo com esse schema:

{
 "analise": string, // Análise sobre as informações disponíveis acima e se é possível responder a pergunta do usuário ou não
 "eh_possivel_responder": boolean
}

Note that the only difference is that I have deleted the { and } .

I can replay the error in the playground too:

1 Like

Here is testing out how I would write this task, along with it being translated for the understanding of others:

developer:

You are an automated classifier and validator, reporting if a task is feasible. Check if it is possible to respond to the user using ONLY the information provided below. Don’t answer the user, perform the analysis and output the response format.

Information

{
“id”: “qa_3661”,
“title”: “How to download XML files of model 55 invoices in bulk?”,
“content”: “To download all XML files of model 55 invoices issued by your CNPJ, follow these steps:\r\n\r\n1. Reports > Fiscal Reports > Download/Print NF-e;\r\n2. Fill in the fields: Model, Month, and Branch;\r\n3. After filling in the fields, click the "Download XML files of the month" button.\r\n\r\nNote: If there are no generated invoices, they will not be included in the file.\r\n\r\nThe file includes invoices issued by the branch’s CNPJ, whether from Sales or Purchases, as well as canceled invoices including the cancellation XML.”
}

Response Format

Always respond in JSON format according to this schema:

{
“analysis”: string, // Analysis of the available information above and whether it is possible to answer the user’s question or not
“is_possible_to_respond”: boolean
}

Notice that the information you provided is not an array, but rather an object (dictionary). So the way you make it work reduces the quality.

However, sending exactly your “bad” system message, or mine, never produces the error.

Here are recommendations to try that might avoid an error in the AI producing a bad output:

  1. for classification, use top_p: 0.01. You do not want your answer to be “creative”, but instead the best quality.
  2. Do not send the system message alone. A user message also must actually describe the task.
  3. For the user message, place data in a container so it is not misunderstood:

Here is the user’s question for you to analyze:
“”"
{data}
“”"

Here is the information that must be used to answer, to check if answering from this is possible:
``` json
{more data}
```

A “500 Error” is usually that the AI produced an output that could not be parsed by the API backend. The recommendations will make the output more reliable.

Also, the fixed output format is a great case for using structured outputs, the “response_format” that can accept a JSON schema.

A change by OpenAI that breaks applications is always frustrating. You can make your AI application more robust.

Thanks for the prompting tips, but that is not really my full prompt, it’s just a small example so that the error could be replicated by devs for debugging. I was having these errors in production even though I didn’t change anything for days. Now the problem has been solved, and I didn’t change anything again. So it was trully a bug, but I guess not everyone was affected.