API does not support UTF-8 encoding

Greetings to all,

I am currently working on a chatbot application in Flutter and I have encountered an issue with the encoding of responses received from the OpenAI API. When I make HTTP requests, some characters (‘é’, ‘à’, ‘où’, …) are displayed incorrectly. I have attempted to resolve this issue by adding the ‘charset=UTF-8’ parameter in the API configuration, but the result remains the same.
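
Here is a minimal sketch of the kind of request I am making (the API key, model, and prompt are placeholders; error handling omitted):

    import 'dart:convert';
    import 'package:http/http.dart' as http;

    Future<void> main() async {
      final response = await http.post(
        Uri.parse('https://api.openai.com/v1/chat/completions'),
        headers: {
          'Authorization': 'Bearer YOUR_API_KEY', // placeholder
          'Content-Type': 'application/json; charset=UTF-8',
        },
        body: jsonEncode({
          'model': 'gpt-3.5-turbo',
          'messages': [
            {'role': 'user', 'content': 'Qui est une célébrité française ?'}
          ],
        }),
      );
      // The accented characters in response.body come out garbled here.
      print(json.decode(response.body));
    }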

Hi @darrelx

What exactly do you mean above?

Displayed where? On an HTML page? In the terminal?

🙂

For a French response, instead of “célébrité” or “répondre à”, I get “cà © lèbrèbres” or “rà © pondre Ô”.
Displayed in my terminal. I’m using the latest version of PowerShell, so I don’t think the terminal lacks support for these characters. And please excuse me in advance if my English or my wording is not clear.

I’m having the same kinds of problems. It looks like there are often incorrectly encoded characters in the data sent back from the API, perhaps due to mixing together bits of text from different sources with slightly different or inconsistent encodings?

Thank you all for helping me. I solved my problem. The solution was to use “response.bodyBytes” instead of “response.body”. bodyBytes holds the raw UTF-8 encoded bytes, so just decode them and access the message content.
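
In code, that looks roughly like this (field names from the chat completions response; error handling omitted):

    // Requires: import 'dart:convert';
    // response is the http.Response from the API call.
    final data = json.decode(utf8.decode(response.bodyBytes));
    final content = data['choices'][0]['message']['content'];
    print(content); // accented characters now display correctly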

In Dart you can also resolve this with utf8.decode(<the text>.codeUnits);
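
For example, this repairs a string that was already decoded with the wrong charset (a self-contained sketch; the garbled literal is what UTF-8 bytes look like when read as Latin-1):

    import 'dart:convert';

    void main() {
      // 'célébrité' mis-decoded as Latin-1 shows up like this:
      final garbled = 'cÃ©lÃ©britÃ©';
      // codeUnits returns the original byte values; decode them as UTF-8.
      print(utf8.decode(garbled.codeUnits)); // célébrité
    }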

But I agree that it used to work all the time for me with the raw response from the OpenAI API directly encoded in UTF-8…

Thanks, this solved my problem.

I searched Google and the OpenAI API (Python) but could not find response.bodyBytes. Could you share the details of how to fix this problem?

    // Dart (package:http): bodyBytes holds the raw response bytes;
    // decode them as UTF-8 before parsing the JSON.
    var temp = json.decode(utf8.decode(response.bodyBytes));
    print(temp);

Similar to @badpaybad’s approach, for Python developers, I fixed it with a simple re-encode/decode (treat the mis-decoded text as Latin-1 and decode the resulting bytes as UTF-8):

response.choices[0].message.content.encode("latin-1").decode("utf-8")

I’m using the OpenAI Node.js API but still have text-encoding problems, e.g. “K%C3%A4ngor” instead of “Kängor”, but also “f\u00f6r” instead of “för”. So there are different kinds of wrong encoding: the first looks like percent-encoded UTF-8 and the second like a JSON Unicode escape (is the LLM generating these on purpose?).

Is going “raw” and using response.bodyBytes the only solution?

Switching to gpt-4-turbo-preview (from gpt-4-1106-preview) solved it for me.