I am using the Responses API with gpt-4.1, but I usually get responses like the following:
{"products": [{"Eans": ["3614220449517"], "Name": "Chlo\u001e Signature Eau de Toilette", "LabelText": "Chlo\u001e Signature Eau de Toilette er en delikat og feminin duft som kombinerer friske og blomsteraktige noter for \u00005 skape en tidl\u0000f8s eleganse. \nBruk: Spray p\u00005 h\u0000e5ndledd, hals og andre pulspunkter. Unng\u0000e5 \u0000e5 gni inn parfymen for \u0000e5 bevare duftens integritet.", "UsedUrls": ["https://www.chloe.com/us/fragrance_cod46467854fe.html"], "ProductId": "3098228", "ProductName": "test", "LongDescription": "<p>Chlo\u0000e9 Signature Eau de Toilette er en delikat og feminin duft som kombinerer friske og blomsteraktige noter for \u00005 skape en tidl\u0000f8s eleganse. Denne duften \u00005pner med toppnoter av bergamott, sitron og magnolia, som gir en frisk og livlig start. Hjertenotene best\u0000e5r av gardenia og rose, som tilf\u0000f8rer en myk og romantisk karakter. Basenotene av musk og bomullsblomst gir en varm og behagelig avslutning.</p><p>Duftnoter:</p><ul><li>Toppnote: Bergamott, sitron, magnolia</li><li>Hjertenote: Gardenia, rose</li><li>Bunnote: Musk, bomullsblomst</li></ul>", "ShortDescription": "En delikat og feminin duft med friske og blomsteraktige noter."}, {"Eans": ["3616303459512"], "Name": "Chlo\u001e Pouch", "LabelText": "Chlo\u0000e9 Pouch er en lekker pouch fra Chlo\u0000e9 i et lekkert brunt design. \nDenne praktiske og stilige vesken er ideell for \u00005 oppbevare sm\u0000e5ting som telefon, n\u0000f8kler og kort, og er perfekt for b\u0000e5de hverdagsbruk og spesielle anledninger.", "UsedUrls": ["https://www.chloe.com/us/chloe/shop-online/women/pouches"], "ProductId": "3300514", "ProductName": "Chlo\u001e Pouch", "LongDescription": "<p>Chlo\u0000e9 Pouch er en lekker pouch fra Chlo\u0000e9 i et lekkert brunt design. 
Denne praktiske og stilige vesken er ideell for \u00005 oppbevare sm\u0000e5ting som telefon, n\u0000f8kler og kort, og er perfekt for b\u0000e5de hverdagsbruk og spesielle anledninger.</p>", "ShortDescription": "En elegant og praktisk brun pouch fra Chlo\u0000e9, perfekt for sm\u0000e5ting."}]}
This is a structured response, namely a JSON schema response.
Why does the API return the JSON content in this encoding? The weirdest part is that the Unicode escapes make no sense for characters like å, ø, and é, so I cannot fix it with manual replace logic.
What would be the proper solution to this?
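To illustrate why manual replacement only gets partway: in the output above, å, ø, and é sometimes appear as \u0000 followed by what looks like the two hex digits of the intended code point (\u0000e5, \u0000f8, \u0000e9), but elsewhere as \u00005 or \u001e, where that information is gone. A minimal Python sketch, assuming the first pattern really does mean "NUL plus a Latin-1 hex code" (that reading is a guess from this one sample, and repair_nul_prefixed is a hypothetical helper name):

```python
import re

# Assumption: NUL followed by exactly two hex digits encodes one Latin-1
# code point (NUL + "e5" -> "å"). This is inferred from the sample only.
_NUL_HEX = re.compile(r"\x00([0-9a-fA-F]{2})")

def repair_nul_prefixed(text: str) -> str:
    """Replace NUL + two hex digits with the character they appear to encode."""
    return _NUL_HEX.sub(lambda m: chr(int(m.group(1), 16)), text)

# Strings as they look after JSON parsing (\u0000 has become a NUL character).
recovered = repair_nul_prefixed("tidl\x00f8s eleganse")
lossy_nul = repair_nul_prefixed("for \x005 skape")
lossy_1e = repair_nul_prefixed("Chlo\x1e Signature")

print(recovered)        # tidløs eleganse
print(repr(lossy_nul))  # unchanged: only one digit survived
print(repr(lossy_1e))   # unchanged: \u001e carries no hint of "é"
```

The heuristic can also misfire when a lossy NUL + "5" happens to be followed by a letter a-f, so it is not a safe general fix.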
Have you tried Unicode/charset conversion of the returned and parsed JSON in the programming language you use? I had similar output (mostly in French), and it was a breeze to solve in the parsing code.
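As a quick check of what parsing alone does, here is a small Python sketch using the standard json module. The two input strings are shortened examples modeled on the output in the question, not real API responses. A well-formed escape decodes to the accented character, while the malformed one decodes to a NUL character followed by literal text, so no charset conversion afterwards can bring the letter back:

```python
import json

# Well-formed JSON escape for "é" (U+00E9).
well_formed = '{"Name": "Chlo\\u00e9"}'
# Malformed variant seen in the question: \u0000 followed by the text "e9".
malformed = '{"Name": "Chlo\\u0000e9"}'

decoded_ok = json.loads(well_formed)["Name"]
decoded_bad = json.loads(malformed)["Name"]

print(decoded_ok)         # Chloé
print(repr(decoded_bad))  # 'Chlo\x00e9'
```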
While this is symptomatic of the gpt-4.1 model, I propose tackling it head-on: give the schema a name that itself indicates the structured output response's purpose and character set.
{
  "name": "api_utf_8_json",
  "schema": {
    "type": "object",
    "properties": {
      "products": {
        "type": "array",
        "description": "A list of products.",
        "items": {
          "type": "object",
          "properties": {
            "Eans": {
              "type": "array",
              "description": "List of EANs for the product.",
              "items": {
                "type": "string"
              }
            },
            "Name": {
              "type": "string",
              "description": "The name of the product."
            },
            "LabelText": {
              "type": "string",
              "description": "Label text providing details about the product."
            },
            "UsedUrls": {
              "type": "array",
              "description": "List of URLs associated with the product.",
              "items": {
                "type": "string"
              }
            },
            "ProductId": {
              "type": "string",
              "description": "Unique identifier for the product."
            },
            "ProductName": {
              "type": "string",
              "description": "The product's name."
            },
            "LongDescription": {
              "type": "string",
              "description": "A detailed description of the product."
            },
            "ShortDescription": {
              "type": "string",
              "description": "A brief description of the product."
            }
          },
          "required": [
            "Eans",
            "Name",
            "LabelText",
            "UsedUrls",
            "ProductId",
            "ProductName",
            "LongDescription",
            "ShortDescription"
          ],
          "additionalProperties": false
        }
      }
    },
    "required": [
      "products"
    ],
    "additionalProperties": false
  },
  "strict": true
}
Changing the schema name did not help either. The weirdest thing now is that it returns almost the same escape, \u001E, for all the non-English characters, so it is impossible to replace them with ordinary programming. I have seen some people suggest using mini/nano models to fix the broken text, which I really don't like.
And sometimes it even goes completely off the rails with irrelevant output like this:
{"api_utf_8_json": [{"Eans": ["3607346232385"], "Name": "Chlo\u001e Eau De Parfum For Women 75ml", "Keywords": ["parfyme", "eau de parfum", "blomsterduft", "rose", "magnolia", "liljekonvall", "amber", "sedertre", "Chlo\u001e", "3607346232385"], "LabelText": "Chlo\u001e Eau De Parfum er en tidl\u001ds o
g feminin duft som kombinerer friske blomster med varme undertoner. Toppnoter av peon, litchi og fresia gir en lett og innbydende \u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001
d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u
001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001
d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u
001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001
d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u
001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\u001d\...
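Since the corrupted characters cannot be mapped back reliably, one option that avoids a second model is to detect them and reject the response (for example, to retry the request). Below is a minimal Python sketch under that assumption. find_corrupted_strings is a hypothetical helper, and the sample string is a shortened stand-in for a complete response. It only flags C0 control characters in parsed string values and does not repair anything. A response truncated mid-string, like the one above, would already fail in json.loads.

```python
import json
import re

# C0 control characters other than tab, newline, and carriage return.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")

def find_corrupted_strings(value, path="$"):
    """Yield (path, string) for every string value containing a control character."""
    if isinstance(value, str):
        if _CONTROL_CHARS.search(value):
            yield path, value
    elif isinstance(value, list):
        for i, item in enumerate(value):
            yield from find_corrupted_strings(item, f"{path}[{i}]")
    elif isinstance(value, dict):
        for key, item in value.items():
            yield from find_corrupted_strings(item, f"{path}.{key}")

# Shortened sample in the shape of the responses shown in this thread.
sample = '{"products": [{"Name": "Chlo\\u001e Pouch", "ProductId": "3300514"}]}'
corrupted = list(find_corrupted_strings(json.loads(sample)))

for where, text in corrupted:
    print(where, repr(text))
```

An empty result means no control characters were found. It does not prove the text is otherwise correct.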