Incorrect Enum serialization during tool calling — garbled symbols (`\u0000...`)

whoiam1101 · October 20, 2025, 11:26am

When using tool calling with an Enum type, all properties and enum values written in Cyrillic produce corrupted symbols like \u0000... or \x00... in the generated object.
The issue does not occur when Latin characters are used instead. Cyrillic strings should be serialized and deserialized correctly without corruption. Instead of readable Cyrillic text, the resulting object contains garbled sequences with null bytes.

Steps to reproduce

Define an Enum where member names or values use Cyrillic letters.
Use this Enum within a tool calling function.
Inspect the generated or serialized result.

Possible cause

Likely an encoding mismatch during serialization/deserialization (utf-8 vs utf-16 / utf-32) or improper string handling during JSON generation.

Suggestion

Ensure consistent UTF-8 encoding across the pipeline and, if applicable, use:

json.dumps(obj, ensure_ascii=False)

Topic		Replies	Views
Realtime tools: enum not enforced; invalid strings slip through Bugs api , function-calling , structured-output , realtime , api-realtime	2	158	October 19, 2025
Bug: Unicode should not be escaped in multiple tool call results Bugs function-calling , tools	0	683	June 12, 2024
GPT-4o returning malformed Unicode like \u0000e6 instead of æ — encoding bug? Bugs	1	509	July 25, 2025
Tool Calling Issues with Non-Latin Languages Bugs	8	457	May 9, 2025
[SERVER-SIDE ISSUE] High token cost for non-Latin characters in structured output descriptions Bugs api , structured-output	1	153	September 10, 2025

Incorrect Enum serialization during tool calling — garbled symbols (`\u0000...`)

Steps to reproduce

Possible cause

Suggestion

Related topics