I’ve tested tool calling functionality with Korean text and discovered that LLMs sometimes misinterpret data when it’s returned through tool calling, while the same content works perfectly in English.
My testing showed:
- Korean text via tool calling: frequent misinterpretations
- English text via tool calling: consistently correct interpretation
- Direct prompt inclusion: works correctly in both languages
- JSON format helps but doesn’t fully resolve the Korean issue (see the serialization sketch below)
Has anyone experienced similar issues with other non-Latin languages (Japanese, Chinese, Arabic, Russian, etc.)? Are there any recommended workarounds or best practices for multilingual tool calling?
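For reference, by “JSON format” I mean serializing the tool result before handing it back to the model. A minimal sketch of the kind of payload I tested (the `menu_item` fields are made up); dumping with `ensure_ascii=True` escapes the Korean into `\uXXXX` sequences, which at least separates byte-level encoding problems from model-level misreading:

```python
import json

# Hypothetical tool payload for illustration.
menu_item = {"name": "떡볶이", "price": 4000}

# Raw Korean in the tool result (what the model sees by default):
raw = json.dumps(menu_item, ensure_ascii=False)
# -> {"name": "떡볶이", "price": 4000}

# ASCII-escaped variant: every Korean character becomes a \uXXXX
# sequence, which rules out byte-level encoding issues in transport.
escaped = json.dumps(menu_item, ensure_ascii=True)
# -> {"name": "\ub5a1\ubcf6\uc774", "price": 4000}

print(raw)
print(escaped)
```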
Are you using Korean names for the functions (and descriptions) too, or only asking it to return the data in Korean?
From my experience, anything other than a simple conversation flow gives iffy results, and English is my only language. If you’re able to prompt the LLM in English, you may get better results. You should still be able to process Korean text even when the prompts themselves are in English.
Could it be a character-encoding issue? I had problems with unescaped curly braces in my instructions, which also affected the JSON responses that you can enforce using a json_schema.
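Roughly what I mean by enforcing the response shape with a json_schema, as a sketch only (model name and schema are placeholders):

```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model
    messages=[{"role": "user", "content": "Summarize the menu item."}],
    # Strict schema: the model cannot emit malformed JSON, so stray
    # braces elsewhere can't corrupt the output structure.
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "menu_summary",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {"name": {"type": "string"}},
                "required": ["name"],
                "additionalProperties": False,
            },
        },
    },
)
print(response.choices[0].message.content)
```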
The descriptions and names of the tools are written in English.
Only the data is in Korean.
For example, if the data includes “떡볶이” (tteokbokki), the model reads it back as 덮볶이, 도돌이, and so on.
Are you referring to using json_schema to pass the tool’s response?
I’m currently using LangGraph’s reactAgent, so it automatically handles the tool responses as ToolMessages.
The prompts are all in English, with some Korean in the tool responses.
The required data is fetched by the tool call, processed, and returned, and that is the point where the LLM misreads it.
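Here is roughly what the setup looks like, reduced to a minimal sketch (the tool body and model are placeholders for my real fetch-and-process logic):

```python
import json

from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

# Placeholder tool: English name and description, Korean only in the
# returned data, mirroring my real setup.
@tool
def get_menu_item(item_id: str) -> str:
    """Fetch a menu item by id and return it as JSON."""
    return json.dumps({"id": item_id, "name": "떡볶이"}, ensure_ascii=False)

agent = create_react_agent(ChatOpenAI(model="gpt-4o-mini"), tools=[get_menu_item])

# The prebuilt agent wraps the tool output in a ToolMessage for me;
# the misreading shows up in the final assistant message.
result = agent.invoke({"messages": [("user", "What is menu item 42 called?")]})
print(result["messages"][-1].content)
```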
Interesting. My first thought was that having these written in English would already give better results.
Since you mentioned that you are using reactAgent, it might be due to how it deals with data internally.
Perhaps a customized approach using the API directly would give you more detailed control to debug what is happening.
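Something along these lines would let you log the exact tool payload the model receives, with no framework in between (the function name and data are made up for the sketch):

```python
import json
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "get_menu_item",
        "description": "Fetch a menu item by id.",
        "parameters": {
            "type": "object",
            "properties": {"item_id": {"type": "string"}},
            "required": ["item_id"],
        },
    },
}]

messages = [{"role": "user", "content": "What is menu item 42 called?"}]
first = client.chat.completions.create(
    model="gpt-4o-mini", messages=messages, tools=tools, tool_choice="required"
)
call = first.choices[0].message.tool_calls[0]

# Inspect exactly what goes back to the model: this is the step a
# framework would otherwise transform or re-encode behind your back.
tool_result = json.dumps({"name": "떡볶이"}, ensure_ascii=False)
print(repr(tool_result))

messages.append(first.choices[0].message)
messages.append({"role": "tool", "tool_call_id": call.id, "content": tool_result})

second = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)
print(second.choices[0].message.content)
```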
You could also try removing all Korean characters and testing your setup in English only, ideally with no special characters at all, just letters. If the error persists, you can at least rule that out as a cause.
I, for example, also had issues almost exclusively when assistants called tools for the first time in a new chat: the OpenAI API server failed easily half of the time with a generic “Server error - Sorry, something went wrong” message. In my case, it was mainly because of the unescaped curly braces I had in my instructions. But I can’t say where in the chain of responsibilities the error finally occurred, because so many things depend on each other, especially when tools are called.
And by JSON response, I meant the final response the assistant generates, which can be a JSON object enriched with additional data for internal use. There are two ways to make the API response a JSON object, and neither of them was working because of the curly braces.
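I can’t say whether this is what happened in your stack, but one layer where unescaped braces commonly break things is a prompt template, where literal braces must be doubled. A small sketch, assuming a LangChain-style template:

```python
from langchain_core.prompts import PromptTemplate

# Single braces are parsed as template variables; literal braces in
# instructions (e.g. a JSON example) must be written as {{ and }}.
template = PromptTemplate.from_template(
    'Reply as JSON, e.g. {{"name": "..."}}. Item: {item}'
)
print(template.format(item="떡볶이"))
# -> Reply as JSON, e.g. {"name": "..."}. Item: 떡볶이
```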