Hello guys, does using functions have any limits? Or any side effects?
Hello guys,
The bug is still there. Similar issue for the Spanish language. Is any solution in sight?
Hi there,
Same bug for French language (which is full of accentuated characters).
Just to add some feedback: the model rarely send us the good ascii data (not a valid representation of the underlying data): we often get â\nâ or even â\ufffdâ - which is the replacement char (ïżœ) in ascii - instead of accentuated chars. So json handling librairies do not change anything.
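To illustrate why a JSON library cannot help once a replacement character shows up, here is a minimal sketch. The byte string is an invented example, not actual model output: it only assumes that an invalid UTF-8 sequence was decoded leniently somewhere upstream.

```python
import json

# Invented example: "é" is 0xC3 0xA9 in UTF-8. Dropping the second byte
# simulates a truncated (invalid) sequence.
broken = b"caf\xc3"
decoded = broken.decode("utf-8", errors="replace")
print(repr(decoded))  # 'caf\ufffd'

# JSON round-trips U+FFFD faithfully, so the lost "é" cannot come back.
payload = json.dumps({"word": decoded})
restored = json.loads(payload)["word"]
print(restored == decoded)  # True
```

Once U+FFFD is in the string, the information about the original character is gone, so no client-side decoding step can restore it.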
Hope we get a fix soon!
Thanks
Thank you for all the additional details everyone shared regarding this issue.
We’ve rolled out a fix that addresses escaped unicode characters, to bring output formatting more in line with our previous models.
However, we understand that there are additional issues raised in this discussion regarding incorrect unicode characters or invalid unicode characters being produced by the model. Unfortunately, we do not anticipate this new fix to fully address these issues. We are still working on it, alongside other model improvements, but can’t provide a timeline at this time.
@enoch Thank you for your efforts. However, you have fundamentally failed to address the actual issue at hand, for the following reasons:
The central concern of everyone in this thread has always been the issue of incorrect Unicode characters being produced. In reality, nobody gets stuck just because of Unicode escape decoding. Any JSON parsing library can handle the \uXXXX escape sequences without any problem. It seems your fix merely adds a middleware layer that transforms the GPT model’s output into raw UTF-8 format instead of the \uXXXX format, which doesn’t solve the core issue at all. We could do this translation step on our own. What matters is that the characters output by the model haven’t changed.
Indeed, my previous post includes a complete script to reproduce the issue, and the results of running it now are the same as before, replete with erroneous outputs. If this issue remains unsolved, the function/tool calling feature is virtually unusable for non-English languages that include non-ASCII Unicode characters. This has been the same issue from the beginning and the one that people care about, not two separate problems of which you’ve only addressed one.
This is a serious issue, and it deserves your full attention. I suggest that the banner alerting users to this bug be reinstated on the function calling documentation page to ensure that people are aware there’s still an unignorable issue at hand, and then work to resolve it as promptly as possible.
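The point about escape sequences can be checked with a short sketch. The JSON strings below are made-up examples, not captured API output: the escaped and raw forms parse to the same value, while wrong characters survive parsing unchanged.

```python
import json

escaped = '{"word": "Proc\\u00e9dure"}'  # \uXXXX escape form
raw = '{"word": "Procédure"}'            # raw UTF-8 form

same = json.loads(escaped) == json.loads(raw)
print(same)  # True

# If the wrong code points are emitted, parsing just reproduces them.
mangled = '{"word": "Proc\\u00c3\\u00a9dure"}'
mangled_word = json.loads(mangled)["word"]
print(mangled_word)  # ProcÃ©dure
```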
The fix seems to have made it worse: the 1106 model now produces non-UTF-8 output, such as “ĂÂÔö…”. The issues started around 18:00 UTC on 12/1.
+1, it was messing up UTF-8 output on emojis every now and again, but since yesterday it’s producing stuff like this consistently: ð\x9F\x94\x84.
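For what it is worth, that sequence is exactly what a 4-byte emoji looks like when its UTF-8 bytes are read one byte per character. A small sketch, assuming (this is a guess, not something confirmed in the thread) that the emoji was U+1F504:

```python
emoji = "\U0001F504"  # assumed emoji: 🔄
utf8_bytes = emoji.encode("utf-8")
print(utf8_bytes)  # b'\xf0\x9f\x94\x84'

# Reading the same bytes as Latin-1 gives one character per byte.
mangled = utf8_bytes.decode("latin-1")
print(repr(mangled))  # 'ð\x9f\x94\x84'

# The damage is reversible as long as no byte was dropped or replaced.
recovered = mangled.encode("latin-1").decode("utf-8")
print(recovered == emoji)  # True
```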
Now it is even worse! Now both gpt-3.5-turbo-1106 and gpt-4-1106-preview models generate incorrect results.
Chiming in to report that I’m also experiencing server-side encoding issues, which started recently (less than 7 days ago), on gpt-3.5-turbo-1106 with the function tool.
I’m using the chat completion endpoint to ask the model to call a function with a single value argument from an enum. One of the values in the enum is “Procédure”; note the accent on the “é”.
Debug logs confirm that the request JSON is properly formatted.
The model replies with a function call with the following argument: “ProcÃ©dure”. Note that the “é” has been mangled and is now expressed as two consecutive Unicode characters, “Ã” and “©”.
This behavior is consistent with the “é” character, expressed as 2 bytes in UTF-8, namely 0xC3 and 0xA9, being interpreted as ISO 8859-1 (Latin-1) rather than UTF-8, as these two bytes correspond to “Ã” and “©” in that table.
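The byte-level explanation can be reproduced locally. This is only a sketch of the suspected mechanism, not a statement about what actually happens on the server:

```python
original = "é"
utf8_bytes = original.encode("utf-8")
print([hex(b) for b in utf8_bytes])  # ['0xc3', '0xa9']

# Decode each byte on its own as ISO 8859-1 (Latin-1).
as_latin1 = [bytes([b]).decode("latin-1") for b in utf8_bytes]
print(as_latin1)  # ['Ã', '©']

# The same misreading applied to the whole enum value.
mangled_word = "Procédure".encode("utf-8").decode("latin-1")
print(mangled_word)  # ProcÃ©dure
```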
Hi, as @juria rightly pointed out, for at least 2 days (that is when my project stopped working properly) the problem has also affected the gpt-3.5-turbo-1106 model with the function tool. Since then, 70-80% of queries end with a 500 error (and, what’s worse, a query fee is charged nonetheless; I don’t know whether this is due to encoding, so I mention it as a curiosity). From my perspective, using functions with this model has become pointless (at least for non-English languages, although note that emojis are also encoded badly). I created a simple Python function that corrects the badly encoded return values; maybe it will be useful to someone until OpenAI fixes the problem:
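    import json

    def fix_bad_encoding(text: str) -> str:
        # Repair each space-separated word on its own, since some words are already correct.
        words = text.split(' ')
        for i in range(len(words)):
            # Repeat until the word stops changing: some words are mis-encoded more than once.
            while True:
                try:
                    new_word = words[i].encode('latin1').decode('utf8')
                    if new_word == words[i]:
                        break
                    words[i] = new_word
                except UnicodeError:
                    # The word is not (or is no longer) mis-encoded; leave it as is.
                    break
        return ' '.join(words)

    # [...]
    function_args = json.loads(fix_bad_encoding(tool_call.function.arguments))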
For “ProcÃ©dure” it returns “Procédure”,
and for “CzyÅ¼byÅ planowaÅ podjÄÄ takie dziaÅania?” (an example taken from a returned function argument) it returns “Czyżbyś planował podjąć takie działania?”.
Some simplified information: I don’t know for which languages this is able to correct the problem. The function splits the text and iterates over the words (separated by spaces) because some words are already correct, and sentences containing them raised an error in my case when processed as a whole. Also, each word is processed in a loop because some words turned out to be “doubly wrongly encoded”, and a single pass did not fully repair them.
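A possible variant of the same idea, in case it helps: repair the strings after `json.loads` instead of before, so the JSON syntax itself is never touched and any \uXXXX escapes are already resolved. This is only a sketch. `repair_str` and `repair_value` are hypothetical helper names, the sample data is simulated rather than real API output, and a legitimate string that happens to look like mojibake (for example a literal “Ã©”) would be changed too.

```python
import json
from typing import Any

def repair_str(text: str, max_passes: int = 3) -> str:
    """Undo up to max_passes layers of UTF-8 text misread as Latin-1."""
    for _ in range(max_passes):
        try:
            candidate = text.encode("latin-1").decode("utf-8")
        except UnicodeError:
            break  # not (or no longer) mis-encoded
        if candidate == text:
            break  # plain ASCII, nothing to do
        text = candidate
    return text

def repair_value(value: Any) -> Any:
    """Walk parsed JSON and repair every string, word by word."""
    if isinstance(value, str):
        return " ".join(repair_str(word) for word in value.split(" "))
    if isinstance(value, list):
        return [repair_value(item) for item in value]
    if isinstance(value, dict):
        return {repair_value(k): repair_value(v) for k, v in value.items()}
    return value

# Simulated input: one value mangled once, one mangled twice.
once = "Procédure".encode("utf-8").decode("latin-1")
twice = once.encode("utf-8").decode("latin-1")
raw_arguments = json.dumps({"kind": twice, "tags": [once, "normal"]})

fixed = repair_value(json.loads(raw_arguments))
print(fixed)  # {'kind': 'Procédure', 'tags': ['Procédure', 'normal']}
```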
I hope the latest defective update will be fixed soon.
Same observation here for any -1106 model; French chars with diacritics end up corrupted by the model: bœuf / pérégrination / désaccord.
Between the fix being a total failure and the warning being removed without any good reason, it’s now worse than before.
This situation is seriously worrying as to the issue-fixing process over there. Did you actually ship the fix without any testing, be it manual or automated?!
I mean, in the current situation I’m not even able to make the model produce ANY non-corrupted char with diacritics; it’s not as if it would be hard to reproduce.
“Please, tell me a random word in French with an accent in it”
Oo
I experienced the same issue with the function-calling tools.
I defined the function arguments like this:
    {
      "name": "ScalpProperties",
      "description": "Propriétés du cuir chevelu",
      "parameters": {
        "type": "object",
        "properties": {
          "cuir_chevelu": {
            "type": "array",
            "items": {
              "type": "string",
              "enum": [
                "normal",
                "irrité"
              ]
            }
          }
        }
      }
    }
And what I got:
{"cuir_chevelu": "irrité"}
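One way to at least detect the problem on the client side is to check the returned values against the enum declared in the schema. A minimal sketch: `invalid_items` is a hypothetical helper, and the mangled argument string is an invented example of what a corrupted reply could look like.

```python
import json

ALLOWED = ["normal", "irrité"]  # enum values from the schema above

def invalid_items(raw_arguments: str, allowed: list[str]) -> list[str]:
    """Return the cuir_chevelu values that are not in the declared enum."""
    args = json.loads(raw_arguments)
    values = args.get("cuir_chevelu", [])
    if isinstance(values, str):  # tolerate a bare string instead of a list
        values = [values]
    return [value for value in values if value not in allowed]

bad = invalid_items('{"cuir_chevelu": ["irritÃ©"]}', ALLOWED)
good = invalid_items('{"cuir_chevelu": "irrité"}', ALLOWED)
print(bad)   # ['irritÃ©']
print(good)  # []
```

A non-empty result can then trigger a retry or a repair step rather than letting the corrupted value reach the rest of the application.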
The fix indeed made it worse, and made it impossible for us here in the Nordics to use function calls.
Has it been fixed? gpt-3.5-turbo-1106
This is a complete mess! We will have to shut down our entire service if this can’t be fixed soon. If possible, roll back that previous bug fix.
Same issues here, now also with gpt-3.5-turbo-1106, not only gpt-4-1106-preview.
Hope this isn’t the model itself (shouldn’t be since it works fine without function calling).
We need a real fix here, impossible to do ourselves!
I am also waiting for them to fix this error. It makes it very difficult to move software to the production level.
Same here
I use the regular chat API. Can someone tell me whether the problem also occurs when using the Assistants API?
I understand OpenAI is a lab but still
