When attempting to generate something in isiZulu, I sometimes get this. It burns through whatever max tokens I have set and costs me more for an unusable completion.
I’ve had to work around this by generating the response in a more familiar language and translating it with another API call.
But this makes the user wait longer and costs more money.
It seems pretty obvious that you’ve run out of output tokens, which are capped by the max_tokens parameter in the call. Running the phrase through tiktoken shows that the “ulu” at the end is its own token, so the output is truncated at exactly the spot you’d expect if that were the case.
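If you want to verify this yourself, here’s a quick sketch using tiktoken. I’m assuming the cl100k_base encoding here; the exact token boundaries depend on the model, and the phrase below is just a stand-in for the tail of your truncated output:

```python
import tiktoken

# Assuming the cl100k_base encoding (used by gpt-3.5-turbo / gpt-4 family);
# other encodings split the text differently.
enc = tiktoken.get_encoding("cl100k_base")

phrase = "isiZulu"  # replace with whatever the truncated output actually ends in
token_ids = enc.encode(phrase)

# Print each token's text so you can see exactly where the splits fall.
for tid in token_ids:
    print(tid, repr(enc.decode([tid])))
```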
Since the output is wrapped in JSON and the closing of that JSON is never generated, parsing it gives you the “unterminated” error.
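For anyone hitting the same thing, this is what it looks like in miniature with Python’s json module; the truncated payload below is purely illustrative:

```python
import json

# A completion cut off mid-string, before the closing quote and brace are generated.
truncated = '{"translation": "Sawubona, ngingakusiza ngan'

try:
    json.loads(truncated)
except json.JSONDecodeError as e:
    # e.g. "Unterminated string starting at: line 1 column 17 (char 16)"
    print(e)
```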
Thank you for assisting with interpreting the error message.
However, my issue is with the fact that the completion returns repetitive and unusable information.
I believe that this is a bug, and I am not sure how to report this other than posting it here.
That kind of repetition is a common failure mode that can be provoked by many types of input on many different AI models.
To discourage the model from locking into a repeat pattern and nudge it toward an output that can reach a stop, you can increase the frequency and presence penalties in the API call, and also raise the temperature a bit to give it a better chance of breaking out of the loop.
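A minimal sketch of what that might look like, assuming the openai Python client and the chat completions endpoint (the model name, prompt, and parameter values are just starting points to experiment with, not recommendations):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # whichever model you're already using
    messages=[
        {"role": "user", "content": "Translate the following into isiZulu: ..."},
    ],
    max_tokens=500,         # leave enough room for the full completion
    temperature=0.9,        # slightly higher to help break repetitive loops
    frequency_penalty=0.5,  # penalize tokens in proportion to how often they've appeared
    presence_penalty=0.3,   # penalize tokens that have appeared at all
)

print(response.choices[0].message.content)
```

If the completion still hits the length limit, check `response.choices[0].finish_reason`: a value of "length" means it ran out of tokens rather than reaching a natural stop.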