Yes, that is the logical conclusion from the little trickles of evidence we have. A different model, at least as far as its fine-tuning on generating output is concerned.
The same is seen when you use gpt-3.5-turbo with or without functions included in the API call - the AI has no understanding of a function definition when it is injected into the base model as plain text, but an emulated definition works fine when you also include a different function in the call to trigger the right model.
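For reference, a minimal sketch of the call shape being compared, using the openai Python client (the get_weather definition is a placeholder, purely for illustration):

```python
from openai import OpenAI

client = OpenAI()

# Placeholder function definition, purely for illustration.
functions = [{
    "name": "get_weather",
    "description": "Get the current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

# With `functions` present, the definitions are injected into the context
# of the function-calling model; without it, the same snapshot treats a
# pasted-in definition as ordinary text.
response = client.chat.completions.create(
    model="gpt-3.5-turbo-0613",
    messages=[{"role": "user", "content": "What's the weather in Berlin?"}],
    functions=functions,
    function_call="auto",
)
```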
The language the AI emits can’t be emulated as AI input, as you can’t wedge yourself into that part of the ChatML token container. An “exploit” would just be calling a function, which the AI already does. The AI essentially emits a to=function.function_name header: a message addressed to a “recipient”, which recognizes and accepts the AI output (typically JSON, unless you make your own uncharted rules) and gives you the function_call language you receive by API.
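On the receiving end it looks roughly like this, given a call like the one in the previous post (function_call is only present when the model addressed a function, and its arguments field is a model-generated string that isn’t guaranteed to parse):

```python
import json

message = response.choices[0].message

# A message the model addressed to a function "recipient" surfaces as
# function_call instead of plain assistant content.
if message.function_call is not None:
    name = message.function_call.name
    # The arguments field is typically JSON, but only by convention -
    # validate before use.
    try:
        args = json.loads(message.function_call.arguments)
    except json.JSONDecodeError:
        args = None
    print(name, args)
else:
    print(message.content)
```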
I’m under the same impression. It’s why I believe it can easily lead to “whitespace spam” if there are no suitable tokens, and why instructions are also needed to push the model towards writing in a JSON format.
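To make that concrete, here is a purely speculative toy (not the real sampler), assuming JSON mode works by masking tokens that would break the grammar; the token set and probabilities are invented:

```python
import random

VALID_NEXT = {",", "}", " ", "\n"}  # whitespace is valid almost everywhere in JSON

def pick_token(probs: dict, greedy: bool) -> str:
    # Mask out continuations that would break the JSON grammar.
    allowed = {t: p for t, p in probs.items() if t in VALID_NEXT}
    if greedy:
        return max(allowed, key=allowed.get)  # single most likely valid token
    tokens, weights = zip(*allowed.items())
    return random.choices(tokens, weights=weights)[0]

# The continuation the model actually prefers ("Hello") is not valid here,
# so whitespace ends up as the top-ranked valid token.
step = {"Hello": 0.5, " ": 0.25, ",": 0.15, "}": 0.1}
print([pick_token(step, greedy=True) for _ in range(5)])   # a space, every time
print([pick_token(step, greedy=False) for _ in range(5)])  # sometimes escapes
```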
Yes, totally, but the reason I started this thread seems to be a different one, because the completion often stops mid-sentence in a JSON string value, and always after 1050 tokens. And mid-sentence, nearly every token is valid JSON.
…
I just had an idea. I’m using topP = temperature = 0. From my understanding, this reduces the space of possible next tokens to the single most likely one. That may not play well with JSON mode, as I’m leaving the model with just a single option to choose from, which increases the likelihood of whitespace loops.
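If that’s right, relaxing the sampling slightly should change the failure rate. A sketch of the comparison I have in mind, assuming a JSON-mode-capable snapshot like gpt-4-1106-preview (response_format requires the word “JSON” to appear somewhere in the messages):

```python
from openai import OpenAI

client = OpenAI()

def ask(temperature: float, top_p: float) -> str:
    response = client.chat.completions.create(
        model="gpt-4-1106-preview",
        messages=[
            {"role": "system", "content": "Respond in JSON."},
            {"role": "user", "content": "Summarize the plot of Hamlet."},
        ],
        response_format={"type": "json_object"},
        temperature=temperature,
        top_p=top_p,
    )
    return response.choices[0].message.content

greedy = ask(temperature=0, top_p=0)     # single most likely token each step
relaxed = ask(temperature=0.2, top_p=1)  # leaves the sampler some alternatives
```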
Whatever, it would be great if OpenAI could be a little more open about how their tech works. Then we could figure this out way more easily.
I’m experiencing the same bug. I’ve narrowed it down to the fact that on the 1024th generated token, GPT-4 begins to generate empty space characters or malformed JSON.
I’ve tried spamming more “Respond in JSON” instructions in the system prompt, as well as modifying the topP and temperature values, but to no real avail (or maybe I haven’t tried hard enough). Any recommendations, if you were able to fix it since?
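No fix on my end yet, but the failure is at least easy to detect mechanically, so a retry loop is possible. A sketch of the check I’d wrap around each call (the whitespace threshold is a guess):

```python
import json

def looks_broken(response) -> bool:
    """Heuristics for the whitespace-spam / truncation failure described above."""
    choice = response.choices[0]
    text = choice.message.content or ""
    # Stopped by the token limit rather than finishing naturally.
    if choice.finish_reason == "length":
        return True
    # A long run of trailing whitespace is the "whitespace spam" signature.
    if len(text) - len(text.rstrip()) > 20:
        return True
    # Finally: does it parse at all?
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return True
    return False
```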