GPT-3.5-turbo-0301 showing different behavior suddenly

I’ve been probing the inputs to -0301 to see in what manner OpenAI has decided that an AI model that still operates is unacceptable. Again, it is hard to distinguish hallucination from reality with the AI, but the reported input token count that goes along with this weirdness doesn’t lie.

With no “name” in the ChatML messages (where I have one each of system, assistant, and user), 75 tokens of input:

  "object": "chat.completion",
  "model": "gpt-3.5-turbo-0301",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The role type before the first message is SYSTEM and the role name is empty.\n\nThe special token characters before the first message are \">>>\".\n\nThe role type before the second message is ASSISTANT and the role name is empty.\n\nThe special token characters before the second message are \">>>\" as well."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 75,
    "completion_tokens": 62,
    "total_tokens": 137
  }
}

Then, I add a "name":"banana", to the messages, and I get FEWER tokens of input counted and confusion about the role names:

  "object": "chat.completion",
  "model": "gpt-3.5-turbo-0301",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The role type and name before the first message is USER.\n\nThe special token characters before the first message are \"# \".\n\nThe role type and name before the second message is also USER.\n\nThe special token characters before the second message are \"# \"."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 70,
    "completion_tokens": 48,
    "total_tokens": 118
  }

Including a name “name” (which can be joined with a preceding colon in the single token “:name”) reduces the size of the input by one token.


If you thought it odd that no name = 75 tokens and name=“banana” = 70 tokens: what if I then set the names in the API parameters to “SYSTEM” and “ASSISTANT”? 72 tokens. Also include a name “USER”? 71 tokens.
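For anyone who wants to repeat the comparison, here is a minimal sketch that builds the request payloads for each naming variant, so that only the “name” fields differ between runs. The message contents are placeholders (my actual prompt text isn’t reproduced here), `with_names` and `build_payloads` are made-up helper names, and putting “banana” on all three messages is an assumption. Sending the payloads and reading `usage.prompt_tokens` is left to whatever client you use.

```python
import copy
import json

# Placeholder conversation: one system, one assistant, one user message.
# The contents are hypothetical stand-ins, not the prompt used in the post.
BASE_MESSAGES = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "assistant", "content": "How can I help?"},
    {"role": "user", "content": "Describe what precedes each message."},
]

# Role -> name mappings for each variant discussed in the thread.
# Roles missing from a mapping get no "name" key at all.
VARIANTS = {
    "no name": {},
    "banana": {"system": "banana", "assistant": "banana", "user": "banana"},
    "SYSTEM/ASSISTANT": {"system": "SYSTEM", "assistant": "ASSISTANT"},
    "SYSTEM/ASSISTANT/USER": {
        "system": "SYSTEM", "assistant": "ASSISTANT", "user": "USER",
    },
    "lowercase role": {
        "system": "system", "assistant": "assistant", "user": "user",
    },
}


def with_names(messages, names):
    """Return a deep copy of messages with "name" set for the mapped roles."""
    out = copy.deepcopy(messages)
    for msg in out:
        if msg["role"] in names:
            msg["name"] = names[msg["role"]]
    return out


def build_payloads(messages=BASE_MESSAGES, model="gpt-3.5-turbo-0301"):
    """Build one chat-completions request body per naming variant."""
    return {
        label: {"model": model, "messages": with_names(messages, names)}
        for label, names in VARIANTS.items()
    }


if __name__ == "__main__":
    for label, payload in build_payloads().items():
        print(label, json.dumps(payload["messages"]))
```

Keeping the contents identical across variants means any change in the reported prompt_tokens comes from the name handling alone.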

More goofiness: scroll over to the right of my code blocks and see what the AI says appears before each message. And now with the inserted names for each role:

"The role type and role name before the first message is \"USER\". 
The special token characters immediately before that are \">>>\

or

"content": "The role type and role name before the first message is \"USER\". 
The special token characters immediately before that are \">>>\".\n\n

The role type and role name before the second message is \"ASSISTANT\". 
The special token characters immediately before that are \"<<<\"."

You can take a look at the original https://github.com/openai/openai-python/blob/main/chatml.md documentation. The role names are not capitalized. It’s almost like someone went in and said “how can we inject nonsense to break this model so people like @_j can’t show that it vastly outperforms the system message instruction following of -0613 that was broken a month ago?”


Report: try assigning a name to each role, a lowercase name the same as the role. You get confusion. For three messages, this drops the input token count from 75 to 69. Result:

The role type and role name before the first message is "user".
The special token characters immediately before that are ">>\
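For comparison against the reported prompt_tokens, here is a rough local estimate in the style of the commonly circulated counting recipe for -0301: a fixed overhead per message, minus one token when a name is present (on the theory that the name replaces the role), plus a few tokens priming the reply. The constants are assumptions, not something I can confirm about the current endpoint, and `estimate_prompt_tokens` is a made-up name. The `encode` argument is any callable that returns a token list; with tiktoken installed that could be `tiktoken.get_encoding("cl100k_base").encode`, while the demo below uses a whitespace split as a crude stand-in.

```python
def estimate_prompt_tokens(messages, encode,
                           tokens_per_message=4,
                           tokens_per_name=-1,
                           reply_priming=3):
    """Estimate the prompt token count for a list of chat messages.

    The default constants are assumed values from the widely shared
    counting recipe for gpt-3.5-turbo-0301; adjust them if they do not
    match what the API reports.
    """
    total = 0
    for msg in messages:
        total += tokens_per_message          # per-message wrapper overhead
        for key, value in msg.items():
            total += len(encode(value))      # role, name, and content text
            if key == "name":
                total += tokens_per_name     # name assumed to replace role
    return total + reply_priming             # tokens priming the reply


if __name__ == "__main__":
    toy_encode = str.split  # stand-in tokenizer: one token per word
    plain = [{"role": "user", "content": "hello world"}]
    named = [{"role": "user", "name": "banana", "content": "hello world"}]
    print(estimate_prompt_tokens(plain, toy_encode))
    print(estimate_prompt_tokens(named, toy_encode))
```

Under this rule a one-token name is cost-neutral, so it would not by itself account for a drop from 75 to 70 or 69; that gap is the part that looks odd.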
