Is OpenAI ChatGPT tokenization case-sensitive?

OpenAI may have lower-cased the training data for better generalization. Or not.

Is OpenAI ChatGPT / GPT4 tokenization / vocabulary case-sensitive?

you could use the tokenizer to figure this one out, I did it for ya :smiley:

Screenshot 2023-05-27 at 7.36.59 pm

1 Like

Thanks a lot @sdfgsdfg

I confirm that OpenAI API shows that the tokenizer is case-sensitive.

This means that the case of the prompts matters.

At temperature 0:

Prompt: tell a me nice thing

You are an amazing person and you have so much to offer the world.

Prompt:  TELL ME A NICE THING

You have a kind heart.

prompt: tell me a nice thing in upper case

YOU ARE AMAZING!