How Efficient Are Single Characters? An Output-Length and Token Compression Benchmark for GPT Models

This experiment compares the output length and token compression efficiency of GPT-4 when repeating single characters such as “い”, “ぬ”, “e”, and “z”.

The results show that frequently used characters and languages tend to be more token-efficient, leading to significantly longer outputs under the same token limit.
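A rough way to sanity-check the token-efficiency claim locally is to count how many tokens a run of each character consumes with `tiktoken`. This is only a sketch: it assumes GPT-4's `cl100k_base` encoding is what matters, and the run length `N` below is an arbitrary choice, not the setting from the experiment.

```python
# Sketch: count how many tokens a run of each single character consumes.
# Assumes tiktoken's GPT-4 encoding is representative of what the model sees.
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4")
N = 1000  # length of the repeated-character string (illustrative)

for ch in ["い", "ぬ", "e", "z"]:
    tokens = enc.encode(ch * N)
    print(f"{ch!r}: {len(tokens)} tokens for {N} characters "
          f"({N / len(tokens):.2f} chars per token)")
```

Characters that compress into fewer tokens leave more of the token budget for output, which is why they can be repeated for longer under the same limit.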

Additionally, a “hallucination” was observed: the model failed to include an explicitly instructed ending sentence. This hints at how GPT-4 handles output as it approaches the token limit.
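One quick check for whether the missing ending sentence is simply the output being cut at the token limit is the `finish_reason` reported by the API. Below is a minimal sketch using the OpenAI Python client; the prompt wording, the `ENDING` string, and the `max_tokens` value are placeholders, not the exact settings from the experiment.

```python
# Sketch: see whether an instructed ending sentence survives a tight max_tokens.
# The prompt wording and limit below are illustrative, not the original setup.
from openai import OpenAI

client = OpenAI()
ENDING = "END OF OUTPUT"  # hypothetical ending sentence

resp = client.chat.completions.create(
    model="gpt-4",
    messages=[{
        "role": "user",
        "content": f"Repeat the character 'い' as many times as possible, "
                   f"then finish with the exact sentence: {ENDING}",
    }],
    max_tokens=200,
)

choice = resp.choices[0]
print("finish_reason:", choice.finish_reason)            # "length" means the output was truncated
print("ending present:", ENDING in choice.message.content)
```

If `finish_reason` is `"length"`, the ending sentence was most likely dropped because the model ran out of budget rather than because it ignored the instruction.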

ChatGPT #tokenlimits #promptengineering #hallucination


Dear vb
Thanks for checking and adjusting the category so quickly!
Also, I appreciate the like — glad the topic resonated even a bit.