This experiment compares the output length and token compression efficiency of GPT-4 when repeating single characters such as “い”, “ぬ”, “e”, and “z”.
The results show that frequently used characters and languages tend to be more token-efficient, leading to significantly longer outputs under the same token limit.
Additionally, a “hallucination” was observed: the model omitted an explicitly instructed closing sentence. This offers a hint about how GPT-4 handles output as it approaches the token limit.
ChatGPT #tokenlimits #promptengineering #hallucination