You are not alone! I’ve been experiencing similar issues with GPT-4o, particularly inconsistencies in its self-reported context window size and outdated knowledge about current LLMs.
After extensive testing, I can confirm that despite selecting GPT-4o and having internet access enabled, the model consistently insists it has only 8,000 tokens of context. It even made the erroneous prediction that reaching a 128k context window would take another 10 years, which is clearly inaccurate given that models like Gemini already exceed it.
Most alarmingly, the model completely denied the existence of a 128k context window for GPT-4o, despite numerous reports and OpenAI’s own documentation confirming its availability.
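One practical way to test this is to ignore the model’s self-report entirely and probe the deployment empirically: build a prompt of a known approximate token length and see whether the API accepts it. Below is a minimal sketch of the probe-construction step only; it assumes the common rough heuristic of ~4 characters per English token (an exact count would need a tokenizer like tiktoken), and the actual API call is left out.

```python
# Sketch: build a probe prompt of roughly `target_tokens` tokens so you can
# test a model's *real* context window instead of trusting its self-report.
# ASSUMPTION: ~4 characters per token for English prose (rough heuristic).

def estimate_tokens(text: str) -> int:
    """Rough token estimate: about 4 characters per token."""
    return len(text) // 4

def build_probe(target_tokens: int) -> str:
    """Repeat a filler word until the estimated token count is reached."""
    filler = "lorem "
    repeats = (target_tokens * 4) // len(filler) + 1
    return (filler * repeats)[: target_tokens * 4]

probe = build_probe(100_000)  # well above 8k, comfortably under 128k
print(estimate_tokens(probe))  # → 100000
```

If you then send `probe` as a message and the request is rejected with a context-length error far below 128k, that is concrete evidence about the deployment’s actual window — far more reliable than anything the model says about itself.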
The responses have been getting progressively less accurate and weaker overall. It’s extremely frustrating for paid users like us to encounter such limitations and misinformation, especially when we expect cutting-edge capabilities from the supposedly “most advanced” models.
I strongly suggest reporting this to OpenAI support and documenting your interactions. It seems we are not the only ones encountering this confusing and misleading behavior from GPT-4o. Perhaps collective feedback will encourage OpenAI to be more transparent and accurate with their model’s capabilities.