GPT-4 has a context window of 128,000 tokens, while Gemini boasts a staggering 2 million. Although these numbers are exciting, the reality is somewhat different. You may have observed, as I have, that the quality of reasoning by LLMs tends to falter with lengthy inputs—a phenomenon that current eval…

Reasoning Degradation in LLMs with Long Context Windows: New Benchmarks

alhome August 15, 2024, 8:01pm 2

That’s intriguing. I wonder why Sonnet 3.5 and Gemini perform better with negative d values?
