All of the chat models currently have limited output sliders.
I can see a few practical reasons:
- Too many new users confuse the output limit with the context length, set it wrong, and can't get the API to work;
- The setting corresponds to what these models will actually produce, because they have been trained to curtail output length for ChatGPT users;
- If a model with a 128k output limit falls into a repeat loop, a single runaway completion can cost you several dollars.
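To put a rough number on that last point, here is a minimal sketch of the cost math. The per-token rate below is a hypothetical placeholder (USD per 1M output tokens), not any provider's actual price; check current pricing before relying on it:

```python
# Rough cost of a runaway completion.
# PRICE_PER_M_OUTPUT is a hypothetical rate (USD per 1M output tokens);
# substitute your provider's actual pricing.
PRICE_PER_M_OUTPUT = 30.00

def output_cost(tokens: int, price_per_m: float = PRICE_PER_M_OUTPUT) -> float:
    """Return the USD cost of generating `tokens` output tokens."""
    return tokens / 1_000_000 * price_per_m

if __name__ == "__main__":
    # A repeat loop that runs to a 128k-token limit at this rate:
    print(f"${output_cost(128_000):.2f}")  # → $3.84
```

At that assumed rate, one stuck request burns nearly four dollars, which is why capping max output tokens is cheap insurance.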