edit: I had GPT-4 quash my post’s hopes; the output length is now less than the original GPT-4’s…
The recent unveiling of the GPT-4 Turbo model has brought about significant advancements in context length capability, far surpassing its predecessors. It has not only increased the capacity of the previous model, gpt-3.5-turbo-16k, by eight times, but it has also grown the context length sixteenfold compared to the earlier 8k version of GPT-4. This groundbreaking development is now accessible to all, marking a profound shift in the AI landscape.
However, it’s important to note that this expansion in context length pertains only to the input the AI model can comprehend, not the output it generates. While the model can take in far more information at once, the output remains capped at 4k tokens. As such, tasks requiring extensive output, such as answering a large number of questions in one pass or writing a full movie script, are not feasible with the current model.
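To make the cap concrete, here is a minimal sketch assuming the OpenAI Python SDK v1 and the gpt-4-1106-preview model name for GPT-4 Turbo: even if you ask for a maximal completion, generation simply stops at the ceiling, with a `finish_reason` of `"length"`.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# 128k of input context, but the completion itself is still capped at 4k.
response = client.chat.completions.create(
    model="gpt-4-1106-preview",  # GPT-4 Turbo preview, 128k context
    messages=[{"role": "user", "content": "Write a full movie script."}],
    max_tokens=4096,  # this model rejects values above 4096
)

print(response.choices[0].finish_reason)   # "length" when the cap is hit
print(response.usage.completion_tokens)    # never more than 4096
```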
OpenAI’s fine-tuning of the model is another aspect that needs to keep pace with the expanded memory space. For instance, the instruction “summarize this document” used to produce a summary of 400-800 tokens on the 8k model, a length shaped by the reinforcement-learning examples provided to the evaluators. The model can now read a much larger document, but the summary it writes will still fall within that trained range and the output limit.
This also means that the new “check in later and get your answer” Assistants API, introduced to manage this increased complexity, will still deliver answers within the 4k token output limit.
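For illustration, a rough sketch of that asynchronous workflow under the v1 Python SDK (the model name, instructions, and document text are placeholders): you start a run, poll its status, and read back messages that are each still subject to the same output cap.

```python
import time
from openai import OpenAI

client = OpenAI()

# Create an assistant and a conversation thread.
assistant = client.beta.assistants.create(
    model="gpt-4-1106-preview",
    instructions="Summarize documents concisely.",
)
thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id, role="user", content="Summarize this document: ..."
)

# Kick off a run, then "check in later" by polling its status.
run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=assistant.id)
while run.status not in ("completed", "failed", "cancelled", "expired"):
    time.sleep(2)
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)

# Newest message first; its text is still bounded by the 4k output limit.
messages = client.beta.threads.messages.list(thread_id=thread.id)
print(messages.data[0].content[0].text.value)
```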
The prior largest API models already ran out of [cognitive ability] prematurely (Is API completion model davinci-002 and its 16385 context useful for ANYTHING?)…