My understanding is that for a lot of models the usable context length is largely VRAM-bound, so it could very well be the same model, just running on cheaper hardware for shorter contexts.
This could also explain why the OpenAI API charges by input tokens and has an output limit; perhaps they route each request to nodes with a specific hardware configuration based on prompt length.
flowchart TD
    gpt4["GPT-4"] --> a["tiktoken +4k"]
    a --> n8k["8k"]
    a --> n16k["16k"]
    a --> n24k["24k"]
    a --> n32k["32k"]
    a --> etc["..."]
    a --> n128k["128k"]
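Purely for illustration, a rough sketch of what that kind of length-based routing could look like, assuming tiktoken for the token count; the bucket sizes and the pool-naming scheme are made up, not anything OpenAI has documented:

    import tiktoken

    # Hypothetical context tiers, mirroring the flowchart above.
    CONTEXT_BUCKETS = [8_192, 16_384, 24_576, 32_768, 131_072]

    def pick_bucket(prompt: str, model: str = "gpt-4") -> int:
        """Count prompt tokens and return the smallest context tier that fits."""
        enc = tiktoken.encoding_for_model(model)
        n_tokens = len(enc.encode(prompt))
        for bucket in CONTEXT_BUCKETS:
            if n_tokens <= bucket:
                return bucket
        raise ValueError(f"prompt of {n_tokens} tokens exceeds the largest tier")

    # A router could then dispatch to a node pool sized for that tier, e.g.
    # pool = f"gpt4-{pick_bucket(prompt) // 1024}k"  (hypothetical pool name)

If something like this were happening, charging by input tokens would map fairly directly onto which hardware tier a request occupies.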