Easy way to get a context window for a model

sciencealone · December 10, 2023, 8:22am

As far as I understand, the OpenAI API does not provide any method to get the context window size of a model used for chat completion. Without it, a client can either overflow the context window or underutilize the model.
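To make the overflow/underutilization concern concrete, here is a minimal sketch of the arithmetic a client has to do once it knows the context window. The function name `completion_budget` and the `reserve` parameter are hypothetical, and the numbers are example values only, not taken from the API.

```python
def completion_budget(context_window: int, prompt_tokens: int, reserve: int = 0) -> int:
    """Return how many tokens are left for the reply.

    context_window: total tokens the model accepts (prompt + completion).
    prompt_tokens:  tokens already used by the messages.
    reserve:        optional safety margin (hypothetical parameter).
    """
    remaining = context_window - prompt_tokens - reserve
    if remaining <= 0:
        # The prompt alone would overflow the context window
        raise ValueError(
            f"prompt uses {prompt_tokens} tokens, window is {context_window}"
        )
    return remaining


# Example values only: a 4097-token window with a 3000-token prompt
print(completion_budget(4097, 3000))  # 1097
```

Without a known `context_window`, the client has to guess this value, which is exactly the gap described above.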

Admittedly, some model names are clear enough (like gpt-4-32k), but for the majority the size must be checked manually in the corresponding tables at https://platform.openai.com/docs/models.

It would be great to fix this inconsistency by either or both of these measures:

Include the size of the context window in the name of each model.
Improve the model object (https://platform.openai.com/docs/api-reference/models/object) by adding the size of the context window.

Am I missing something? Are there any existing methods?

_j · December 10, 2023, 8:45am

Yes, some have made their own existing methods…

def init_price_classes(self):
    self.price_classes = {
        # language gpt-3.5, 4 (limits are base)
        "gpt4": {"price_in": 0.03, "price_out": 0.06, "price_train": -1, "limit_ktpm":10, "limit_rpm":"200"},
        "gpt4-32": {"price_in": 0.06, "price_out": 0.12, "price_train": -1, "limit_ktpm":10, "limit_rpm":"200"},
        "turbo": {"price_in": 0.0015, "price_out": 0.002, "price_train": 0.0080, "limit_ktpm":90, "limit_rpm":"3500"},
        "instruct": {"price_in": 0.0015, "price_out": 0.002, "price_train": 0.0080, "limit_ktpm":250, "limit_rpm":"3000"},
        "ft-turbo": {"price_in": 0.012, "price_out": 0.016, "price_train": 0.0080, "limit_ktpm":90, "limit_rpm":"3500"},
        "turbo-16": {"price_in": 0.003, "price_out": 0.004, "price_train": -1, "limit_ktpm":180, "limit_rpm":"3500"},...

def init_model_list(self):
    self.model_list = {
        'ft:gpt-3.5-turbo':         {'price_class': 'ft-turbo',   'endpoint': 'chat',  'tokenizer': 'cl100k_base',  'context': 4097, 'cutoff': '2021-09', 'retire_date': '', 'tune': 'tune'},
        'gpt-3.5-turbo-0301':       {'price_class': 'turbo',      'endpoint': 'chat',  'tokenizer': 'cl100k_base',  'context': 4097, 'cutoff': '2021-09', 'retire_date': '2024-06-13'},
        'gpt-3.5-turbo-0613':       {'price_class': 'turbo',      'endpoint': 'chatf', 'tokenizer': 'cl100k_base',  'context': 4097, 'cutoff': '2021-09', 'retire_date': ''},
        'gpt-3.5-turbo-16k':        {'price_class': 'turbo-16',   'endpoint': 'chatf', 'tokenizer': 'cl100k_base', 'context': 16385, 'cutoff': '2021-09', 'retire_date': ''},

It would be a piece of cake for OpenAI to add this to the models endpoint, but the last time they changed that endpoint, they removed metadata…
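A hand-maintained table like the one above could be queried roughly as follows. This is only a sketch: `CONTEXT_BY_MODEL` and `lookup_context` are hypothetical names, the values are copied from the excerpt above, and the longest-prefix fallback is an assumption about how dated or fine-tuned model names relate to their base entries.

```python
from typing import Dict, Optional

# Context sizes copied from the table excerpt above; maintained by hand.
CONTEXT_BY_MODEL: Dict[str, int] = {
    "ft:gpt-3.5-turbo": 4097,
    "gpt-3.5-turbo-0301": 4097,
    "gpt-3.5-turbo-0613": 4097,
    "gpt-3.5-turbo-16k": 16385,
}


def lookup_context(model: str, table: Dict[str, int] = CONTEXT_BY_MODEL) -> Optional[int]:
    """Return the context size for a model name, or None if unknown."""
    if model in table:
        return table[model]
    # Assumed fallback: use the longest table key that prefixes the name,
    # e.g. a dated variant or a fine-tune of a listed base model.
    candidates = [name for name in table if model.startswith(name)]
    if not candidates:
        return None
    return table[max(candidates, key=len)]


print(lookup_context("gpt-3.5-turbo-16k-0613"))  # 16385
print(lookup_context("some-unknown-model"))      # None
```

The table still has to be updated manually whenever models change, so this only centralizes the problem rather than solving it.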

sciencealone · December 10, 2023, 5:37pm

Thank you, @_j. Unfortunately, it is the same manual method I am trying to avoid…

mrko · December 29, 2023, 7:48pm

+1 for adding an API for programmatically discovering model properties. This could be an entirely separate metadata API instead of being embedded in other API responses, and it could also distinguish between model properties (e.g. context window size, training data cutoff) and price info.
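As a rough illustration of that split, the two kinds of data could be modeled as separate records. Everything here is hypothetical: no such API exists, the class and field names are made up, and the example values are taken from the table posted earlier in this thread (which does not state the price units).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelProperties:
    """Hypothetical record for technical properties of a model."""
    model: str
    context_window: int    # tokens
    training_cutoff: str   # e.g. "2021-09"


@dataclass(frozen=True)
class PriceInfo:
    """Hypothetical record for pricing, kept apart from model properties."""
    price_in: float
    price_out: float
    price_train: Optional[float] = None  # None when training is unavailable


# Example values from the table earlier in this thread
example_properties = ModelProperties("gpt-3.5-turbo-16k", 16385, "2021-09")
example_price = PriceInfo(price_in=0.003, price_out=0.004)
print(example_properties)
print(example_price)
```

Keeping the two records separate would let pricing change without touching the technical metadata.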

_j · December 30, 2023, 4:47am

Easiest way to get a context window length for a model?

The hard way:

from openai import OpenAI
import re

example_error_msg = ("This model's maximum context length is 128000 tokens. However, "
  "your messages resulted in 262151 tokens. Please reduce the length of the messages.")

def get_context_error(err_msg):
    """Extract the integer that follows 'maximum context length is'."""
    match = re.search(r'maximum context length is (\d+)', err_msg)
    if not match:
        raise ValueError("No value found matching context length.")
    max_context_length = int(match.group(1))
    # Sanity check: reject values outside a plausible range
    if not 1000 <= max_context_length <= 200000:
        raise ValueError("Extracted context length outside expected range (1000-200000).")
    return max_context_length

def get_context_len(modelparam="gpt-3.5-turbo"):
    """Probe for an OpenAI chat completion model's context length limit by
    sending an API request with input and max_tokens bigger than any model."""
    cl = OpenAI(timeout=30)
    bigdata = "!@" * 2**17  # 256k characters
    try:
        response = cl.chat.completions.create(
            model=modelparam, max_tokens=265000, top_p=0.01,
            messages=[{"role": "system", "content": bigdata}]
        )
    except Exception as err:
        # Only API errors carry .code; getattr keeps other exceptions
        # from failing here with an AttributeError
        if getattr(err, "code", None) == 'context_length_exceeded':
            return get_context_error(err.body['message'])
        raise ValueError(err) from err
    # No error means the oversized request was unexpectedly accepted
    raise ValueError(f"Context len: $$NO ERROR!$$:\n"
                     f"{response.choices[0].message.content}")


if __name__ == "__main__":
    model = "gpt-4-1106-preview"  # just chat completion models
    context_len = get_context_len(model)  # use model
    print(f"{model} context length: {context_len} tokens.")
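Since each probe sends a very large request, it may be worth caching the result per model. A minimal sketch, assuming an in-memory cache is enough; `make_cached_lookup` is a hypothetical helper that wraps any probe function with the same shape as `get_context_len` above.

```python
from typing import Callable, Dict


def make_cached_lookup(probe: Callable[[str], int]) -> Callable[[str], int]:
    """Wrap a probe function so each model is probed at most once."""
    cache: Dict[str, int] = {}

    def lookup(model: str) -> int:
        if model not in cache:
            # First request for this model: run the (expensive) probe
            cache[model] = probe(model)
        return cache[model]

    return lookup


# Usage with the probe defined above (not executed here):
# cached_context_len = make_cached_lookup(get_context_len)
# cached_context_len("gpt-4-1106-preview")  # probes the API
# cached_context_len("gpt-4-1106-preview")  # served from the cache
```

The cache lives only for the lifetime of the process; persisting it to disk would be a separate design choice.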

mgarcia · January 26, 2024, 6:59am

It would be nice to have some endpoints to get this metadata, i.e., all the details about a model.
