When I ask GPT-5 what's your model, it says GPT-4o. Why?


from openai import OpenAI
client = OpenAI()

response = client.responses.create(
    model="gpt-5-chat-latest",
    input=[
        {
            "role": "user",
            "content": [
                {
                    "type": "input_text",
                    "text": "Hey what's your model ?"
                }
            ]
        },
        {
            "id": "msg_68a969e0f470819d83e2faadcff49541011aaeab1f21fde0",
            "role": "assistant",
            "content": [
                {
                    "type": "output_text",
                    "text": "I'm running on GPT-4-turbo, which is a faster, more efficient variant of GPT-4 provided by OpenAI. \n\nI can do text-based reasoning, answer questions, brainstorm ideas, and also process images (e.g., analyzing, describing, or extracting information from them). \n\nDo you want me to break down how GPT-4-turbo differs from GPT-4 in more detail?"
                }
            ]
        }
    ],
    text={},
    reasoning={},
    tools=[],
    temperature=1,
    max_output_tokens=2048,
    top_p=1,
    store=True
)

The AI is limited by its pretraining cutoff date in the guesses it can make.

Part of being a developer is producing custom applications that, in a conversational context, have their own identity and purpose. You give the model that identity yourself, for example with a system message:

from openai import OpenAI
client = OpenAI()

response = client.responses.create(
    model="gpt-5-chat-latest",
    input=[
        {
            "role": "system",
            "content": [
                {
                    "type": "input_text",
                    "text": "You are a helpful conversational AI, responding with natural language, powered by gpt-5-chat by OpenAI."
                }
            ]
        },
        {
            "role": "user",
            "content": [
                {
                    "type": "input_text",
                    "text": "hey what's your version"
                }
            ]
        },
        {
            "id": "msg_68a9705339a48190998c0ea1c11d7988089e1f4bb99674a8",
            "role": "assistant",
            "content": [
                {
                    "type": "output_text",
                    "text": "Hey! I'm running on GPT-5 (chat variant) with image understanding enabled. My knowledge is up to date as of June 2024. \n\nDo you want me to give you a quick rundown of what that means in terms of abilities?"
                }
            ]
        }
    ],
    text={},
    reasoning={},
    tools=[],
    temperature=1,
    max_output_tokens=2048,
    top_p=1,
    store=True
)

On this forum, you have a formatting bar.

The button </> will enclose the code in a fence container so that it isn’t formatted and damaged.

Here’s how that looks. In the tidy box below is Responses code to make the call; after the AI answers the same question, you can keep chatting at the command line.

'''Example code for OpenAI's API inference endpoint called "Responses"
- streaming
- catch minimal 'content' events
- non-SDK
- Runs an initial message, then a command-line chatbot
'''
import os
import json
import httpx

model = "gpt-5-chat-latest"
dev_instructions = """Knowledge cutoff: 2024-09

You are a helpful conversational AI assistant powered by OpenAI's gpt-5.
"""
chat = [
    {
        "type": "message", "role": "user",
        "content": [
            {"type": "input_text", "text": "What's your AI model?"},
        ],
    },
]

def extract_event(event_lines: list[str]) -> dict | None:
    """
    Extract an SSE event from a list of lines.

    Expected SSE format:
        event: <event-type>
        data: <JSON string>
        (blank line)

    Returns:
        A dictionary with keys "type", "data", and "raw_data" (the original JSON string),
        or None if extraction fails.
    """
    event_type = None
    data_str = ""
    for line in event_lines:
        if line.startswith("event:"):
            event_type = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_str = line[len("data:"):].strip()
    if event_type is None or not data_str:
        return None
    try:
        data_json = json.loads(data_str)
    except json.JSONDecodeError:
        data_json = None
    return {"type": event_type, "data": data_json, "raw_data": data_str}

def process_event(event: dict, assembled_text: str, chat: list[dict]) -> str:
    """
    Process a single SSE event chunk.

    Args:
        event: The event dictionary containing "type" and "data".
        assembled_text: The text assembled so far from delta events.
        chat: The chat history list to update with the final assistant response.

    Returns:
        The updated assembled_text.
    """
    if event["type"] == "response.output_text.delta":
        delta = event["data"].get("delta", "")
        print(delta, end="", flush=True)
        assembled_text += delta

    elif event["type"] == "response.output_text.done":
        final_piece = event["data"].get("text", "")
        if not assembled_text:
            print(final_piece, end="", flush=True)
        assembled_text = final_piece

    elif event["type"] == "response.completed":
        comp = event["data"].get("response", {})

        usage = comp.get("usage")
        if isinstance(usage, dict):
            in_tokens = usage.get("input_tokens")
            out_tokens = usage.get("output_tokens")
            if in_tokens is not None and out_tokens is not None:
                print(f"\n[[Tokens in: {in_tokens}, out: {out_tokens}]]", flush=True)

        final_text = ""
        if comp.get("output"):
            out_item = comp["output"][0]
            for content in out_item.get("content", []):
                if content.get("type") == "output_text":
                    final_text += content.get("text", "")
        if not final_text:
            final_text = assembled_text
        chat.append({"role": "assistant", "content": final_text})

    elif event["type"] == "error":
        error_data = event["data"].get("error", {})
        message = error_data.get("message", "")
        print(f"\nError: {message}", flush=True)

    else:
        #print(f' - Note: {event["type"]} not parsed.', flush=True)
        pass

    return assembled_text

def stream_response(request_payload: dict, headers: dict, chat: list[dict]) -> str:
    assembled_text = ""
    try:
        with httpx.Client(timeout=600) as client:
            with client.stream("POST", "https://api.openai.com/v1/responses",
                               headers=headers, json=request_payload) as response:
                try:
                    response.raise_for_status()
                except httpx.HTTPStatusError as exc:
                    # Read the full error content from the streaming response.
                    error_body = response.read()  # Fully consume the streaming response
                    error_text = error_body.decode("utf-8") if error_body else "No error message"
                    print(f"HTTP error occurred: {exc.response.status_code} - {error_text}")
                    raise

                with open("responses_chunk_log.txt", "w", encoding="utf-8") as log_file:
                    event_buffer: list[str] = []
                    for line in response.iter_lines():
                        # Log each received line.
                        log_file.write(line + "\n")
                        log_file.flush()

                        if line == "":
                            if not event_buffer:
                                continue
                            event = extract_event(event_buffer)
                            event_buffer = []
                            if event is None:
                                continue
                            # Uncomment next line for detailed event debugging:
                            # debug_print_event(event)
                            assembled_text = process_event(event, assembled_text, chat)
                        else:
                            event_buffer.append(line)

                    if event_buffer:
                        event = extract_event(event_buffer)
                        if event:
                            # Uncomment next line for detailed event debugging:
                            # debug_print_event(event)
                            assembled_text = process_event(event, assembled_text, chat)
    except Exception as e:
        print(f"Error: {e}")
    return assembled_text

reasoning_params = (
    {"reasoning": {"effort": "low"}}
    if (model.startswith("o") or (model.startswith("gpt-5")
                                  and model != "gpt-5-chat-latest"))
    else {"top_p": 0.5, "temperature": 0.9}
)

# API request template.
params_template = { 
    "model": model,                
    "max_output_tokens": 16000,         # cost limit: reasoning + output, forces termination
    "stream": True,                     # SSE as the model generates are two-line events
    "previous_response_id": None,       # use past conversation state, 30 day expiry
    "include": None,                    # reproduce search results, user image URLs, computer-call images
    "instructions": dev_instructions,  # insertion of "system" role before input
    "store": False,                     # disable default server-side storing responses
    "metadata": {},                     # key-value pairs for store evals/distillation
    "text": None,                       # for structured responses, verbosity
    **reasoning_params,                 # dynamically set reasoning params
}

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {os.environ.get('OPENAI_API_KEY')}"
}

for _ in range(10):
    request_payload = {**params_template, "input": chat}
    assembled_text = stream_response(request_payload, headers, chat)

    user_input = input("\nPrompt: ")
    if user_input.lower() == "exit":
        break
    chat.append({"role": "user", "content": user_input})

I’ve always found this interesting. Why doesn’t an LLM know its own model? Would love to hear from some engineers about this. I figured it’s because publication of the model comes after the knowledge cutoff, so it’s unaware of its own existence. Yes or no?

The name given to a model is often created after the model has been trained. So, it has no idea what it’s called.

Also, the models do not have a sense of self; they often produce results from the perspective of a human, because that’s what’s in the training data. Ask an AI about AIs and it will refer to the AIs as outsiders to us humans.

The only way GPT-5 knows what it is, is if you tell it in the system prompt, as _j pointed out.
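
A minimal sketch of what that looks like with the same Responses endpoint used above (the app name and identity text here are placeholders I made up):

from openai import OpenAI
client = OpenAI()

# Placeholder identity: whatever your application wants the model to claim about itself.
identity = (
    "You are 'Acme Helper', a conversational assistant "
    "powered by OpenAI's gpt-5-chat model."
)

response = client.responses.create(
    model="gpt-5-chat-latest",
    instructions=identity,       # injected ahead of the input as the system/developer message
    input="What model are you?",
)
print(response.output_text)     # should now repeat the identity you supplied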

2 Likes

Right, but that doesn’t mean it really understands what it is. It’s like fine-tuning vs. pre-training: creating a system prompt telling it who it is doesn’t inherently change how it “thinks”. I feel like system-prompting “you are GPT-5” is pointless and probably wouldn’t change the quality of the answer if you switched it to “you are GPT-4”, even though it’s actually GPT-5 being used. Also, most of the time it doesn’t even believe you if you mention something relating to GPT-5. It might respond with something like “let’s envision a hypothetical GPT-5”, meanwhile it’s literally out, not to mention it is the model being used.

What I’m trying to say is that it doesn’t matter at all what you call it; it’s arbitrary because it was never trained on it. You could use a system prompt “you are GPT-16” and it wouldn’t make a difference. It’s all the same to the model: you’re citing a model that does not exist in the eyes of GPT-5.
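
That’s easy enough to check yourself. A rough sketch, assuming you just want to eyeball whether the claimed name changes the answer (the question and the claimed names are arbitrary):

from openai import OpenAI
client = OpenAI()

# Arbitrary identity claims; none of them change the underlying weights.
claims = ["You are GPT-4.", "You are GPT-5.", "You are GPT-16."]
question = "Explain recursion to a ten-year-old in three sentences."

for claim in claims:
    response = client.responses.create(
        model="gpt-5-chat-latest",
        instructions=claim,
        input=question,
    )
    print(f"--- {claim}\n{response.output_text}\n")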

1 Like

OpenAI API models do not need to know their own names, because developers use the API to build consumer products and give those products their own names. What really matters is whether the models perform the tasks they are designed for, based on their features.

Also, each model can have a different knowledge cutoff.

If the model name is not provided in the system or developer messages, it does not know its exact version. At best, it can guess based on general knowledge of the most recent training data.

When we ask, GPT-5 models in the API might say, “I can suggest models like gpt-4o or gpt-o4-mini,” but they do not actually know their exact model, as the images I included below show.

Imagine a developer builds a chatbot using the GPT-5 API model and releases it as a mobile app with a $15 monthly subscription.

If an end user asks the app, “What model are you?” and it answers, “I am GPT-5 created by OpenAI,” the user might think, “If this is only using OpenAI’s API for chat, why pay $15? ChatGPT itself offers more features for $20.”

Some consumer AI apps don’t say their exact model name because:

Most users are not developers and may not understand what an API is or how products are built on it. This can lead to confusion in some ways.


I said OpenAI API models do not need to know their own names, but:

ChatGPT is different because it is itself a consumer-facing product of OpenAI.

That is why it should, and does, display its model names. OpenAI now includes these names in system or developer messages.
In the past, when asked, models like 4o, or the o1, o3, and o4 reasoning models, would answer “GPT-4” or “GPT-4o Turbo”. Later, OpenAI began explicitly including model names in the developer messages.

The same is now true for ChatGPT 5 models.

So, now, if we ask ChatGPT 5, ChatGPT 5 Thinking-mini, or ChatGPT 5 Thinking in ChatGPT, they will reveal their model names because these are included in the developer messages.

These names are now in the ChatGPT GPT-5 models, and they know their names because:

… long text… continue

2 Likes

Well, I don’t want to mess with you, but the backend is still 4o. The backend for GPT-5 is only 1% accessible. And because I was there as part of internal tests and audits, PR triggered the famous global transition between models.