gpt-4-turbo-2024-04-09 does not anchor to its most recent cutoff date by default

Well, without explicitly steering gpt-4-turbo-2024-04-09 in the system prompt (i.e. telling it that it is gpt-4-turbo-2024-04-09), the model will not produce correct cutoff-date data. You can try it out, e.g. with the code I posted on the GitHub issue, or here:

import os
from openai import OpenAI

# User question to be asked
user_question = "Is Cormac McCarthy still alive?"

# Instantiate the client with your API key
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

# Inform the user about the question being asked without model guidance
print("---")
print(f"Asking without specifying the model version in the system message: {user_question}")
print("---")
response = client.chat.completions.create(
    model="gpt-4-turbo-2024-04-09",
    messages=[
        {"role": "system", "content": "You are an AI assistant based on OpenAI's GPT-4."},
        {"role": "user", "content": user_question}
    ]
)

# Print the response
print(response.choices[0].message.content)

# Inform the user about the question being asked with model guidance
print("---")
print(f"Asking while specifying 'You are gpt-4-turbo-2024-04-09' in the system message: {user_question}")
print("---")
response_with_system_message = client.chat.completions.create(
    model="gpt-4-turbo-2024-04-09",
    messages=[
        {"role": "system", "content": "You are gpt-4-turbo-2024-04-09"},
        {"role": "user", "content": user_question}
    ]
)

# Print the response
print(response_with_system_message.choices[0].message.content)

(EDIT: I originally left the system message empty, but as the code above shows, the system message can instruct the model to be a GPT-4-based AI assistant and it will still output misaligned data [allowing for probabilistic variation between runs] if gpt-4-turbo-2024-04-09 is not mentioned.)

Feel free to try it out. You can even change the system message to “You are a GPT-4 AI assistant” or similar, and gpt-4-turbo-2024-04-09 still claims an April 2023 cutoff date at best.

Note that the new gpt-4-turbo-2024-04-09 is supposed to have knowledge up to December 2023 (see https://platform.openai.com/docs/models/gpt-4-turbo-and-gpt-4), which it is unable to surface (again, see the code example above) without model-variant-specific steering in the system instructions.
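If you script these checks, a small helper can flag when a self-reported cutoff lags behind the documented December 2023 one. This is just a sketch of my own; it assumes the cutoff is phrased as a month name plus year, which the model may not always do:

```python
from datetime import datetime

def cutoff_is_current(reported, documented="December 2023"):
    """Return True if a reported cutoff like 'April 2023' is at least
    as recent as the documented one. Assumes '<Month> <Year>' phrasing."""
    fmt = "%B %Y"
    return datetime.strptime(reported, fmt) >= datetime.strptime(documented, fmt)

# The April 2023 answer the model tends to give falls short of the docs:
print(cutoff_is_current("April 2023"))     # False
print(cutoff_is_current("December 2023"))  # True
```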

I do wonder if it has something to do with the benchmark declines mentioned e.g. here:

In developer-experience terms, if the intended “approach” really is to have the API user pinpoint the model in the system prompt (even after specifying the model in the API call itself), it differs quite a bit from what OpenAI’s competitors like Claude or Perplexity offer via their APIs by default.

When you specify a model variant like gpt-4-turbo-2024-04-09 in an API call, it is (at least in my mind) a valid and logical expectation that this alone should suffice for the model to use its most recent training data and knowledge cutoff, without additional system prompt steering that names gpt-4-turbo-2024-04-09 separately.

APIs like Claude’s and Perplexity’s both seem to use their latest available data when instructed simply to perform as an AI assistant, without further system prompt steering on the API side, whereas gpt-4-turbo-2024-04-09 really seems to require pinpoint identification in the system message or it won’t anchor to its proper cutoff date, and can therefore produce outdated output by default.

Requiring exact model identification in the system message (rather than relying on the selected model’s inherent knowledge parameters, or simply on which model the API call targets) could lead the model to source information from an incorrect or outdated dataset, and that discrepancy might be a contributing factor to the latest checkpoint’s reported sub-par performance.

Ideally, the latest model should anchor to its latest data by default, and that doesn’t seem to be the case at the moment with gpt-4-turbo-2024-04-09. Again, feel free to use the code snippet above to run your own A/B comparisons, e.g. with Q&As about events that took place between April and December 2023, and come to your own conclusions. :slight_smile:
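If you want to automate those A/B runs, here is a minimal sketch using the same OpenAI SDK and model name as the snippet above; the probe questions and the two system-message variants are my own picks, so swap in whatever you want to test:

```python
import os

# Probe questions touching the Apr-Dec 2023 window; replace with your own.
PROBE_QUESTIONS = [
    "Is Cormac McCarthy still alive?",
    "What is your knowledge cutoff date?",
]

# The two system messages being compared: generic vs. model-pinpointing.
SYSTEM_VARIANTS = {
    "generic": "You are an AI assistant based on OpenAI's GPT-4.",
    "pinpointed": "You are gpt-4-turbo-2024-04-09",
}

def build_messages(system_content, question):
    """Assemble the chat payload for one variant of the A/B test."""
    return [
        {"role": "system", "content": system_content},
        {"role": "user", "content": question},
    ]

def run_ab_comparison():
    """Ask every probe question under both system messages and print results."""
    from openai import OpenAI  # requires the `openai` package and an API key
    client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))
    for question in PROBE_QUESTIONS:
        for label, system_content in SYSTEM_VARIANTS.items():
            response = client.chat.completions.create(
                model="gpt-4-turbo-2024-04-09",
                messages=build_messages(system_content, question),
            )
            print(f"--- [{label}] {question}")
            print(response.choices[0].message.content)
```

Call `run_ab_comparison()` with `OPENAI_API_KEY` set; the SDK import is kept inside the function so the helpers can be inspected without the package installed.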