For an engine I recently added to my application, I am seeing an oddity that looks like a mapping bug.
I am sending a prompt that requests the gpt-4o-mini model, and I am asking the question below.
The ChatResponse says the actual ModelId is “gpt-4o-mini-2024-07-18”
Here is the UserPrompt
“What is your training data last day? What is your model Id?”
The Assistant responds with,
“My training data includes information up until September 2021, and I don’t have a specific model ID to provide. I’m based on OpenAI’s GPT-3 model. If you have any specific questions or need assistance, feel free to ask”
I am pretty sure the mini model should be trained on data later than 2021, so given this response it appears the wrong engine is replying.
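For reference, a minimal sketch of the kind of call and metadata check being described, using the OpenAI Python SDK (the application above uses a different client, but the returned model field carries the same information):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # the alias being requested
    messages=[
        {"role": "user", "content": "What is your training data last day? What is your model Id?"},
    ],
)

# The response metadata names the exact snapshot that served the request,
# e.g. "gpt-4o-mini-2024-07-18" when the "gpt-4o-mini" alias is requested.
print(response.model)
print(response.choices[0].message.content)
```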
The AI is simply responding from its pretraining and guessing; it has not had information injected into the context to alter that behavior - which is your job.
Your system message should give the AI application you create on the API its identity and capabilities.
The API response metadata shows the model actually being used: either the exact model you selected, or the model served when you request a simpler alias.
Even a more practical AI model does not escape this answering bias on its own.
I don’t think you understand the problem.
The engines might be evasive, or they might answer with a date, but the answers are fairly consistent from engine to engine. And when they do provide a training date, it is usually correct.
OpenAI language models are post-trained with RLHF, which uses past conversations voted as high quality to reinforce how the model answers. That data may contain many answers stating a GPT-3.x model's knowledge cutoff.
Additionally, there is the world knowledge the AI may have learned, and the identity it may assume. That is knowledge you could supply yourself and then ask about directly, instead:
Provide a system message “You are AnswerBot, by Talldaniel. Pretrained knowledge cutoff: 2023-09”, then see your results.
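A minimal sketch of that suggestion, assuming the OpenAI Python SDK and the Chat Completions endpoint (the system message text is the one proposed above):

```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        # Identity and cutoff are supplied by you, not left for the model to guess
        {"role": "system", "content": "You are AnswerBot, by Talldaniel. Pretrained knowledge cutoff: 2023-09"},
        {"role": "user", "content": "What is your training data last day? What is your model Id?"},
    ],
)
print(response.choices[0].message.content)
```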
I asked this question,
Tell me something that happened in the year 2023 and the month November in politics.
and the response, from the engine gpt-4o-2024-08-06 (which should be good through October 2023), was:
I’m sorry, but I don’t have any information on events that occurred in November 2023, as my training only includes data up until October 2023.
So, if it is not a bug, then I will switch to asking questions about events until the model can't answer, to make sure I have the engine I think I do.
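As a rough sketch of what that probing could look like (the month list and prompt wording are just illustrative; as noted above, the model's self-reported answers are not a reliable way to confirm which model is serving you, while the response metadata is):

```python
from openai import OpenAI

client = OpenAI()

# Ask about successive months and watch where the refusals start.
for month in ("August 2023", "September 2023", "October 2023", "November 2023"):
    response = client.chat.completions.create(
        model="gpt-4o-2024-08-06",
        messages=[
            {"role": "user", "content": f"Tell me something that happened in politics in {month}."},
        ],
    )
    print(month, "->", response.choices[0].message.content[:120])
```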
The model is not smarter; you are simply being treated as a poor API programmer who needs the AI model to be able to answer this question.
If you add capabilities of your own to the AI, this extra message that OpenAI inserts into your system message can produce refusals instead of answers drawn from your own newer information.
The model gpt-4o-mini was trained on data up to 2023.
GPT models excel at tasks involving natural-language skills, and they are great at instruction following and retrieval from given context. That is why, when using these models to look up factual data, it is recommended to connect them to a source such as a search API that fetches factual information.
Having said that, you can get the model to give you events from before Oct 2023 by being specific about the time period you want the events from.
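A hedged sketch of what connecting the model to a search source might look like via the Chat Completions tools parameter. The "search_news" function here is hypothetical, standing in for whatever search API you actually wire up; the same request without tools, but with an explicit date range in the prompt, matches the "be specific about the timing" suggestion:

```python
from openai import OpenAI

client = OpenAI()

# "search_news" is a hypothetical tool name; it stands in for whatever
# search backend the application actually connects to.
tools = [
    {
        "type": "function",
        "function": {
            "name": "search_news",
            "description": "Search a news index for events in a given month.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {"type": "string"},
                    "month": {"type": "string", "description": "e.g. 2023-11"},
                },
                "required": ["query", "month"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "What happened in politics in November 2023?"},
    ],
    tools=tools,
)

# Instead of guessing from pretraining, the model can request the tool;
# the application runs the search and returns results in a follow-up message.
print(response.choices[0].message.tool_calls)
```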