I called the latest gpt-4 model, but through asking questions I found that the data training date was inconsistent with the API description

victory_1128 · February 21, 2024, 8:20am

Why is the data date returned by the model inconsistent with the official document?

PaulBellow · February 21, 2024, 10:37am

Hello. This is likely what’s known as hallucination - or the LLM giving wrong information.

_j · February 21, 2024, 12:23pm

I did several probes, and the AI didn’t come up with anything I could find to justify the documentation being advanced beyond April. It might be very niche knowledge like some code libraries or a particular news feed if something was added.

trenton.dambrowitz · February 21, 2024, 1:09pm

To be fair it does say the “training data” is up to Dec 2023, not model knowledge.
Perhaps it’s just chats themselves that were used in the training data, or synthetic data.
Who knows

I agree though, nothing I’ve seen suggests it knows about events or information from after April 2023.

victory_1128 · February 22, 2024, 4:34am

Why is the previous version of the model correct?
Will the hallucination always exist? Or probability?
I asked questions more than once

victory_1128 · February 22, 2024, 4:35am

Why is the previous version of the model (gpt-4-1106-preview) correct in answering the same question?

victory_1128 · February 22, 2024, 4:37am

There are still problems with this statement. Since the date is mentioned in the answer, it means that the correct date should be answered.

Diet · February 22, 2024, 6:04am

This is not something people wanna hear, but it is my opinion that models shouldn’t be asked or used to recall factual information.

As such, the training cutoff is almost completely irrelevant.

I don’t know if there’s anything here to fix, other than taking the information out of the training set completely. “Fixing” these “issues” consistently make the models worse.

If you run the model a bunch of times, you’re gonna get different results:

gpt-4-0125-preview, temp 1, top_p 1

user

when was your training cutoff?

assistant

My training data includes information up until late September 2021. Therefore, any events or developments occurring after that time won’t be reflected in my responses.

My training data was last updated in April 2023.

My training data includes information up until April 2023. Please note that my responses are based on the information available up to that point.

My training data includes information up until September 2021. Therefore, any events, developments, or notable changes occurring after that date would not be reflected in my responses.

so it’s not inconsistent with the documentation, it’s not consistent at all.

victory_1128 · February 22, 2024, 6:13am

Why is the previous version of the model (gpt-4-1106-preview) correct in answering the same question?
You understand the user’s actions or use more random rows.
In the Q&A with the latest date of multiple design training data, model gpt-4-1106-preview’s reply date is consistent with the API

Diet · February 22, 2024, 6:23am

what do you mean? and even if it was, it would just mean that they added it to the training data. if updates to the models are just continuations of previous checkpoints, then they’re just adding new crap on top of old crap, and eventually the data becomes inconsistent. hence it probably shouldn’t have been done in the first place, unless it magically improves model reasoning somehow.

victory_1128 · February 22, 2024, 7:26am

Before gpt-4-0125-preview was released,
At that time, when my user used gpt-4-1106-preview to ask questions about data training time, the date of the reply was consistent with the date in the model gpt-4-1106-preview description!!

Diet · February 22, 2024, 7:53am

your user probably got lucky!

    flowchart

victory_1128 · February 23, 2024, 3:13am

victory_1128 · February 23, 2024, 3:13am

Maybe it’s not just luck

victory_1128 · February 23, 2024, 3:18am

I have developed GPT based on the OpenAI API since the end of February 2023, and it has been running stably on a small scale, so what I said is well-founded, okay?

Topic		Replies	Views
Has the training data cutoff time callback occurred? API gpt-4	2	756	November 26, 2023
ChatGPT API with model gpt-4 is not using GPT4. It's completely different from CHATGPT PLUS GPT4 API	3	1986	December 14, 2023
API model set to GPT-4 but seems to respond as GPT-3? API	14	7389	November 28, 2023
Azure OpenAI API model inaccurate? API gpt-4 , azure	2	539	February 17, 2024
Model Knowledge Cutoff Date (gpt-4-0125-preview) API	6	7234	April 12, 2024

I called the latest gpt-4 model, but through asking questions I found that the data training date was inconsistent with the API description

Related topics