API model set to GPT-4 but seems to respond as GPT-3?

When using the API in Python, I set the model to GPT-4 as shown below:

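    import openai

    # conversation_history is assumed to be a list of user message strings
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            # every entry in the history is sent as a user message
            *[
                {"role": "user", "content": msg}
                for msg in conversation_history
            ],
        ],
        max_tokens=2048,
        n=1,
        temperature=0.8,
    )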

However, when I send a request in Python asking “What gpt model is this?”, this is the response:

I am an AI language model based on OpenAI’s GPT-3.

When I ask the same question on the web version with ChatGPT Plus, I get this response:

I am based on the GPT-4 architecture, which is an advanced version of OpenAI’s language models. The GPT-4 model is designed to generate human-like text and is capable of understanding and responding to a wide range of questions and prompts. However, please note that my knowledge is limited to information available up until September 2021.

I confirmed that I have GPT-4 API access after receiving the invite this morning, and under ‘usage’ in my account profile the requests are shown as using ‘gpt-4-0314’. But as you can see, the responses are far less detailed than on the web version. It feels like it’s not really using GPT-4. Is anyone else having this problem?


I’m having the same issue. The replies say it’s based on GPT-3, but in the API response it says the model is GPT-4
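For anyone who wants to check this programmatically, here is a minimal sketch of reading the model name from the response metadata instead of from the generated text. It assumes the response supports dict-style access and has a top-level `model` field; the helper names and the sample response are made up for illustration.

```python
def served_model(response):
    """Return the model name the API reports for this completion."""
    return response["model"]


def is_gpt4(response):
    """True if the reported model name starts with 'gpt-4'."""
    return served_model(response).startswith("gpt-4")


# Hypothetical response, trimmed to the fields relevant here.
sample_response = {
    "model": "gpt-4-0314",
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "I am an AI language model based on OpenAI's GPT-3.",
            }
        }
    ],
}

print(served_model(sample_response))
print(is_gpt4(sample_response))
```

The point is that the `model` field comes from the API itself, while the reply text is just whatever the model generates about itself, so the two can disagree.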


I was having the same challenge and am seeing similar results.

I have tested both gpt-4 and gpt-4-0314 as the model, and only have the gpt-4-0314 requests show up in my usage.

One interesting observation, though: my tests with ‘gpt-4’ in the model field generated a response to my question (although it told me it was GPT-3), yet they did not generate an entry in the Daily usage breakdown, where I can see gpt-4-0314 appearing.

For me, gpt-4 and gpt-4-0314 make no difference. However, it’s important to know that we are evaluating results from the correct model in order to compare them effectively.


I’ve had a moment of clarity on this. I was looking at it the wrong way.

The appropriate test of whether we have the GPT-4 model should be whether it can handle more than 4,096 tokens.

OpenAI already tell us in the documentation that the model is only trained on data up to Sep 2021. Therefore, from that dataset, it wouldn’t know what GPT-4 is. While there may have been additional fine-tuning to tell it that it is now GPT-4, that would only be a superficial addition to the model.

The advance comes not from the training corpus, but from the analysis of that same data in an enhanced way.

So the available token count should surely be the measure for us at this point in time?

It’s my best assessment at this time, however I’m open to see if this is a useful & accurate measuring stick!
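A rough sketch of how that token-count test could be run is below. It is only an illustration under some assumptions: the helper names are made up, the one-token-per-"hello " estimate is a guess rather than a real tokenizer count, and `create` is whatever function sends the request (for example `openai.ChatCompletion.create`), passed in so the logic can be tried without calling the API.

```python
def build_long_prompt(approx_tokens):
    """Build filler text of roughly `approx_tokens` tokens.

    Assumption: each repetition of "hello " is about one token.
    """
    return "hello " * approx_tokens


def accepts_long_context(create, model="gpt-4", approx_tokens=5000):
    """Return True if `create` accepts a prompt above the 4,096-token limit."""
    try:
        create(
            model=model,
            messages=[
                {"role": "user", "content": build_long_prompt(approx_tokens)}
            ],
            max_tokens=1,  # keep the completion tiny, only the prompt size matters
        )
    except Exception:
        # Any failure counts as "rejected" here, so a network or auth error
        # would also return False. Inspect the error before drawing conclusions.
        return False
    return True
```

With the Python library used earlier in this thread it would be called as `accepts_long_context(openai.ChatCompletion.create)`. A model limited to 4,096 tokens should reject the request, while one with a larger context window should accept it.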


I’m having the exact same issue. I set the max_tokens to 8096 and it does go through, but I really would like to see if there’s another way to test.
