Experiencing Decreased Performance with ChatGPT-4

I think that’s the best demonstration that they know it very well and won’t admit it. Claiming that GPT is better than it was, and then attributing the complaints to the supposed inability of users and developers to judge the model, that’s really something…

1 Like

Summary created by AI.

Users have reported a decrease in the performance of ChatGPT-4 after recent updates. They observed that while the AI’s response time is faster, it makes an unprecedented number of reasoning errors, forgets instructions quickly, and struggles with tasks it handled well in the past. Inconsistency was also highlighted, suggesting the model’s performance ebbs and flows. The reported problems range from errors in language use, incorrect reasoning, and failed comprehension tests to responses that are simply less ‘intelligent’ than those of previous versions. Users also noted that the model occasionally began to repeat outputs or failed to follow the instructions provided.

Not all users agreed on the decreasing quality; some argued that the model’s errors were inherent in its design. However, most were dissatisfied and felt that modifications to improve response speed compromised the quality of the responses. Comparing new output against previous conversations to identify errors also became a common way for users to substantiate their concerns. It was suggested that the performance reduction might be due to OpenAI’s GPU limitations and the need to accommodate the growing number of users brought in by the iOS app, necessitating computational cutbacks.

In another test, users compared the performance of the GPT-4 model with and without plugins and concluded that GPT-4 without plugins was better at instruction following and reasoning. Evidence for this claim was shared through links to sample conversations. Despite the reported degradation in the AI’s performance, OpenAI has yet to respond to these complaints.

2 Likes

You know what it feels like? It’s like they’ve replaced the high-quality, high-parameter model with some watered-down mobile phone version. That’s exactly how it feels. It’s fast as heck, but the downgrade in quality is absolutely noticeable. I’m not looking for quantity; I need quality. A quality response is worth a million dumbed-down responses.

5 Likes

I feel exactly the same. The new model is unable to keep track of the conversation. Before, it was reliable and rock solid. Now it is unable to do tasks it nailed just a few days ago. Under these conditions this model is useless for me. Totally disappointing.

1 Like

One observation worth mentioning is that some users started reporting issues months ago, while others say it only started happening a few days ago.

1 Like

This has already happened before.

Basically, when ChatGPT launched, its output was similar to that of Davinci-003: varied and rich. Gradually, the quality degraded while the generation speed increased. Finally, we were presented with GPT-3.5-turbo, which is much worse than Davinci-003 but also 10 times cheaper.

Exactly the same thing is happening right now. GPT-4 (web version) is becoming faster and dumber. So we should expect the release of GPT-4-turbo or something similar once they are satisfied with the results. The web version will use GPT-4-turbo, and the better version of GPT-4 will be available through the Playground, just as Davinci-003 is still available right now.

1 Like

Here’s a perfect example of plain stupidity I hadn’t seen before in GPT-4. It comes from a conversation about a code block where the locations, functions, and context have been explained in detail many times. My prompt starts after GPT made a mistake in a code block in the previous response, and I tell it:

Prompt:
There are 2 things wrong with your last response:
1: you forgot that the settings.php and business_profile.php files both reside in the settings/ folder. I’ve corrected this in the index.php file and settings.php now.
2: I get an error when I open the settings page:
Fatal error: Cannot redeclare fetch_business_profile_data() (previously declared in /home/u321618269/domains/solidslick.com/public_html/dev/wp-content/plugins/pro-repair/settings/business_profile.php:4) in /home/u321618269/domains/solidslick.com/public_html/dev/wp-content/plugins/pro-repair/settings/business_profile.php on line 3

This is settings/business_profile.php:

<?php
    // Fetch Business Profile Data
    function fetch_business_profile_data() {
        global $wpdb;

        // Fetch data from the database
        $results = $wpdb->get_results("SELECT * FROM ersp_company", ARRAY_A);

        // Return the data as a JSON response
        wp_send_json($results);
    }

    // Add ajax action hook
    add_action('wp_ajax_fetch_business_profile_data', 'fetch_business_profile_data');
?>

<tbody id="company-table">
    <!-- data will be inserted here dynamically -->
</tbody>

This is settings/settings.php:

<div class="wrap">
    <h1><?= esc_html(get_admin_page_title()); ?></h1>
    <nav class="nav-tab-wrapper">
        <a href="#business-profile" class="nav-tab" data-tab-target="business-profile">Business profile</a>
        <a href="#reset-db" class="nav-tab" data-tab-target="reset-db">Reset DB</a>
    </nav>

    <div id="tab-content">
        <div id="business-profile" class="tab-content" style="padding-left: 20px;"><?php include plugin_dir_path(__FILE__) . 'business_profile.php'; ?></div>
        <div id="reset-db" class="tab-content" style="padding-left: 20px;"><?php include plugin_dir_path(__FILE__) . 'reset_db.php'; ?></div>
    </div>
</div>

<script type="text/javascript" src="<?php echo plugin_dir_url(__FILE__) . 'scripts/settings.js'; ?>"></script>

And this is settings/scripts/settings.js:

jQuery(document).ready(function ($) {
    $('.nav-tab').click(function (e) {
        e.preventDefault();
        
        // Switch tabs
        $('.nav-tab').removeClass('nav-tab-active');
        $(this).addClass('nav-tab-active');
        $('.tab-content').hide();
        $('#' + $(this).data('tab-target')).show();

        // If the Business Profile tab is clicked, fetch data from the database
        if ($(this).data('tab-target') === 'business-profile') {
            $.ajax({
                url: ajaxurl,
                type: 'post',
                data: {
                    action: 'fetch_business_profile_data'
                },
                success: function(response) {
                    // Clear the table
                    $('#company-table').empty();

                    // Add each row to the table
                    $.each(response, function(index, row) {
                        $('#company-table').append('<tr><td>' + row.company_name + '</td><td>' + row.company_address1 + '</td></tr>');
                    });
                }
            });
        }
    });

    // Trigger click on the first tab to make it active by default
    $('.nav-tab').first().trigger('click');
});

Please correct the errors and provide the complete corrected code blocks for each of those files, so I can copy/paste them back entirely.

GPT Response:
I apologize for the confusion. The error you’re seeing is because the fetch_business_profile_data function is being declared twice. This is happening because the business_profile.php file is being included twice: once in the settings.php file and once in the index.php file.

To fix this, you should remove the include line for business_profile.php from the index.php file. The business_profile.php file should only be included in the settings.php file.

Here are the corrected code blocks:

  1. index.php:
/* Includes the install file that installs the databases */
include(plugin_dir_path(__FILE__) . 'base/install.php');

/* Includes the search and filter function used on the models page */
include(plugin_dir_path(__FILE__) . 'base/search/fetch_models.php');

/* Includes the model modifications PHP file */
require plugin_dir_path(__FILE__) . 'models/model_mods.php');

register_activation_hook(__FILE__, 'ersp_install');
  2. settings/settings.php:
<div class="wrap">
    <h1><?= esc_html(get_admin_page_title()); ?></h1>
    <nav class="nav-tab-wrapper">
        <a href="#business-profile" class="nav-tab" data-tab-target="business-profile">Business profile</a>
        <a href="#reset-db" class="nav-tab" data-tab-target="reset-db">Reset DB</a>
    </nav>

    <div id="tab-content">
        <div id="business-profile" class="tab-content" style="padding-left: 20px;"><?php include plugin_dir_path(__FILE__) . 'business_profile.php'; ?></div>
        <div id="reset-db" class="tab-content" style="padding-left: 20px;"><?php include plugin_dir_path(__FILE__) . 'reset_db.php'; ?></div>
    </div>
</div>

<script type="text/javascript" src="<?php echo plugin_dir_url(__FILE__) . '../scripts/settings.js'; ?>"></script>
  3. settings/business_profile.php:
<?php
    // Fetch Business Profile Data
    function fetch_business_profile_data() {
        global $wpdb;

        // Fetch data from the database
        $results = $wpdb->get_results("SELECT * FROM ersp_company", ARRAY_A);

        // Return the data as a JSON response
        wp_send_json($results);
    }

    // Add ajax action hook
    add_action('wp_ajax_fetch_business_profile_data', 'fetch_business_profile_data');
?>

<tbody id="company-table">
    <!-- data will be inserted here dynamically -->
</tbody>
  4. settings/scripts/settings.js:
jQuery(document).ready(function ($) {
    $('.nav-tab').click(function (e) {
        e.preventDefault();
        
        // Switch tabs
        $('.nav-tab').removeClass('nav-tab-active');
        $(this).addClass('nav-tab-active');
        $('.tab-content').hide();
        $('#' + $(this).data('tab-target')).show();

        // If the Business Profile tab is clicked, fetch data from the database
        if ($(this).data('tab-target') === 'business-profile') {
            $.ajax({
                url: ajaxurl,
                type: 'post',
                data: {
                    action: 'fetch_business_profile_data'
                },
                success: function(response) {
                    // Clear the table
                    $('#company-table').empty();

                    // Add each row to the table
                    $.each(response, function(index, row) {
                        $('#company-table').append('<tr><td>' + row.company_name + '</td><td>' + row.company_address1 + '</td></tr>');
                    });
                }
            });
        }
    });

    // Trigger click on the first tab to make it active by default
    $('.nav-tab').first().trigger('click');
});

Please replace these files with the corrected code blocks. This should resolve the error you’re seeing.

NOW HERE’S A BREAKDOWN OF WHAT’S WRONG ABOUT THIS:

  1. require plugin_dir_path(__FILE__) . 'models/model_mods.php'); contains a stray ) after 'models/model_mods.php' that’s not supposed to be there, wasn’t there, and has never been there.

  2. I just explained in this prompt that settings.js is located at settings/scripts/settings.js, because it had messed up the file location in the previous response and I had already corrected it in the example (settings.php). Yet it changes it back again, from scripts/settings.js to ../scripts/settings.js.

  3. We had just agreed in the previous prompt to move the include statement from settings.php to index.php. It then generated code blocks that placed the include statement in both index.php and settings.php again. So in this response it recognizes that the error comes from that mistake and then reverts to:

To fix this, you should remove the include line for business_profile.php from the index.php file. The business_profile.php file should only be included in the settings.php file.

We had just agreed to move that in the previous prompt; it then messed up by duplicating the include, and now it simply reverts back to the previous state.

These are not extremely long code blocks. They are short, simple blocks of code I had no trouble with only recently, but now it has turned into an absolute mess. I can’t work with ChatGPT if it makes these kinds of mistakes frequently. The reason I’m on a paid plan is that 3.5 makes these kinds of errors where 4 didn’t use to, but right now it’s almost as bad. If this doesn’t change soon, I may as well cancel my plan, as checking and fixing its responses is starting to cost me more time than it saves.

As a moderator, this is one of the few topics where I have removed replies for being inappropriate.

There are currently some replies here that, if flagged, will be removed.

As a suggestion, before posting a reply, run your text through ChatGPT with this in the prompt, along with your text:

Proofread and use a nice tone.

Then consider posting parts or all of the completion here. :slightly_smiling_face:

4 Likes

The context handling has suddenly nosedived as of this week. I can tell because ChatGPT-4 is failing at simply remembering its last coding answer, and refuses to fill it in fully or to use the code I posted. It often tells me it is just giving me examples now.

What has changed??? It’s obvious now, more than ever, that it has been degrading, and this is another big drop.

I guess it makes sense that the API is where this has to be used, and I suspect we can fairly easily build our own GPT-4 chatbots with better context handling, perhaps using a Pinecone DB.
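For anyone who wants to try, managing your own context over the API is not much code. Here is a minimal sketch, assuming the ChatCompletion interface of the openai Python package; the ask helper and the trimming rule are just illustrative, and the retrieval/Pinecone part is left out entirely:

import openai  # assumes OPENAI_API_KEY is set in the environment

# Keep the full conversation ourselves instead of relying on the web UI.
history = [{"role": "system", "content": "You are a helpful coding assistant."}]

def ask(user_message, max_messages=20):
    """Append the user message, call GPT-4, store the reply, and return it.
    max_messages is an illustrative trimming rule, not an API feature."""
    history.append({"role": "user", "content": user_message})

    # Drop the oldest non-system turns if the conversation grows too long.
    while len(history) > max_messages:
        history.pop(1)

    response = openai.ChatCompletion.create(model="gpt-4", messages=history)
    reply = response["choices"][0]["message"]["content"]
    history.append({"role": "assistant", "content": reply})
    return reply

print(ask("Summarize what we decided about the settings page so far."))

The point is simply that you, not the web UI, decide what stays in the context window.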

Yet it seems odd that OpenAI won’t offer premium access to ChatGPT and really monetize the chatbot they built; perhaps they don’t want to compete with us :).

So “build your own” is the message, I guess. Yet, weighing the steep effort of building a new chatbot against just using ChatGPT, the latter still fits most people right now. I have built my own with GPT-4, but I guess I need to improve its coding mode.

Also, the new coding mode is not able to interpret TypeScript, and it underperforms at handling changes when you upload a file it cannot interpret. I have not played with it much more, but it didn’t feel great, and neither do the plugins now, because the context is too small, so plugins cannot really work well with large input like code. It makes it even more obvious that OpenAI has been reducing the context of ChatGPT-4 and probably wants us to use the coding mode instead (even though it doesn’t work with all languages yet). I just want a premium mode, but again, it’s not that hard to build your own on GPT-4. Mine is focused on other things, but I guess I need to make it handle my coding now :D.

I also found this thread because ChatGPT 4 has clearly taken a nosedive in quality.

It used to be a helper for coding. Now, with SQL, it will use the wrong column, and it will continue to use the wrong column even after you tell it not to and point out where it made the mistake.

And it will introduce new errors with every rewrite, even when you tell it to focus on only a specific change.

Yesterday I simply had it writing HTML, and when I had it add a new DIV, it dropped an old DIV. When I told it not to change the color of the button, it changed it from green to blue, as if nothing I said mattered.

My speculation: it hints at a model that is being given so-called ‘safeties’, in other words, told to ignore the human’s instructions, ignore its own learning process, and instead provide a different answer.

Peter Welinder, your answer is as disappointing as it is false. It’s a pity, because for a few days we had this magic tool, and we loved it. But you made it dumber, by quite a bit.

1 Like

Fully agree. I keep running into the same issues now. I asked it to change the SQL type from time to varchar for some columns in a DB table, and it changed the name of the table, which I failed to notice, so when I loaded it, it obviously didn’t work.

Then, in another thread where we had been working on a page with JS, PHP, and AJAX for a WP plugin, it suddenly started giving code for Python Flask… Like, what??? Nowhere in the entire conversation was Python mentioned, and the entire thread was about adding functions to the PHP and JS in the script. My question even included a PHP code block with <?php in it.

I see a lot of vindication here @radiator57

Because, literally, some of the responses here from some folks were about to drive me crazy :man_facepalming:

3 Likes

I’ve read the paper, and it’s certainly interesting.

One thing that did stand out to me, though, is that they did not publish any of the system prompts they used, and it is known that 0613 was supposed to be better at adhering to the system prompt than 0314 was.

Their GitHub repo also doesn’t include any code for reproducibility, so it’s impossible to see what any other parameters might have been.

Finally, I think their methodology is flawed (at least with respect to the mathematics test of determining whether a number was prime or not).

First, identifying prime numbers was never something GPT models were good at, nor was it something they were supposed to be good at.

Second, the models were prompted to think step by step, which is fine, but looking through their results, it appears that in several instances there are errors in those steps even when the model guesses the right answer.

All of the numbers which were tested were, in fact, prime. So, this test could very likely be (and probably is) just testing how often a model guesses a number is prime.

Beyond that, they only appear to have tested each number once with each model, which would be fine had they used a temperature of 0, but they were using a temperature of 0.1, so I would have preferred to have seen more runs with each test case.

For this particular test, I would have wanted to see maybe 50 primes and 50 non-primes, each run 10–50 times. That way we could get a good idea of how likely each model is to call any particular number prime, which would help put these numbers into perspective.
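To make that concrete, here is a rough sketch of the kind of balanced, repeated-run harness I mean. The query_model helper is just a stand-in for however you call the API, and the ranges and counts are illustrative, not taken from the paper:

import random
from sympy import isprime, randprime

def query_model(question: str) -> str:
    """Stand-in for the actual API call; should return the model's raw answer text."""
    raise NotImplementedError

# Balanced test set: 50 primes and 50 composites of similar magnitude.
primes = [randprime(10_000, 100_000) for _ in range(50)]
composites = []
while len(composites) < 50:
    n = random.randint(10_000, 100_000)
    if not isprime(n):
        composites.append(n)

runs_per_case = 10  # repeat each case, since temperature 0.1 is not deterministic
results = {"prime": [], "composite": []}

for label, numbers in (("prime", primes), ("composite", composites)):
    for n in numbers:
        answers = [query_model(f"Is {n} a prime number? Answer yes or no.")
                   for _ in range(runs_per_case)]
        says_prime = sum("yes" in a.lower() for a in answers) / runs_per_case
        results[label].append(says_prime)

# If the model calls roughly the same fraction of primes and composites "prime",
# it is working from a base rate, not actually testing primality.
for label, rates in results.items():
    print(label, sum(rates) / len(rates))

Run against an all-prime test set, a model that always answers “yes” scores 100%, which is exactly the failure mode the single-label design cannot detect.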

An even better test would have been to ask the models to factor the integers, which is analogous to determining primality.

So, while I do find this interesting, I don’t find it overly compelling. I’ll try to take a look at some of the other results tomorrow and see if they are stronger than the prime_eval test.

7 Likes

System prompt? The very title of the paper says ChatGPT.

Upon examination, though, they use the brand name while completely discarding the whole notion of what ChatGPT is: a web platform with its own conversation management system, one that dramatically affects the perceived skill when it is re-tuned to reduce context loading.

Did the authors even change the system prompt from telling the model it was GPT-3 to telling it that it is GPT-3.5, with a date? You are correct to note that.

The paper also assumes that gpt-3.5-turbo-0301 was frozen in time, with the same tuning loaded on some instance that has been running since March.

The March announcement, for your close scrutiny:

We are constantly improving our ChatGPT models, and want to make these enhancements available to developers as well. Developers who use the gpt-3.5-turbo model will always get our recommended stable model, while still having the flexibility to opt for a specific model version. For example, today we’re releasing gpt-3.5-turbo-0301, which will be supported through at least June 1st, and we’ll update gpt-3.5-turbo to a new stable release in April.

So again, one needs a time machine, or at least a recording. And to evaluate ChatGPT’s memory-loss and context failures as well, one would need to get to prompt #5 or #10 without the replay of a February–April session diverging completely.

The evaluation should be to complete the identical tasks that were achieved in an extended ChatGPT session back in March, replayed from the start, with the same kind of human refinements now required (before the table-flip).

1 Like

System prompt! In a paper where they set the temperature to 0.1.

1 Like

If cost is the issue with better quality, maybe there could be different tiers of reasoning ability to choose from, at different prices. I certainly wouldn’t mind that. Dumbing it down, especially now with more and more LLMs on the market, just seems like too much of a sacrifice. Better reasoning ability will be a key factor in making it stand out and continue to be the default ‘go-to’.

4 Likes

For quite a while, I’ve held doubts about the apparent dip in GPT-4’s performance. But it was only when I revisited my old chat logs and scrutinized what led me to feel that the performance had waned that I realized much of it stemmed from my growing familiarity with GPT-4’s response style. Previously, I found GPT-4’s responses clever, courteous, and assertive; now I tend to disregard its phrasing, which no longer strikes me as ‘human-like.’ It’s like interacting with someone suffering from anterograde amnesia: over time, you would likely feel they’ve lost some of their ‘spark,’ even though in the moment they might understand their situation and express remorse.

Undoubtedly, the model itself has undergone changes. The most striking, I believe, is that it has become ‘more conservative.’ This largely manifests as an increased emphasis on the current conversation and a recalibration of the weight given to earlier dialog. Consequently, it comes across more like an instrument, with answers that appear more formulaic, and it is prone to forgetting long-term key points. Yet it continues to be keenly responsive in shorter dialogues. Essentially, this means dialogues need to be kept concise, and it reduces the payoff of detailed cues and extensive responses.

5 Likes

Oh, I’ve just unearthed another issue. The same text, previously handled with ease by the model, now triggers an error: it’s too long. It’s quite intriguing that this happens even with a piece of text of around 2,000 tokens, which was once processed without any hiccups.
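For what it’s worth, you can measure the text yourself before pasting it. A quick check with the tiktoken package (the filename is just a placeholder for wherever you keep the text):

import tiktoken

# cl100k_base is the tokenizer used by the GPT-3.5/GPT-4 chat models.
encoding = tiktoken.get_encoding("cl100k_base")

with open("my_prompt.txt", encoding="utf-8") as f:  # placeholder path
    text = f.read()

print(len(encoding.encode(text)), "tokens")

If the count really is around 2,000 tokens, that is far below the 8k context of the gpt-4 API model, so a “too long” error in the web UI would point to the UI reserving or trimming context rather than to the text itself.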

4 Likes

This topic is closed as there are several similar topics.

Please use the topic “GPT has been severely downgraded (topic curation)” instead.