ChatGPT-4 produces very poor and incomplete responses

Thank you for your efforts. Indeed, my code is quite complex and difficult to understand, partly because I had not previously worked with most of the libraries it uses. When integrating DeepL, for example, I found that its language codes differ from those of argostranslate. The handling of the sources is also hard to follow if you don’t know the rest of the code and its use cases: it relies on a langchain feature, RetrievalQAWithSourcesChain. The documents themselves have IDs as names, and based on those IDs I can later insert arbitrarily detailed source references, or none at all if no sources are defined. Admittedly this part doesn’t work very well, because the langchain functionality doesn’t seem to be very deterministic. A rough sketch of the idea follows below.
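To make that more concrete, here is a minimal sketch of both points, assuming the 2023-era langchain import paths and an OpenAI key in the environment. The document IDs, the mapping table, and the question are hypothetical placeholders, not my actual code:

```python
from langchain.chains import RetrievalQAWithSourcesChain
from langchain.chat_models import ChatOpenAI
from langchain.docstore.document import Document
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# DeepL expects uppercase codes such as "EN-US" or "DE", while
# argostranslate uses lowercase ISO 639-1 codes such as "en" or "de",
# so a small mapping table is needed (entries are illustrative):
ARGOS_TO_DEEPL = {"en": "EN-US", "de": "DE", "fr": "FR"}

# Each document carries an ID in its "source" metadata; detailed
# references can be substituted for these IDs later, or omitted
# entirely if no sources are defined.
docs = [
    Document(page_content="Widgets are configured via ...",
             metadata={"source": "doc-001"}),
    Document(page_content="The API expects ...",
             metadata={"source": "doc-002"}),
]

vectorstore = FAISS.from_documents(docs, OpenAIEmbeddings())
chain = RetrievalQAWithSourcesChain.from_chain_type(
    llm=ChatOpenAI(model_name="gpt-4", temperature=0),
    retriever=vectorstore.as_retriever(),
)

result = chain({"question": "How are widgets configured?"})
print(result["answer"])   # the generated answer
print(result["sources"])  # e.g. "doc-001"; not always filled in reliably
```

The last line is exactly where the non-determinism shows up: the model sometimes returns the IDs verbatim, sometimes reformats or drops them.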

However, this shouldn’t be about my project at all; I only added the info to give you some context while reading.

Unfortunately, I don’t have time to look at your tests right now, but I noticed that the API parameters aren’t documented; I assume you are using ChatGPT? Basically, I suspect that better results can be achieved with multiple requests. Maybe one really does need to adjust one’s own prompting.

My main intention was simply to show an example where gpt-4-0314 generates usable code and the current version does not (on the first try). If OpenAI takes feedback from the forum into account (I haven’t found a better place for feedback), I think it’s helpful to show specific examples. I probably should have prepared it better and constructed an easy-to-understand example from it. Still, a concrete example including concrete API parameters is, I think, better than no example. I haven’t seen many concrete code examples on the forum in this regard.

1 Like

I appreciate the original code, even though it is complex and perhaps not in my usual programming style. It served as a starting point for investigating the reported decrease in quality. For my own testing, I used LibreChat, an open-source ChatGPT-style client, and initially adhered to its default settings:

  • Temperature: 1
  • Top-p: 1
  • Frequency and Presence Penalty: 0

In my later tests, I reduced the temperature to 0. These settings are equivalent to the defaults the API applies when no parameters are specified.
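For reference, here is roughly what those settings correspond to as a direct API call. This is a sketch using the 2023-era openai Python package (pre-1.0 interface), with a placeholder prompt:

```python
import openai  # pre-1.0 interface, as used in mid-2023

# Passing explicitly what the API would default to anyway, except
# temperature, which I lowered to 0 for the follow-up tests:
response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Refactor this function ..."}],
    temperature=0,        # API default is 1; 0 for the later tests
    top_p=1,              # default
    frequency_penalty=0,  # default
    presence_penalty=0,   # default
)
print(response["choices"][0]["message"]["content"])
```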

It’s important to remember that GPT’s responses can vary due to the model’s probabilistic nature, which means multiple test runs are needed for a reliable assessment. The notion of a “first try” can be misleading, as both effective and less effective responses can arise by chance, regardless of parameter tuning or model version. Consequently, without seeing the rest of his code or understanding the full context, it’s difficult to judge the validity of the observed decrease in quality. The model could simply have redone it in a way that works but isn’t as obvious as elif statements. A sketch of a repeated-trial setup follows below.
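For anyone who wants to run such a comparison themselves, a minimal sketch might look like this (same pre-1.0 openai interface; the prompt and trial count are arbitrary placeholders):

```python
import openai

N_TRIALS = 5
prompt = [{"role": "user",
           "content": "Rewrite this if/elif chain as a dict dispatch ..."}]

# Request several completions of the same prompt in one call; judging
# quality from a single sample says little given the sampling variance.
response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=prompt,
    temperature=1,  # the default; variance is expected at this setting
    n=N_TRIALS,
)
for i, choice in enumerate(response["choices"]):
    print(f"--- trial {i + 1} ---")
    print(choice["message"]["content"])
```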

1 Like

6/27/23 … & ChatGPT-4 still appears to be functioning in an inferior, post-lobotomy state.

2 Likes

To anyone who was interested in requesting access to a previous version of ChatGPT-4 (e.g., an April version or the May 3 version, etc.), here is the thread that I started for that (the one I mentioned above): Is there any way to access the earlier (smarter) version of ChatGPT-4 (e.g., pre-May 3 update) prior to nerf?

A few folks (@supereric7748, @mokhtarali2001) requested that I share this. Sorry for the late reply. Making noise about this appears to be our only recourse at this juncture, so keep bumping these threads so they stay at the top of this community forum.

2 Likes

I can confirm this, especially with harder scientific journal articles that use very specific terms, jargon, and sentence structures. Sometimes it defaults to the inferior version, and I feel scammed using the “new” ChatGPT-4.

Also, it feels like ChatGPT-4 is becoming lazier. The AI seems to get dumber as the content policy limits everything while pandering to only certain ideologies (try it yourself). Even when I specifically asked for counterarguments to that “certain ideology”, it refused to budge, gave only the answers it BELIEVES TO BE THE TRUTH, and kept resisting GIVING COUNTERARGUMENTS (AND WHAT-IFs) for the sake of “respecting all users”. This is a real bummer, as an AI should be neutral (as it claims) and useful for providing a balanced view (as it also claims), but in reality IT IS NOT, and IT IS GETTING DUMBER for certain topics/niches. For the sake of PC and the “safety” of this forum, I can’t disclose the details, but OpenAI is certainly not “as open” as it claims (it is more closed).

Also, be careful of its hallucinations when it comes to factual information: the output looks plausible and “trustworthy”, dressed up in grandiose words, but IT IS NOT.

1 Like

I keep scrolling through the forum and finding people like you (and me) who are totally disappointed by the fraud OpenAI is playing on its customers. The so-called GPT-4 is dumber than GPT-3.5 was when it was first released. It used to generate high-quality content, and it used to produce content spanning many pages.

Today I don’t even see the “Continue Generating” button. The fake model just spits out something between 600 and 800 tokens, 90% of which is useless, and that’s all.

2 Likes

I have to agree. After using gpt-4 (primarily the code assistant, previously called the code interpreter) for months, today there was an extremely noticeable decline in reliability.

First off, it consistently refuses to return full code even when specifically asked to do so. When advised of this, it apologizes and then does the same thing again, repeatedly.

It’s also starting to come back with syntax errors and lapses in logic, which it never used to do.

Since I can no longer specifically request the code assistant, it’s hard to troubleshoot. But the model is starting to behave a lot like gpt-3.5-turbo-16k, which I abandoned months ago.

The result of all this is that where my coding time used to be reduced thanks to assistance from ChatGPT-4, it is now increasing, with more and more mistakes that have to be corrected.

Here is the conversation: ChatGPT - A ShareGPT conversation

Whose coding job is this AI supposed to eliminate?

You don’t need to read the code, just look at my responses. It constantly comes back with comments like “add logic here” or “use same field mapping as…”, even though I explicitly tell it not to do that and to provide the code itself. It takes several tries, which increases the time it takes to develop the overall system widget. “What part of ‘do not do that’ do you not understand?”

It did not use to do this. It used to respond with the code as requested, and if it did insert comments, it would immediately stop when asked to.

Furthermore, it used to either include error-trapping code or recommend it. Now it has to be asked to include error handling even in places where it should be obvious (as in any database call).

The gpt-4 API still appears to perform well in the applications it is assigned to, but gpt-4 through ChatGPT has seriously declined. No joke.

P.S. I know someone is going to come back with the point that it is trying to conserve tokens. Well, what’s the point of a 100K-token context window if you can’t use even a fraction of it to get a complete answer?

1 Like