I’ve seen a bunch of posts since as early as January about the API and the Playground returning different results (and experienced this myself, frustrating). AFAIK no one has been able to explain or resolve the differences. The API seems more generic and less “in-character” (stuck in “helpful assistant” mode maybe?)
Until recently, the HTTP request the Playground made was exactly the same as any other request. Now it looks like Davinci is called through a different endpoint, but Chat Completions is the same (aside from a different authorization key)
You can view it in the network tab. It’s not made by the server, it’s made by your computer.
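Since the request is issued from the browser, anyone can copy it from the network tab and replay it outside the Playground to compare results. A minimal sketch of rebuilding such a request in Python (the API key, model name, and parameter values are illustrative placeholders, not values verified from the Playground):

```python
import json
import urllib.request

def build_playground_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Recreate the kind of request the Playground's browser tab sends.

    The endpoint matches the public completions URL; the body parameters
    here are illustrative defaults, not confirmed Playground settings.
    """
    body = {
        "model": "text-davinci-003",  # placeholder model name
        "prompt": prompt,
        "temperature": 0.7,
        "max_tokens": 256,
    }
    return urllib.request.Request(
        "https://api.openai.com/v1/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Replaying the copied request this way and diffing the response against the Playground's output is the most direct way to check whether the two actually behave differently.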
In my experience here, the majority of discrepancies are caused by incorrect formatting or just plain wrong parameter settings…
Interesting! Davinci was the one that was behaving strangely. I don’t think it would be incorrect formatting or wrong parameters in my case; I just copy/pasted the code it generates for my requests.
I’m with you on the difference between playground and API.
It’s very frustrating to use the same variables, same system message, and same prompt, and get completely different results from the API versus the playground.
And the gpt-turbo API will leave its role and switch speakers WAY more easily than the playground. Watch how fast the API switches speakers on this app: LMS - HR Interview
You may have not noticed, but this is a community for software developers who actually write software using the OpenAI API.
I have read many of your posts and they are well written, well articulated but often (not always) what we used to say in engineering school “hand waving”.
You @RonaldGRuckus are intelligent and talk a lot by expressing an opinion, but according to a review of your history here, you have not written a line of your own OpenAI API code or a single application using the OpenAI API.
On the other hand, I have written three OpenAI API applications, including one application which exercises the entire OpenAI API and is well known here as a “standard in helping others”. I help users here with facts and actual tests for their benefit (at my expense), not opinions.
My “lab” application has two, not one, fully operational “chatbot” apps: one which uses the completion API and one which uses the (new) chat completion API, and it has been used countless times here for its intended purpose, to help other developers.
In addition, that “lab” I coded makes the Playground look like “child’s play” (based on functionality), so for you to attempt to insult me, a guy who has coded every line of a much bigger app than the playground, when you have not, is mind-boggling.
The key discomfort that I have with your posts @RonaldGRuckus is that, for someone who has not demonstrated the software engineering experience with the OpenAI API that many of us here have, including hands-on experience coding apps with the OpenAI API offering, your replies come across as knowledgeable and accurate when often they are not.
It’s kinda like ChatGPT, who provides elegant sounding replies which are technically flawed, but novices and “non coders” think that ChatGPT is helping them, when in fact ChatGPT is holding them back with misinformation; when they would have been better off to simply read the docs.
For folks like @AgusPG, @curt.kennedy and myself who actually know this stuff “inside and out” because we actually code it, some of us can easily see the “hand waving” nature of your replies and the lack of depth of software engineering experience with the OpenAI API. The problem is that when we try to correct you, you begin to go down the road of insults and offer up more of your “opinions” which are not based on fact, testing, or actual coding experience with the API.
Furthermore, when challenged, because you do not have this depth of OpenAI API coding experience (as demonstrated in your posts, where you have not offered any code block you have written yourself or any application using the API which you actually coded, to my knowledge), and you do not post live tests to assist users where you actually performed a test and posted the results to confirm your “theories”, you edge toward insulting those who challenge you, because you cannot offer valid proof of your “guesses” and “I think this” and “I think that” opinions.
So, if you want to start insulting me or @AgusPG because we, as real coders, do not agree with you, then go ahead; but if you continue, I am going to reply by posting exactly what I think of your misinformation posting here. We do not want to see novice coders here coding off misinformation and “guesses” and “well, maybe this is what I think…” We want facts based on hands-on coding with the OpenAI API, which it appears to many of us that you do not have.
Yes, you can easily convince novice coders and beginners that you are very experienced, because you write well and are intelligent. However, you have not demonstrated, to me and others, that you actually have OpenAI API coding experience with any degree of depth. You come across as a non-developer who has a lot of opinions, to be honest.
In my view, you are more like an intelligent, passionate retail consumer ChatGPT or Playground user who posts some, not all, misinformation, because you have not actually written any software of any depth against the OpenAI API to give you true “depth of knowledge”.
I agree with everything @AgusPG has posted in reply to you and disagree with your replies to him, because I know for a fact that @AgusPG is a wise coder who actually codes with the OpenAI API now; his posts are always full of good, solid information, where your posts are intelligently written but often laced with bits of misinformation here and there, presented as fact.
You are arguing with two very experienced OpenAI API coders, @AgusPG and myself, and because you are not on the right side of the facts, you are starting to go down the “denigrate others” route. I advise you to stick with facts, not opinions and guesses, and not go down the path of insulting others in your discussions.
I understand your viewpoint.
I am not a novice coder. I’m not going to display my trophies as a way to persuade you. There’s absolutely no way that what I’ve discussed has shown the opposite. I’m proficient in many languages, and am very passionate about what I do.
How you feel about me is perfectly fine. I have no issue with it. I hope that my future posts prove the opposite - although I would have thought that they already have.
I do have an issue with your persistence to be correct in a topic that doesn’t exist.
Playground models are the exact same models that we use. That has been the whole argument.
You seem to think that I’m not aware that the playground by itself is a whole level of coding. That’s not what I’m arguing, and I don’t understand why you keep insisting that it is. The playground is a GUI for the models. This is not even comparable. The calls one makes are made on the computer, not the server. The models are the same.
Can we please kill this thread now.
Yes that is not what I said.
I said clearly, many times, that the playground has user session management, dialog management, and perhaps additional moderation filtering code before the prompts ever see the model.
This you do not understand @RonaldGRuckus
The issue, as I said in my last reply (to your seeming insult), is that you do not have experience coding these features with the OpenAI API, so this is why you do not understand.
You @RonaldGRuckus are posting misinformation again. Never did I say the “models were different”. You simply “made this up” in your attempt to discredit.
I am so sorry to tell you again.
It does not have dialog management. It outputs the raw response from the model.
The user session is stored locally.
And yes, it goes through a moderation endpoint, just like any other. There is no “perhaps” if you just bothered to look at the network logs.
You are just assuming that the API (which is not an application) and the Playground (which is an application) are the same.
The Playground is not the same as the API.
No, I’m not. I never was. This makes no sense at all. This was your quote.
Based off of this initial comment:
Inconsistencies between playground and API. I’m unable to reproduce the same (awesome) results that I get in the playground when I use the API and complex prompt-engineering is involved.
For some reason you just can’t understand that. It’s fine. My argument is that the playground does not output differently. It’s so obvious that the API and the playground aren’t the same thing, but you just won’t stop trying to argue it.
You barged into a discussion with the completely wrong impression, and seem so fixated on it. I don’t really get it, but whatever.
Of course there is dialog management.
That is why you can maintain a dialog in the playground.
Without dialog management there would be no conversation like we see in the playground.
Maybe you are lumping dialog management into user session management?
Because when we code this, which you have already admitted you have not, they are two separate processes.
I have never “admitted” to not coding anything. I really don’t know what you’re trying to do here. But it’s getting to the edge of obsessive.
In the future, please, be critical and point out my errors. For now, I agree that Playground and the API are not the same. However, the playground (until recently) uses the same API that we do. I’m sure you can agree with that.
You are trying to be insulting again @RonaldGRuckus
Stick to facts, not insults, please.
Yes, we agree the API and playground are not the same, but that is NOT what you said when we “barged in” (another one of your insults).
You said they were the same and that is why @AgusPG and I “barged in” and corrected this misinformation.
Glad you agree they are NOT the same.
We will “barge in” whenever we see misinformation posted here as is our right as developers in a community of developers.
Numerous times I said this, and yet you never responded
My point is that the playground uses the same API as we do, not that it’s the same service. I’m fully aware that it’s an interface so we don’t have to build our own.
The only “barging” was you. AgusPG was the OP…
You are completely insulting me, and my passions. It’s so disrespectful the fact that you can try and act defensive as well is just gross. But not surprising.
My comment was directly related to Agus originally stating that “The playground (models) outputs better results”.
It is definitely your right to say whatever you want. However perhaps you should think before insulting something that someone is completely passionate in. I spend all my non-working hours developing pipelines, and trying out new things. If I’m not doing that, I’m here trying to help and also learn.
So, now at least we can agree on such a ridiculous statement.
That’s all that matters. Good night @ruby_coder…
Just to summarize this, can I say that the responses from the Playground and the API could potentially be different, and that this has nothing to do with the temperature (which obviously brings in randomness)?
Which also leads me to think the API is getting a little unstable…
Honestly. It would defeat the purpose of a playground to serve different results, but I seem to be alone in this thought.
Being slightly more technical, it seems like recently they do use a different endpoint for completions:
https://api.openai.com/v1/completions (What we use)
But the Chat Completions uses the same endpoint.
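To rule out sampling randomness when comparing the two endpoints against the Playground, one can pin every sampling parameter. A small sketch of building such a payload (the model name is a placeholder, and the endpoint constants are the URLs discussed above):

```python
import json

# The two endpoints discussed above; only the path differs.
COMPLETIONS_URL = "https://api.openai.com/v1/completions"
CHAT_COMPLETIONS_URL = "https://api.openai.com/v1/chat/completions"

def deterministic_chat_payload(model: str, messages: list) -> str:
    """JSON body for /v1/chat/completions with sampling randomness pinned.

    temperature=0 and top_p=1 make repeated calls as reproducible as the
    backend allows, so Playground/API differences can't hide behind sampling.
    """
    return json.dumps({
        "model": model,
        "messages": messages,
        "temperature": 0,
        "top_p": 1,
    })
```

If two requests with this payload, one replayed from the Playground's network tab and one from your own code, still differ, the cause is in the request contents rather than in any hidden server-side difference.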
There is some serious confusion if you are reading this page.
The API =/= The playground (The playground is a GUI for the API)
The playground (used to) use the exact same endpoints as we do (which was what my initial point was, not the first point)
It drives me crazy, because there are way too many people who believe that the playground delivers better results. Sure, let’s say that for some reason OpenAI said “Hey guys, let’s make the playground slightly better lol”. It still doesn’t change the fact that:
Most people with this issue are simply just not formatting their prompt correctly!
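The most common formatting mismatch is pasting an entire conversation into a single string instead of structuring it as role-tagged messages, as the chat endpoint expects. A sketch of the structured form (the contents are illustrative):

```python
def format_chat_messages(system: str, turns: list) -> list:
    """Build role-tagged messages for /v1/chat/completions.

    `turns` alternates user/assistant text. Collapsing all of this into
    one flat prompt string is exactly the kind of formatting mismatch
    that makes API results look "different" from the Playground's.
    """
    messages = [{"role": "system", "content": system}]
    for i, text in enumerate(turns):
        role = "user" if i % 2 == 0 else "assistant"
        messages.append({"role": role, "content": text})
    return messages
```

Checking that the messages array you send matches, turn for turn, what the Playground shows in its request body resolves most of these "the Playground is better" reports.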
I am able to see both answers now, and sometimes I like 3.5 better, but mostly 4.0. Its really overwhelming responses are compelling, if a bit too wordy, but the content is always very interesting, and it uses the 3.5 response in its answers; the 3.5 will talk about, and to, 4.0.
I am fascinated and completely satisfied with the values of both platforms
Arctic ltc “double trouble”
Yes it is. In my project I faced this problem, and I found a slightly strange but effective solution: pushing the first result back into the same API, which then generates (fixes) great answers.
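The two-pass trick described above can be sketched generically. Here `complete` is a hypothetical stand-in for whatever API call the project makes (not a real client method), so the refinement logic can be shown without depending on a specific SDK:

```python
from typing import Callable

def refine(complete: Callable[[str], str], prompt: str) -> str:
    """Two-pass generation: feed the first draft back for a corrected answer.

    `complete` is any function mapping a prompt to a model completion;
    the second call asks the model to fix its own first attempt.
    """
    draft = complete(prompt)
    fix_prompt = (
        f"Original request:\n{prompt}\n\n"
        f"Draft answer:\n{draft}\n\n"
        "Improve and correct the draft answer."
    )
    return complete(fix_prompt)
```

The trade-off is a second API call per answer, doubling latency and token cost, which is worth it only when first-pass quality is genuinely unreliable.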
Summary created by AI.
The discussion begins with AgusPG suspecting a significant degradation in the ChatGPT API. They expressed concern over increasing latencies and inconsistencies between the Playground and API results, even when maintaining the same temperature.
RonaldGRuckus responds, stating that there should be no differentiation between the playground and the API, as they operate with the same request logic. They propose the idea of streaming responses for faster response times.
AgusPG, while acknowledging these suggestions, reiterates the inconsistencies they’ve observed and promises to inspect the network logs for possible clues.
Another user, curt.kennedy introduces the idea of potential latency differences between pinned and unpinned models. RonaldGRuckus asserts that the difference can only be due to something different in the request or its format.
AgusPG recaps the discussion, expressing the necessity of differentiating the Playground and the API.
ruby_coder sharply criticizes RonaldGRuckus, stating that Playground manages both user session and dialog state, and isn’t just UI. They illustrate their point by sharing their experience of writing a Playground.
ruby_coder then posts again, sharing a screenshot from status.openai.com showing a difference in outage numbers between the API and the Playground, further supporting their argument that the API and Playground are operationally different.
RonaldGRuckus, feeling insulted, asserts that their point all along was that the Playground utilizes the same API, not that the services are identical.
To conclude, EricGT provides linked references to similar topics about potential degradation in ChatGPT’s performance.
Summarized with AI on Nov 30 2023
AI used: gpt-4-32k