Is ChatGPT API actually getting worse?

That is because the text exists in the dialog window. You don’t need to prune what the user sees in the UI to have session management.

You do not know how OpenAI prunes, summarizes, or otherwise manages the session @RonaldGRuckus. You are guessing and offering your guesses as hard facts.

You “don’t know,” but you assume, and you post these assumptions and guesses as “hard facts”.

There is no documentation anywhere in the OpenAI platform which states “the Playground results are the same as the API”.

We do not even know if the API endpoints used by the Playground run on the same hardware as the APIs we call from our code. We know, for a fact, that we are charged per token for both, but we don’t know for a fact that OpenAI has not added additional filtering or moderation to the Playground to protect “their brand”, etc.

It is just a “guess” to say “they are the same”.

You are “guessing” and posting your guesses as facts, as if your guesses and assumptions were factual. However, they are not factual (they are just guesses); of this I am sure, as someone who has written two chatbots using the API (one using the chat completion endpoint and one using the completion endpoint, and both required a lot of coding on top of the API calls).

Furthermore, as mentioned, OpenAI must protect their reputation against media attacks and the like, so it is very likely OpenAI has added additional moderation and filtering to the Playground; but this is just a guess, as I do not know for sure.

:slight_smile:

Well. I clearly don’t understand.

To me, it makes no reasonable sense that the models in the playground output differently than the models that we make our calls to. The endpoint, the payload, everything is the exact same; the only difference is the API key. I’ve never noticed a difference, and never had any reason to assume there is one.

If that is the case, I’m sorry for the misinformation.

Yes, I understand your assumptions @RonaldGRuckus and you are not the only one who makes them here and elsewhere.

I also apologize for pointing this out so directly. I’m coding a different “non-OpenAI” project as we speak, and am sure my reply comes across as “too blunt”, as they often do.

The OpenAI models are stochastic, so it is also not prudent to believe they will have deterministic outputs for a given input, especially for non-zero temperatures. The very nature of setting a temperature introduces randomness.

So, if someone is using the Playground with a temperature of 0.7, for example, it is not correct to assume that an API call to the same endpoint by someone else will generate the same result (unless the model is overfitted, of course).

It’s easier to view the “non-overfitted” models as a kind of “cup of dice”: these stochastic models generate a completion based on the temperature of the model (the randomness). If you shake a cup of dice which are the same dice that I have, of course we will get a different throw most of the time. The same holds between consecutive API calls using the same code, depending on the temperature (the randomness specified) and the model fitting, before we even consider the Playground.
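To make this concrete, here is a minimal sketch using the Python client (the key, model, and prompt are just placeholders): with any non-zero temperature, two consecutive, identical calls will usually return different completions.

```python
import openai  # the openai Python client

openai.api_key = "sk-..."  # placeholder; use your own key

# Same model, same prompt, same temperature, two consecutive calls:
# with temperature > 0, the two completions will usually differ.
for run in (1, 2):
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        temperature=0.7,
        messages=[{"role": "user", "content": "Describe a dice throw in one sentence."}],
    )
    print(run, response["choices"][0]["message"]["content"])
```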

However, with the Playground, we do not know how the input is finally filtered or moderated on the OpenAI side before the messages or prompt is sent to the API.

But honestly, since I have written my own “Playground” which has a lot more features, I almost never use the OpenAI Playground and have not attempted to reverse engineer it.

Anyway, I’m not trying to be a PITA. I’m just confident, based on writing two chatbots recently using both the chat completion and the completion API methods, that there are myriad ways to filter, prune, summarize, etc. the message history sent to a chatbot to maintain state; and slight variations in the implementation will change the output of the model.
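For example, here is one toy pruning strategy out of the many possible ones (the function name and token budget are purely illustrative; a real app would count tokens with tiktoken, not words):

```python
# Keep the system message and drop the oldest user/assistant turns
# until the history fits a rough token budget.
def prune_history(messages, max_tokens=3000):
    def rough_tokens(msg):
        # Word count as a crude stand-in for a real tokenizer.
        return len(msg["content"].split())

    system = [m for m in messages if m["role"] == "system"]
    dialog = [m for m in messages if m["role"] != "system"]

    # Drop the oldest turns first until we are under budget.
    while dialog and sum(map(rough_tokens, system + dialog)) > max_tokens:
        dialog.pop(0)

    return system + dialog
```

Another developer might summarize old turns instead of dropping them; either way, the “same” conversation produces different prompts, and therefore different completions.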

Personally, I do not have the code in front of me showing how the Playground does this, or how OpenAI may or may not add additional filtering to protect the OpenAI brand integrity, hence it’s hard to comment further without guessing.

However, my “best guess” is that OpenAI has some filtering and content moderation in the Playground we are not directly aware of, because OpenAI must guard against people hacking around with prompts to generate headline-grabbing completions which would damage their brand integrity and mission.
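Purely to illustrate what such a layer could look like (this is my speculation, not OpenAI’s actual code), a wrapper might run every prompt through the public moderation endpoint before forwarding it to the model:

```python
import openai

def guarded_chat(messages):
    # Hypothetical gate: check the latest user message with the
    # moderation endpoint before calling the model. NOT OpenAI's
    # actual Playground code, just a sketch of the idea.
    check = openai.Moderation.create(input=messages[-1]["content"])
    if check["results"][0]["flagged"]:
        return "This request was blocked by the moderation layer."
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=messages,
    )
    return response["choices"][0]["message"]["content"]
```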

Hope this helps.

:slight_smile:

1 Like

No, I completely agree that the results aren’t the same, by their nature.

However, to say that there are inconsistencies between the API and the playground that cause the playground to output better results is just wrong. All I’m trying to say is that the calls are practically the same.

No it is not wrong, for the many reasons I have already posted.

You are just assuming that the API (which is not an application) and the Playground (which is an application) are the same.

That is what is wrong @RonaldGRuckus

There is a lot of code written to maintain dialog state, which you would know if you sat down and wrote your own chatbot using the API to maintain dialog state.

Whatever you, as a developer, would come up with to manage state is not going to be the same as what the Playground does.

You are just making a huge assumption without any basis in fact; and that is why I keep calling you out on this.

It is easy for me, as a developer who has written two OpenAI chatbots which maintain dialog state, to see that the code on top of the API is the “secret sauce” (not the basic API calls).

The Playground is NOT the API. The Playground is an application developed by OpenAI software engineers which uses the API; but there has to be code on top of the API (filtering, pruning, summarization, moderation, etc.; exactly what, I do not know, because I do not have the full application code in front of me). What happens between the UI and the rest of the process in the Playground happens in code written by OpenAI.

The Playground is NOT the API.

The Playground is an OpenAI application built on top of the API, just like any developer could build a similar Playground, which I have done BTW; but my application is far more detailed than the Playground, and does a lot more (and of course does it differently, since I wrote the code and did not copy OpenAI’s code; the Playground code is not public or open source, to my knowledge).

If the Playground source code is public, open source, please post a link to it. According to the official OpenAI GitHub account, there is no open-source Playground:

:slight_smile:

Reference:

Yes, the playground is an interface so one doesn’t need to write their own code to try out the API. It uses the same endpoints.

To clarify, I know “API” and “playground” aren’t comparable terms. All I’m trying to say is that they make the same calls using the same parameters and endpoints that we have.

There is a lot more to a chatbot application than the API endpoints.

In fact, the API calls are the most trivial part of building a chatbot application which maintains dialog and user session state.

My feeling, based on your replies @RonaldGRuckus, is that you have not developed a full-blown OpenAI chatbot application using the API; because if you had actually written one, which had to manage sessions, state, pruning, summarization, etc., then you would know that the API calls are the trivial part.

:slight_smile:

The playground has nothing at all to do with the actual processing or management. It’s a front-end service.

Yes, I have written them, and I’m fully aware of the complexities. The playground does no pruning or context management. I don’t really understand what you mean by that. ChatGPT does; the playground does not.

By definition, the playground is a GUI.

Sorry, @RonaldGRuckus

It’s not really a good use of our time to debate back and forth with you when you continue to post your opinions and assumptions as fact.

By whose definition?

Yours, of course @RonaldGRuckus

That is the core issue here. Whatever “you assume” and “you think,” you offer as “hard facts”.

I have written a Playground, and it is much more than “just a UI,” contrary to what you have stated @RonaldGRuckus, showing again that you do not understand coding as much as you think. The API does not manage state. The Playground manages both user session and dialog state. It may also well add additional moderation and filtering to protect the OpenAI brand, as I keep saying and you keep rejecting.

I have not updated the topic above (busy coding), but it is more feature-rich than before, including both chatbots, completion-API-based and chat-API-based:

I can assure you, @RonaldGRuckus, that writing a full-blown chatbot app, Playground or not, is not just writing UI code. The UI code is trivial (for a Rails developer like me).

Sorry, it’s pointless to continue this topic @RonaldGRuckus. You are wrong, but you think you are right, and you make so many assumptions based on “what you think” versus what is “a fact” that it’s not a good use of our time to continue, since I am almost sure that next you will tell me you have written a “full-blown playground,” blah blah, and know all about that as well.

Take care.

:slight_smile:

I told myself not to answer in this topic anymore when I started seeing that we were getting off-topic and there was no real point to it. But I just needed to do it once more, to tell you that I admire your patience @ruby_coder. I truly do :rofl:.

1 Like

I work out at the gym 5 days a week, which helps, believe me, haha

Thanks for the kind words, @AgusPG

Like you, I am done with this “full of misinformation” off-topic discussion. My wife wants to go for a drive in the countryside to a new coffee shop, and that is “top priority” for the rest of today.

:slight_smile:

1 Like

It’s fair that you don’t want to continue.

You’re clearly confusing a front-end service with a back-end one. The playground itself does not do anything that you are saying; your front-end service doesn’t either. The playground does not do any sort of business logic.

My service is written in ReactJS, and then uses Kotlin with a sprinkle of Python in the back-end. Very hip, I know.

I would not say that my ReactJS web app does anything that you’re saying the playground does. I would say that my back-end server manages all of that. Regardless, the playground doesn’t even have these capabilities.

We have an instance in France and another one in Central US; the same issue occurs in both, so it’s not location-dependent.

This happened in previous weeks with text-davinci; the same story is repeating with the chat APIs…

1 Like

Here is my final post, illustrating there is an operational difference between the API and the Playground as well as an application-level difference. I am posting a screenshot taken just now from status.openai.com

Obviously, without looking into the nitty-gritty details, there is a “relatively significant” difference between API outages and Playground outages. If the Playground were “the same” as the API, and “just a UI and nothing more,” then of course the outage numbers between the API and the Playground app would line up nicely, but they do not.

Not only is the application-layer software different (the Playground is an app built on top of OpenAI APIs, with user session and user dialog management, token management, perhaps OpenAI filtering and moderation, etc.; the APIs are just API endpoints), but the operational context is different, see below:

I promise @AgusPG this is now my last post in this topic :slight_smile:

status.openai.com

:slight_smile:

2 Likes

Did you ever try to fix the problems?

I really don’t understand your patience comment. I’ve been spending my time trying to help with an issue that you haven’t bothered to do any troubleshooting for, and then you pull the patience card? If all you wanted was to complain, just say so.

Believing that the playground outputs differently is just plain wrong, and it’s essential to understand that before more people start to think “the playground somehow has better results,” when it doesn’t. The models in the playground are the same models that we have access to; there are no differences. Even if the playground had the fancy bells and whistles of context and token management (it doesn’t), it still wouldn’t change the fact that the models and their output are the same. This is ridiculous.

@ruby_coder your definitions are wrong. It’s as simple as that. Even ChatGPT is aware that the playground is a GUI. Of course it still needs a server to operate; I don’t understand your point. Even an HTML website needs a host such as Apache to be served. I’m seriously questioning your programming stack for not knowing what an interface is. Obviously the playground isn’t the API. It’s apples to oranges; it’s not even something to compare. My comments are saying that “the playground makes the exact same calls as we do.”

I’ve seen a bunch of posts since as early as January about the API and the Playground returning different results (and experienced this myself, frustrating). AFAIK no one has been able to explain or resolve the differences. The API seems more generic and less “in-character” (stuck in “helpful assistant” mode maybe?)

1 Like

It’s all about the prompt. I think my “grumpy, experienced Dungeon Master” did well…

1 Like

Until recently, the HTTP request made by the playground was the exact same as any other request. Now it looks like Davinci is called via a different endpoint, but ChatCompletions is the same (besides a different authorization key).

You can view it in the browser’s network tab. The request isn’t made by a server; it’s made by your computer.
In my experience here, the majority of discrepancies are caused by incorrect formatting or just plain wrong parameter settings…
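For anyone who wants to verify, the documented chat completions call looks like this (the key, model, and parameters are placeholders); compare it against the request the network tab shows for the playground:

```python
import requests

# The public chat completions endpoint, called with plain HTTP.
resp = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={
        "Authorization": "Bearer sk-...",  # placeholder key
        "Content-Type": "application/json",
    },
    json={
        "model": "gpt-3.5-turbo",
        "temperature": 0.7,
        "messages": [{"role": "user", "content": "Hello"}],
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```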

Interesting! Davinci was the one that was behaving strangely. I don’t think it would be incorrect formatting or wrong parameters in my case; I just copy/pasted the code it generates for my requests.

I’m with you on the difference between playground and API.

It’s very frustrating to use the same variables, same system message, and same prompt, and get completely different results from the API versus the playground.

And the gpt-turbo API will leave its role and switch speakers WAY more easily than the playground. Watch how fast the API switches speakers in this app: LMS - HR Interview