More detailed status updates are needed

As we move more OpenAI products into production environments, it would be advisable to add greater transparency to status updates.

It is impossible to know whether the issues developers are experiencing are tied to the issues being investigated, which leaves us guessing whether a problem is one we can solve in our own design or one on OpenAI's side.

More detailed status updates are necessary for production environments. Simply saying that you know about an issue and are working on it is not sufficient. This is a necessary consequence of the increasingly complex offering.

Hey! Can you share more? My original reading was that this was about status.openai.com incidents, but the wording here makes it seem otherwise. Can you clarify?


Yes. I’m suggesting that status.openai.com have fuller updates on the impact developers can expect from current issues.

For the past 48 hours the Assistants API has been hallucinating function calls, telling users that it is not configured to perform functions in its current state and that it is “simulating” a reply.

There’s no indication in the error report of whether the issue you are investigating relates to function calling or to something else. So in our own use case we don’t know if OpenAI is experiencing an issue that impacts us, or whether the model behaviour has just changed and we therefore need to adapt.

I know the Assistants API is in a beta state. But for production we’d need more detailed updates to know whether to take action or wait.
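For anyone trying to tell the two behaviours apart in the meantime, here is a minimal sketch of the kind of check we run, assuming the openai Python SDK (v1.x) and the beta Assistants API; the assistant and thread IDs are placeholders:

# Minimal sketch: distinguish a real function call (run enters "requires_action")
# from a plain text reply in which the model merely claims it is "simulating".
# Assumes the openai Python SDK (v1.x) and the beta Assistants API.
import time
from openai import OpenAI

client = OpenAI()

ASSISTANT_ID = "asst_..."  # placeholder
THREAD_ID = "thread_..."   # placeholder

run = client.beta.threads.runs.create(thread_id=THREAD_ID, assistant_id=ASSISTANT_ID)

# Poll until the run leaves the queued/in_progress states.
while run.status in ("queued", "in_progress"):
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(thread_id=THREAD_ID, run_id=run.id)

if run.status == "requires_action":
    # The model genuinely emitted tool calls; we are expected to execute them
    # and submit the outputs back.
    calls = run.required_action.submit_tool_outputs.tool_calls
    print("real function calls:", [c.function.name for c in calls])
elif run.status == "completed":
    # No tool call was made; inspect the latest assistant message for the
    # "simulating" / "not configured" pattern described above.
    messages = client.beta.threads.messages.list(thread_id=THREAD_ID, limit=1)
    text = messages.data[0].content[0].text.value
    if "simulat" in text.lower() or "not configured" in text.lower():
        print("suspicious: text reply instead of a function call:", text[:200])
else:
    print("run ended with status:", run.status, run.last_error)

That tells us which behaviour we got, but not why it changed, which is where the status updates come in.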

I get what you are saying.

In other words, you’re asking how we can tell whether the Assistant’s behaviour is caused by a service disruption or by something else?

Like in this case:

The API is suffering an outage as per status.openai.com, but the user can’t be sure whether that is the reason for this unexpected behavior.

Is that correct?


I dunno if it is related

It swallows the 34k tokens, then doesn’t give anything back

That is precisely my point. How would we know if the issues are related?
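One partial workaround on our side is to check the status page programmatically and log the result next to our own errors — a minimal sketch, assuming status.openai.com is a standard Atlassian Statuspage instance (those pages usually expose public JSON endpoints like the ones below):

# Minimal sketch: correlate our own API errors with the public status page.
# Assumes status.openai.com is a standard Atlassian Statuspage instance,
# which typically exposes these public JSON endpoints.
import requests

STATUS_URL = "https://status.openai.com/api/v2/status.json"
INCIDENTS_URL = "https://status.openai.com/api/v2/incidents/unresolved.json"

def openai_status() -> dict:
    """Return the overall indicator ('none', 'minor', 'major', 'critical') and open incidents."""
    status = requests.get(STATUS_URL, timeout=10).json()
    incidents = requests.get(INCIDENTS_URL, timeout=10).json()
    return {
        "indicator": status["status"]["indicator"],
        "description": status["status"]["description"],
        "open_incidents": [i["name"] for i in incidents.get("incidents", [])],
    }

# Log this alongside our own failures so we can at least see whether an
# unexpected behaviour coincides with a reported incident.
print(openai_status())

Even then, it only tells us that an incident is open, not whether it covers function calling, which is exactly the gap here.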

As the product matures, I’d also recommend greater transparency on how tokens will be compensated in these events. It’s a tough one, since the model will often produce some output, and in a lot of cases that output might be adequate even while an outage is ongoing.

But at the same time, it might not… It would be good to have a clearly articulated policy.


What I think is being described, and needed, is a (perhaps limited-exposure) changelog. As API users, we can perceive changes to model operations, and continued dysfunction after a specific point in time, but we can’t directly investigate the change that caused them.

Example:

date       | exposure                               | description
2024-02-02 | all gpt-3.5-turbo(preview), production | softmax mixture optimization
2024-02-02 | assistants                             | multi_tool_use token cutoff, adaptive temperature
2024-02-04 | ChatGPT 4                              | eval model 2, 10% A/B feedback RLHF training update
2024-02-06 | all gpt-4-0613                         | reverted embeddings masking update 2024-01-30
2024-02-13 | ChatGPT, USA 20%                       | memory, rollout step 2

You can talk with your configuration manager about whether particular deployments, such as a ‘rollout to those developers who opted in to beta stage 2’, should still be kept proprietary…when you’ve got both of those.
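To make that consumable by tooling, the same table could be published as a machine-readable feed. Here is a minimal sketch of a client for a hypothetical JSON endpoint; the URL and field names are invented for illustration and simply mirror the date / exposure / description columns above:

# Minimal sketch of consuming a hypothetical machine-readable changelog feed.
# The endpoint and schema below are invented for illustration; they mirror the
# date / exposure / description columns in the example table above.
import requests

CHANGELOG_URL = "https://example.com/openai-changelog.json"  # hypothetical

def changes_since(date: str, exposure_filter: str | None = None) -> list[dict]:
    """Return changelog entries on or after `date`, optionally filtered by exposure."""
    entries = requests.get(CHANGELOG_URL, timeout=10).json()
    selected = [e for e in entries if e["date"] >= date]
    if exposure_filter:
        selected = [e for e in selected if exposure_filter in e["exposure"]]
    return selected

# e.g. "what changed for the Assistants API since our behaviour regressed?"
for entry in changes_since("2024-02-01", exposure_filter="assistants"):
    print(entry["date"], "-", entry["description"])

Something that simple would let us answer "did anything change on your side since our behaviour regressed?" without guesswork.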

Forgot to mention: that could extend to other impactful things, like account-management changes to the availability of user limits on the limits page or to the information presented on the usage page, or even updates to tier limits.

I can’t speak as a developer, but from a management perspective you should be clear about your work, and I talk about this regularly in the forums and in emails. There should be advance notice, especially about the team’s work. Before I knew it, I had wasted many hours, because it is impossible to distinguish whether a problem comes from the AI or is another effect of the team’s work. This will only get bigger as the product is used in the business sector: users who are not aware of the changes will run into problems the moment they try to use it. Many times when I contact you, I end up talking about management issues, especially communication. You run a website used by millions of people; if a team spent months putting the messaging together it could be as clear as it gets, which only shows that you could, but didn’t.


I really could not agree more. The communication should (and can) be much better.

To illustrate, our recent issue was resolved by recreating our “assistant”. We had to disable code interpreter. For some reason the code interpreter was launching during ordinary dialogue (to do things like calculate relative dates, such as “tomorrow” or “next week”). Once code interpreter was launched, it seemed that it would not use the actual functions provided; instead it would only “mock” results. This was a substantial change in operation affecting multiple models. It is disappointing that it was not communicated, but I am sharing it here in case anyone else has this issue, and also to demonstrate why communication is important.
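If anyone wants to apply the same workaround without rebuilding the assistant from scratch, this is roughly what it amounts to — a minimal sketch, assuming the openai Python SDK (v1.x) and a placeholder assistant ID; it simply strips code_interpreter from the tool list so only our functions remain:

# Minimal sketch of the workaround: remove code_interpreter from an existing
# assistant so only our function tools remain. Assumes the openai Python SDK
# (v1.x, beta Assistants API); the assistant ID is a placeholder.
from openai import OpenAI

client = OpenAI()
ASSISTANT_ID = "asst_..."  # placeholder

assistant = client.beta.assistants.retrieve(ASSISTANT_ID)

# Keep every tool except code_interpreter.
remaining_tools = [t for t in assistant.tools if t.type != "code_interpreter"]

client.beta.assistants.update(
    ASSISTANT_ID,
    tools=[t.model_dump() for t in remaining_tools],
)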

We also noted the strange behavior of these failed functions consuming approximately 30,000 tokens per message, as noted in this thread. Our actual context is closer to 6,000 tokens. So whatever errors occurred were causing a substantial overstatement of token usage, which I hope will be rectified.

I’d rather have fixed model versions. Versions have quirks, but they can be worked around. But not if the quirks keep changing.
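In the meantime, the closest substitute is pinning a dated snapshot instead of a rolling alias — a minimal sketch, assuming the openai Python SDK (the snapshot name is just an example from the table above):

# Minimal sketch: pin a dated model snapshot instead of a rolling alias, so at
# least the model version under our workarounds does not silently change.
# Assumes the openai Python SDK; "gpt-4-0613" is just an example snapshot name.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-0613",  # dated snapshot, not the rolling "gpt-4" alias
    messages=[{"role": "user", "content": "What date is 'next week' from 2024-02-13?"}],
)
print(response.choices[0].message.content)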
