[REQUEST] Better outage handling in the API

GoldenJoe · December 12, 2024, 2:14am

Problem:
The API is currently experiencing an outage. These are common enough that I believe it’s important to make it easier for API users to implement a fallback mechanism. During an outage, requests simply hang forever. Wrapping every API call in a timer is a pretty lousy pattern, and even if you do, you can’t cancel a request, and you run the risk that the API is merely slow rather than out of service, which would result in you performing your operation twice: once with OpenAI and again with your fallback. The status page offers the ability to set up a webhook, but this doesn’t support “polling” the service to see if it’s operational.
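For illustration, the timer-wrapping pattern described here might look roughly like the sketch below. It uses only the Python standard library; `call_with_fallback`, `primary`, and `fallback` are hypothetical names standing in for your own API call and backup provider, not part of any SDK.

```python
import concurrent.futures
import time
from typing import Callable, Tuple


def call_with_fallback(
    primary: Callable[[], str],
    fallback: Callable[[], str],
    timeout_s: float,
) -> Tuple[str, str]:
    """Run primary() with a deadline; on timeout, run fallback() instead.

    Returns (result, source) where source is "primary" or "fallback".
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(primary)
    try:
        return future.result(timeout=timeout_s), "primary"
    except concurrent.futures.TimeoutError:
        # The primary call is NOT cancelled here: it keeps running in its
        # thread and may still complete later, so the work can happen twice.
        return fallback(), "fallback"
    finally:
        # Do not block on the (possibly still running) primary call.
        pool.shutdown(wait=False)
```

The timed-out call keeps running in the background, which is the double-work risk: a slow but healthy API can still finish the operation after the fallback has already handled it.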

Idea:
Add a new error type to represent a service outage or incident, which is returned from any API endpoint when the service is down. It would make it much easier for developers to improve their OpenAI-driven services and handle errors gracefully. Remember that our customers are your customers. If apps that implement OpenAI services perform poorly, broader consumer sentiment shifts toward an unfavorable view of AI services in general.
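To make the request concrete, here is a minimal sketch of what client code could look like if such an error existed. `ServiceOutageError` is purely hypothetical (it is the error type being proposed, not something any SDK currently provides), and `primary` and `fallback` are placeholder callables.

```python
from typing import Callable


class ServiceOutageError(Exception):
    """Hypothetical error the API would raise during an outage or incident.

    This class does not exist in any real SDK; it models the proposal.
    """


def complete(
    prompt: str,
    primary: Callable[[str], str],
    fallback: Callable[[str], str],
) -> str:
    """Try the primary provider; switch to the fallback only on a declared outage."""
    try:
        return primary(prompt)
    except ServiceOutageError:
        # The service told us it is down, so falling back cannot duplicate work.
        return fallback(prompt)
```

With an explicit signal like this, no timer is needed and other errors (bad request, rate limit) still surface normally instead of triggering the fallback.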

_j · December 12, 2024, 2:46am

Protip: set the moderations call that you use on untrusted inputs to a short timeout without retry (the SDK defaults to two retries). This service offers fast responses. Have the rest of the API call chain refuse to proceed unless the moderation result comes back flag-free.
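A rough sketch of that gating logic, kept SDK-free so it runs on its own: `gate_on_moderation`, `moderate`, and `proceed` are hypothetical names. The comment shows one way `moderate` might be wired up with the official Python SDK, assuming a client configured with a short `timeout` and `max_retries=0`.

```python
from typing import Callable, Optional

# Assumption: with the official Python SDK, `moderate` could be built like:
#   client = OpenAI(timeout=2.0, max_retries=0)
#   moderate = lambda t: client.moderations.create(input=t).results[0].flagged


def gate_on_moderation(
    text: str,
    moderate: Callable[[str], bool],
    proceed: Callable[[str], str],
) -> Optional[str]:
    """Call proceed(text) only if moderate(text) returns quickly and unflagged.

    Returns None when the input is flagged or the moderation call fails.
    """
    try:
        flagged = moderate(text)
    except Exception:
        # Timeout or connection failure: no flag-free result, so stop here
        # rather than sending the slower, costlier request into an outage.
        return None
    if flagged:
        return None
    return proceed(text)
```

Because the moderation call fails fast, it doubles as a cheap health probe before the expensive request is made.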

anon25271712 · December 12, 2024, 4:34am

I support this. Otherwise it has to be done manually, and you need to switch LLM providers when there is an outage. One solution I can think of is scraping the status page when you notice requests taking too long, but I agree that outage error handling would be nice in the future.
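As a sketch of the status-page idea: many hosted status pages expose a small JSON summary, which is easier to check than scraped HTML. The code below assumes a Statuspage-style `status.json` payload with a `status.indicator` field, and the URL is an assumption that should be verified against the actual status page before relying on it.

```python
import json
import urllib.request

# Assumption: the status page serves a Statuspage-style JSON summary here.
STATUS_URL = "https://status.openai.com/api/v2/status.json"


def is_operational(payload: dict) -> bool:
    """True only when the summary reports no active incident ("none")."""
    return payload.get("status", {}).get("indicator") == "none"


def fetch_status(url: str = STATUS_URL, timeout_s: float = 3.0) -> dict:
    """Fetch the status summary; raises on network errors or timeouts."""
    with urllib.request.urlopen(url, timeout=timeout_s) as resp:
        return json.load(resp)
```

A caller would run `is_operational(fetch_status())` only after noticing slow requests, and treat any exception from `fetch_status` as "unknown" rather than "down".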
