Scaling OpenAI API for production use

Hi Everyone!

I am building a product for my company that relies heavily on the OpenAI APIs for data summarisation and formatting.
I extract the raw text from my files (PRFs) and then use that text to call the chat completion API to summarise and format the data.

Problem - the OpenAI APIs are not reliable enough to use in production: calls sometimes fail, and sometimes the model refuses to generate an answer, even though I am well within the token and API rate limits.

Solution that I came up with - use multiple OpenAI accounts (let’s say 3) and internally route the requests between them.

Questions for the community -

  1. Does this approach work, or should I try a different one?
  2. Will OpenAI block me for this?
  3. Are there any better suggestions for making the OpenAI APIs more reliable in production?

Thank you.


NO! That’s a terrible idea.

This is not a new problem. It is a standard engineering challenge whenever you adopt an external API whose availability you don’t control.

I have two websites in production that rely heavily on the OpenAI APIs, as well as a widely adopted chatbot and an open-source summarisation plugin for Discourse.

The solution is to turn your calls into batch jobs that automatically retry until a successful response is returned.

I use Sidekiq for this, which is excellent.
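To make this concrete, here is a minimal sketch of what such a job can look like. The job name, the Document model and the OpenAiClient wrapper are placeholders for whatever your own code looks like:

```ruby
# Minimal Sidekiq job sketch. SummariseDocumentJob, Document and OpenAiClient
# are placeholder names - swap in your own classes.
class SummariseDocumentJob
  include Sidekiq::Job

  # Retry up to 10 times before the job lands in the dead set.
  sidekiq_options queue: "openai", retry: 10

  def perform(document_id)
    document = Document.find(document_id)                 # your persistence layer
    summary  = OpenAiClient.summarise(document.raw_text)  # raises on API failure
    document.update!(summary: summary)
    # Any exception raised above makes Sidekiq re-enqueue the job automatically,
    # with an increasing delay between attempts.
  end
end
```

You enqueue it with `SummariseDocumentJob.perform_async(document.id)`, so the web request returns immediately and the summarisation (including any retries) happens in the background.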


Hey @merefield, thanks for your suggestion. Batch jobs sound like a good idea.

Thank you.


Great. It’s a bit of extra work, but it will save you a lot of stress and micromanagement in production.

Note that the standard retry delay usually grows exponentially (exponential backoff), which is a really nice property because it copes with availability problems of very different durations without hammering the API.
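If you want explicit control over that schedule, Sidekiq lets you override the delay per job class. Continuing the placeholder job sketched above, something like this gives a simple exponential backoff (the exact timings are up to you):

```ruby
class SummariseDocumentJob
  include Sidekiq::Job

  # Override Sidekiq's default retry schedule with an explicit exponential backoff:
  # roughly 30s, 60s, 120s, 240s, ... between attempts.
  sidekiq_retry_in do |count, _exception|
    (2**count) * 30
  end
end
```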

Best of luck with the project!
