Best Way to Structure Backend Architecture for OpenAI API Calls?

I’m unsure whether OpenAI API calls should be made directly in the backend request handler, offloaded to workers, or run as async jobs.
What architecture patterns are you using to keep things scalable, secure, and responsive?
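
To make the "async jobs" option concrete, here is a minimal sketch of what I have in mind, not a recommendation. It assumes an in-process `asyncio` queue with a small worker pool. `call_model`, `JobQueue`, and `demo` are hypothetical names, and `call_model` is a stub standing in for the real OpenAI request. A production setup would presumably use a durable queue and a separate worker process instead.

```python
import asyncio
import uuid


async def call_model(prompt: str) -> str:
    """Hypothetical stand-in for the real OpenAI call (assumption, not the real client)."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"echo: {prompt}"


class JobQueue:
    """In-memory job queue: the request handler submits, workers do the slow call."""

    def __init__(self, concurrency: int = 2):
        self.queue = asyncio.Queue()
        self.results = {}  # job_id -> {"status": ..., "output": ...}
        self.concurrency = concurrency  # caps simultaneous upstream calls
        self._workers = []

    def submit(self, prompt: str) -> str:
        # Returns immediately so the HTTP handler stays responsive.
        job_id = uuid.uuid4().hex
        self.results[job_id] = {"status": "queued", "output": None}
        self.queue.put_nowait((job_id, prompt))
        return job_id

    async def _worker(self):
        while True:
            job_id, prompt = await self.queue.get()
            self.results[job_id]["status"] = "running"
            try:
                output = await call_model(prompt)
                self.results[job_id] = {"status": "done", "output": output}
            except Exception as exc:
                # Record the failure so the client can see it when polling.
                self.results[job_id] = {"status": "failed", "output": str(exc)}
            finally:
                self.queue.task_done()

    def start(self):
        self._workers = [
            asyncio.create_task(self._worker()) for _ in range(self.concurrency)
        ]

    async def drain(self):
        # Wait for all queued jobs, then stop the workers.
        await self.queue.join()
        for worker in self._workers:
            worker.cancel()
        await asyncio.gather(*self._workers, return_exceptions=True)


async def demo():
    jobs = JobQueue(concurrency=2)
    jobs.start()
    ids = [jobs.submit(p) for p in ("hello", "world", "again")]
    await jobs.drain()
    return [jobs.results[i] for i in ids]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the client would get a job ID back right away and poll (or be notified) for the result, while the API key stays on the server and the worker count limits how many calls hit OpenAI at once.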