Where should I put the AI service? I need to prevent high concurrency on the backend server

As you might expect, I plan to deploy the AI service on the backend so that users cannot bypass payment. However, the AI takes a long time to return results, so each request stays open for a long time and the backend server ends up handling a large number of concurrent requests. I would like to use polling to address this (the client submits a request, then checks back periodically for the result), but I couldn't find anything similar in the documentation. Is there a recommended solution?
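
To make the polling idea concrete, here is a minimal, framework-agnostic sketch of what I have in mind, using only the Python standard library. Everything in it is an assumption for illustration: `run_ai_model`, `JobStore`, and `poll_until_done` are hypothetical names, the in-memory dictionary stands in for whatever shared store (a database or cache) a real deployment would use, and the sleep simulates the slow AI call. The backend returns a job ID immediately, runs the AI call in a bounded worker pool, and the client polls for the status.

```python
import threading
import time
import uuid
from concurrent.futures import ThreadPoolExecutor


def run_ai_model(prompt: str) -> str:
    """Hypothetical stand-in for the slow AI call; replace with the real service."""
    time.sleep(0.2)  # simulate long-running inference
    return f"result for: {prompt}"


class JobStore:
    """In-memory job table (assumption: a real deployment would use a shared store)."""

    def __init__(self, max_workers: int = 2) -> None:
        self._jobs: dict[str, dict] = {}
        self._lock = threading.Lock()
        # The bounded pool caps how many AI calls run at the same time.
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def submit(self, prompt: str) -> str:
        """Register a job and return its ID right away, without waiting for the AI."""
        job_id = uuid.uuid4().hex
        with self._lock:
            self._jobs[job_id] = {"status": "pending", "result": None}
        self._pool.submit(self._run, job_id, prompt)
        return job_id

    def _run(self, job_id: str, prompt: str) -> None:
        """Worker: run the AI call and record the outcome."""
        try:
            update = {"status": "done", "result": run_ai_model(prompt)}
        except Exception as exc:
            update = {"status": "failed", "result": str(exc)}
        with self._lock:
            self._jobs[job_id] = update

    def status(self, job_id: str) -> dict:
        """Cheap lookup that a polling endpoint would call."""
        with self._lock:
            job = self._jobs.get(job_id)
            return dict(job) if job else {"status": "unknown", "result": None}


def poll_until_done(store: JobStore, job_id: str,
                    interval: float = 0.05, timeout: float = 5.0) -> dict:
    """Client side: check the job status at a fixed interval until it finishes."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = store.status(job_id)
        if job["status"] in ("done", "failed"):
            return job
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
```

In a web backend, `submit` would sit behind something like a "create job" endpoint and `status` behind a "get job" endpoint, so each HTTP request returns quickly and the number of simultaneous AI calls is limited by `max_workers` rather than by the number of connected users. Payment checks would still happen on the backend before `submit` is called.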