I’m using the GPT-4o API with the useAssistant hook in a Next.js project. I’ve provided fairly detailed instructions and included two files for processing. The response must be in JSON format, because I parse the output into two separate boxes that display the full batch response at once (no streaming). The total response time is around 23 seconds with an input size of about 20,000 tokens. The response quality is excellent, but I’m wondering if there are any methods or tricks to shave a few seconds off the batch response time to improve the user experience (UX).
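
For context, here is a minimal sketch of the client side of my setup. It assumes the useAssistant hook from the Vercel AI SDK (import path may differ by version), a /api/assistant route handler, and hypothetical JSON fields resultA/resultB for the two boxes:

```tsx
'use client';

import { useAssistant } from 'ai/react';

// Hypothetical shape of the JSON I ask the model to return.
type BatchResult = {
  resultA: string;
  resultB: string;
};

export default function BatchView() {
  // '/api/assistant' is my Next.js route handler that talks to the Assistants API.
  const { status, messages, input, handleInputChange, submitMessage } = useAssistant({
    api: '/api/assistant',
  });

  // Once the run has finished, take the last assistant message and parse its JSON body.
  const last = [...messages].reverse().find((m) => m.role === 'assistant');
  let parsed: BatchResult | null = null;
  if (status === 'awaiting_message' && last) {
    try {
      parsed = JSON.parse(last.content) as BatchResult;
    } catch {
      parsed = null; // response was not valid JSON
    }
  }

  return (
    <div>
      <form onSubmit={submitMessage}>
        <input value={input} onChange={handleInputChange} placeholder="Ask..." />
      </form>
      {/* Both boxes render at the same time, only after the full batch response arrives. */}
      <div className="box">{parsed?.resultA}</div>
      <div className="box">{parsed?.resultB}</div>
    </div>
  );
}
```

Because nothing is rendered until the whole response is parsed, the user perceives the entire ~23 seconds as waiting time, which is why even a few seconds of savings would matter.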