We are using the embeddings API to answer questions, and responses are taking upwards of 20 seconds on average. Is there a way to speed this up?
Which part of your QnA is taking the time? Embedding the user query? Searching the stored vector data for an answer? The final chat completions API call?
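If you're not sure which of those three it is, timing each stage separately is probably the quickest way to find out. Here is a rough sketch, not your actual pipeline: `embed_query`, `search_vectors`, and `complete_chat` are hypothetical placeholders (they just sleep), so swap in your real embedding call, vector search, and chat completions call.

```python
import time
from contextlib import contextmanager

# Wall-clock seconds spent in each stage of the most recent request.
timings = {}


@contextmanager
def timed(stage):
    """Record how long the enclosed block takes under the given stage name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start


# Hypothetical stand-ins: replace the bodies with your real calls.
def embed_query(text):
    time.sleep(0.01)  # placeholder for the embeddings API request
    return [0.0, 1.0]


def search_vectors(vector):
    time.sleep(0.02)  # placeholder for the vector store lookup
    return ["doc-1"]


def complete_chat(question, context):
    time.sleep(0.03)  # placeholder for the chat completions request
    return "answer"


def answer(question):
    """Run the three stages in order, timing each one on its own."""
    with timed("embed"):
        vector = embed_query(question)
    with timed("search"):
        context = search_vectors(vector)
    with timed("completion"):
        reply = complete_chat(question, context)
    return reply


if __name__ == "__main__":
    answer("How do I reset my password?")
    # Print the slowest stage first.
    for stage, seconds in sorted(timings.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{stage:<12}{seconds:.3f}s")
```

Run it against a handful of real questions and post the per-stage numbers. That should make it clear where the time is going.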
We have the same issue with voice response. Averaging 15-17 seconds for a response. No user will tolerate that. Would love to hear from others what they have figured out to speed this up.