Terrible Latency - Tier 3 User not close to Limits

[2023-11-27 02:13:11]Sanity Check for Thread 1- SEND REQUEST TO OPENAI
[2023-11-27 02:13:12]Exited Threads
[2023-11-27 02:13:12]Thread 2- RESPONSE FROM OPENAI
[2023-11-27 02:13:13]Sanity Check for Thread 2- SEND REQUEST TO OPENAI
[2023-11-27 02:13:15]Sanity Check for Thread 2- RESPONSE FROM OPENAI
[2023-11-27 02:23:14]Sanity Check for Thread 3- RESPONSE FROM OPENAI
[2023-11-27 02:26:25]Sanity Check for Thread 1- RESPONSE FROM OPENAI


[2023-11-27 02:26:28]Sanity Check for Thread 1- SEND REQUEST TO OPENAI
[2023-11-27 02:26:28]Thread 3- RESPONSE FROM OPENAI
[2023-11-27 02:26:28]Sanity Check for Thread 3- SEND REQUEST TO OPENAI
[2023-11-27 02:26:29]Sanity Check for Thread 1- RESPONSE FROM OPENAI
[2023-11-27 02:26:30]Exited Threads
[2023-11-27 02:31:28]Thread 2- RESPONSE FROM OPENAI
[2023-11-27 02:31:28]Sanity Check for Thread 2- SEND REQUEST TO OPENAI
[2023-11-27 02:31:30]Sanity Check for Thread 2- RESPONSE FROM OPENAI
[2023-11-27 02:36:31]Sanity Check for Thread 3- RESPONSE FROM OPENAI
[2023-11-27 02:36:31]Watch dog triggered scripts

From your opaque log, it looks like you are using the Assistants API.

It is inherently a “submit your question, check back later” model: you never see the output as it is being generated.
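The multi-minute gaps between “SEND REQUEST” and “RESPONSE” in the log are consistent with that pattern. A minimal sketch of the submit-then-poll loop, with the polling logic separated out (the `client.beta.threads.runs.*` names assume the openai v1.x Python package; the thread and assistant IDs are hypothetical):

```python
import time


def poll_until_done(fetch_status, interval=1.0, timeout=600.0):
    """Poll fetch_status() until it returns a terminal run status.

    Returns the final status string, or raises TimeoutError.
    Terminal statuses follow the Assistants run lifecycle.
    """
    terminal = {"completed", "failed", "cancelled", "expired"}
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status()
        if status in terminal:
            return status
        time.sleep(interval)  # you only learn the result after the fact
    raise TimeoutError("run did not finish in time")


def run_assistant(thread_id: str, assistant_id: str) -> str:
    # Assumed openai v1.x client; imported lazily so poll_until_done
    # stays importable without the package installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    run = client.beta.threads.runs.create(
        thread_id=thread_id, assistant_id=assistant_id
    )
    return poll_until_done(
        lambda: client.beta.threads.runs.retrieve(
            thread_id=thread_id, run_id=run.id
        ).status
    )
```

Every pass through that loop is dead time for the user: nothing is shown until the run reaches a terminal state.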

You also can’t observe the multiple calls the AI model may be making for retrieval before responding. You can only see “steps”, which don’t reveal their purpose, their input or output text, or their token usage.

The usage page you show is deliberately uninformative: it doesn’t reveal what the models have been up to, or how many requests were actually made even per day, let alone per minute or per assistant run, to add up to the $0.25 for that day.

Perhaps decode the information you are presenting to us. Then switch to chat completions with streaming, so output begins within seconds.