thank you for the suggestion but its the time from the request to the first delta delivery when streaming since the assistant sends all of the history with every request.
For example a new thread will return a message in 0.5 seconds but a thread with 10 message and 8k of text will take 20+ seconds, even if the current question is only a sentence.