How can I speed up responses when using streaming in Python?

@app.route('/gpt/chat', methods=['POST'])
def chat():
    data = request.json
    thread_id = data.get('thread_id')
    announcement_id = data.get('announcement_id')
    message = data.get('message')

    if not thread_id:
        return jsonify({"error": "thread_id is missing"}), 400

    client.beta.threads.messages.create(thread_id=thread_id, role="user", content=message)

    def generate():
        stream = client.beta.threads.runs.create(thread_id=thread_id, assistant_id=assistant_id, stream=True)
        complete_message = ''
        
        for event in stream:
            if event.event == 'thread.message.delta':
                message_delta = event.data.delta
                for part in message_delta.content:
                    if part.type == 'text':
                        complete_message += part.text.value
                yield complete_message 
                complete_message = ''
            elif event.event == 'thread.run.requires_action':
                tool_call_id = event.data.required_action.submit_tool_outputs.tool_calls[0].id
                output = functions.information_from_pdf_server(announcement_id)
                tool_stream = client.beta.threads.runs.submit_tool_outputs(thread_id=thread_id,
                                                                            run_id=event.data.id,
                                                                            stream=True,
                                                                            tool_outputs=[{
                                                                                "tool_call_id": tool_call_id,
                                                                                "output": json.dumps(output)
                                                                            }])
                # Use a separate name so the outer loop's `event` isn't shadowed
                for tool_event in tool_stream:
                    if tool_event.event == 'thread.message.delta':
                        message_delta = tool_event.data.delta
                        for part in message_delta.content:
                            if part.type == 'text':
                                complete_message += part.text.value
                        yield complete_message
                        complete_message = ''
            time.sleep(1)  # pauses 1 second after every event

    response = Response(generate(), content_type='text/event-stream')
    response.headers['X-Accel-Buffering'] = 'no'
    return response
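One side note, unrelated to speed: with `Content-Type: text/event-stream`, an `EventSource` client expects each chunk framed as an SSE message (`data: ...` lines ending with a blank line). A minimal framing helper (hypothetical, not part of the code above) would look like:

```python
def sse_format(chunk: str) -> str:
    """Frame a text chunk as a Server-Sent Events message.

    EventSource clients expect each message as one or more 'data: ...'
    lines terminated by a blank line.
    """
    lines = chunk.splitlines() or ['']
    return ''.join(f'data: {line}\n' for line in lines) + '\n'

print(sse_format('hello'))  # → data: hello  (followed by a blank line)
```

Each value yielded by `generate()` could be passed through a helper like this before being sent to the client.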

-Using Model:
gpt-3.5-turbo-1106

-Code Explanation:
I'm using the Assistants API.
When the user sends a message, I add it to the thread and start the run.
The run can go one of two ways. First, when it can respond without a tool call, I receive 'thread.message.delta' events and forward them to the client via SSE. Second, when the run emits 'thread.run.requires_action', I call 'submit_tool_outputs' and stream the reply the same way.
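The first path (accumulating text deltas and yielding one chunk per event) can be sketched in isolation. The event name mirrors the Assistants API, but the event objects here are hypothetical stand-ins so the sketch runs on its own:

```python
from dataclasses import dataclass

@dataclass
class TextPart:
    """Hypothetical stand-in for a text part inside a message delta."""
    type: str
    value: str

def collect_deltas(events):
    """Join the text parts of each delta event and yield one chunk per event."""
    for event_type, parts in events:
        if event_type == 'thread.message.delta':
            chunk = ''.join(p.value for p in parts if p.type == 'text')
            if chunk:
                yield chunk

events = [
    ('thread.message.delta', [TextPart('text', 'Hel')]),
    ('thread.message.delta', [TextPart('text', 'lo')]),
]
print(''.join(collect_deltas(events)))  # → Hello
```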

-Main problem:
The code works, but the response speed from the GPT API is too slow.
On average it takes more than 20 seconds to complete an answer, and sometimes as long as a minute.

When I don't use streaming, the answer arrives in 10-20 seconds.
But when I enable the stream option, responses become much slower.
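To localize where the time goes, each chunk can be timestamped as it leaves the generator. Here is a minimal sketch using a stand-in generator (the `sleep` simulates upstream latency; in practice `generate()` from the code above would be wrapped instead):

```python
import time

def timed(gen):
    """Wrap a generator and report the delay before each chunk arrives."""
    last = time.perf_counter()
    for chunk in gen:
        now = time.perf_counter()
        yield chunk, now - last
        last = now

def fake_stream():
    """Stand-in for the real event stream."""
    for chunk in ('one', 'two', 'three'):
        time.sleep(0.05)  # simulated upstream latency
        yield chunk

for chunk, delay in timed(fake_stream()):
    print(f'{chunk!r} arrived after {delay * 1000:.0f} ms')
```

Wrapping the real generator this way would show whether the delay comes from the API events themselves or from the code between them.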

Is there any option or solution for this problem?