How can I send user messages to an assistant with fewer API calls?

I created an assistant using the Assistants REST API:

from openai import OpenAI
from dotenv import load_dotenv
import os

load_dotenv()

# Set your API key
client = OpenAI(
    # This is the default and can be omitted
    api_key=os.environ.get("OPENAI_API_KEY"),
)

instructions="""
You are an assistant aiding Public relations and making Linkedin posts
"""

assistant = client.beta.assistants.create(
    name="ATeram Assistant",
    instructions=instructions,
    model="gpt-4o-mini-2024-07-18"
)


print(assistant.id)

But in order to send a message to the assistant I need to perform at least 4 API calls:

  1. Create a thread with messages
  2. Run the thread with my assistant id
  3. Poll until the run has finished
  4. Get the messages and display them

As you can see in the following demo:

from openai import OpenAI
from dotenv import load_dotenv
import os
import time

load_dotenv()

# Set your API key
client = OpenAI(
    # This is the default and can be omitted
    api_key=os.environ.get("OPENAI_API_KEY"),
)

assistant_id = "asst_0Fn4Gwfys5gJCzdHm80cGr7N"

thread = client.beta.threads.create(messages=[
    {
        'role':'user',
        'content':"Create a LinkedIn post for me that promotes the activities of the A-team"
    }
])

run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant_id,
)

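print("Waiting for the run to finish")
while True:
    run_status = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)
    if run_status.status == "completed":
        break
    elif run_status.status in ("failed", "cancelled", "expired", "incomplete"):
        # Stop polling on any terminal non-success status, not only "failed"
        print("Run did not complete:", run_status.status, run_status.last_error)
        break
    time.sleep(2)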

messages = client.beta.threads.messages.list(thread_id=thread.id)
number_of_messages = len(messages.data)

for message in reversed(messages.data):
    role = message.role  
    for content in message.content:
        if content.type == 'text':
            response = content.text.value 
            print(f'\n{role}: {response}')

But in my case I want to turn these scripts into a realtime chat API, and making multiple requests just to retrieve the reply seems like a source of delay.

The best case would be to store the thread id somewhere (for example in a database) and reuse it to add further messages. But even then there are too many requests. Is there a way to send messages with fewer requests?
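
To make the stored-thread-id idea concrete, here is a minimal sketch of what it might look like. It assumes a plain dict (`thread_ids`) stands in for the database, and the helper names `get_or_create_thread_id` and `add_message_and_start_run` are hypothetical, made up for illustration. The `client` argument is expected to be the `OpenAI` client from the scripts above. Polling and listing the messages would still have to happen afterwards, exactly as in the demo.

```python
# Hypothetical in-memory stand-in for a database table
# that maps a chat user to the id of that user's thread.
thread_ids = {}


def get_or_create_thread_id(client, user_id):
    """Return the stored thread id for user_id; create a thread only on first use."""
    if user_id not in thread_ids:
        thread = client.beta.threads.create()
        thread_ids[user_id] = thread.id
    return thread_ids[user_id]


def add_message_and_start_run(client, assistant_id, user_id, text):
    """Append a user message to the stored thread and start a run on it."""
    thread_id = get_or_create_thread_id(client, user_id)
    # One request to add the new user message to the existing thread
    client.beta.threads.messages.create(
        thread_id=thread_id,
        role="user",
        content=text,
    )
    # One request to start the run; the caller still has to poll it
    run = client.beta.threads.runs.create(
        thread_id=thread_id,
        assistant_id=assistant_id,
    )
    return thread_id, run.id
```

This saves the thread-creation request on every message after the first, but each message still costs one request to add it, one to start the run, one or more to poll, and one to list the reply.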