Faster response from client/assistants

Hey there, I’m trying to use client/assistants to get a response to a message that I create and send. The problem is that the whole process takes more than 20s, which is far slower than the chat.

I have created an assistant with a vector store attached, so that it answers questions about the files in that store. The way I create a message right now is quite simple and follows the guides. First I create a thread, and then I keep using that thread with the following code:

    def create_message(self, message, thread_id, role="user"):
        # Append a message to the existing thread; pass the role through
        # instead of hardcoding "user".
        thread_message = self.client.beta.threads.messages.create(
            thread_id,
            role=role,
            content=message,
        )

        return thread_message

To retrieve the response to this message, is the only option to list the messages from the thread (and run?) and keep checking until the latest message is from the assistant? This takes 20s+:

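    def get_response(self, thread_id):
        # Start a run so the assistant processes the latest user message.
        # Assumes the client and assistant id are stored on the instance.
        run = self.client.beta.threads.runs.create(
            thread_id=thread_id,
            assistant_id=self.assistant_id,
        )

        messages = list(self.client.beta.threads.messages.list(
            thread_id=thread_id,
            run_id=run.id,
        ))

        # Re-list the run's messages until the assistant's reply has content.
        while not messages or messages[0].role != 'assistant' or not messages[0].content:
            messages = list(self.client.beta.threads.messages.list(
                thread_id=thread_id,
                run_id=run.id,
            ))

        return messages[0]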

As I understand the docs and examples, every time I create a message I have to create a run on the thread and then wait for the latest message to appear.
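For comparison, here is a minimal sketch of the same create-run-then-wait flow that polls the run's status at a fixed interval instead of re-listing messages in a tight loop. It only reduces the number of requests; I would not expect it to shorten the time the run itself takes. The helper name `wait_for_run` and the `interval` and `timeout` parameters are made up for this sketch, and `client` is assumed to be an already configured OpenAI client. A run that stops in `requires_action` is not handled here and would simply hit the timeout.

```python
import time

# Run statuses after which the run will not change any more.
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "expired", "incomplete"}


def wait_for_run(client, thread_id, run_id, interval=0.5, timeout=60.0):
    """Poll a run until it reaches a terminal status or the timeout elapses.

    Hypothetical helper: the name and the interval/timeout defaults are
    assumptions for this sketch, not part of the SDK.
    """
    deadline = time.monotonic() + timeout
    while True:
        run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)
        if run.status in TERMINAL_STATUSES:
            return run
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run {run_id} still {run.status!r} after {timeout}s")
        # Sleep between polls so the API is not hit in a busy loop.
        time.sleep(interval)


def get_response(client, thread_id, assistant_id):
    """Start a run, wait for it, then fetch only the messages that run produced."""
    run = client.beta.threads.runs.create(
        thread_id=thread_id,
        assistant_id=assistant_id,
    )
    run = wait_for_run(client, thread_id, run.id)
    if run.status != "completed":
        raise RuntimeError(f"run ended with status {run.status!r}")
    messages = list(client.beta.threads.messages.list(
        thread_id=thread_id,
        run_id=run.id,
    ))
    # Return the first listed message for this run, or None if there is none.
    return messages[0] if messages else None
```

Checking `run.status` also makes failures visible: a run that ends as `failed` or `expired` raises an error here, whereas a loop that only waits for an assistant message would spin forever.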

Is there anything I’m doing wrong or any way this could be optimized?

Thank you