Message Retrieval in Assistant Runs

Hello OpenAI Community,

I’m currently using the OpenAI API for threading and messaging, specifically the /threads and /threads/{thread_id}/messages endpoints. My process involves creating messages, initiating runs with assistants, and then fetching the latest messages from the thread. This approach works reasonably well with /threads/runs, but becomes cumbersome when adding new messages to existing threads, as it requires a multi-step process: creating a message, running an assistant, and then retrieving the message list.
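For context, the multi-step flow looks roughly like the sketch below. This is a minimal illustration, not a definitive implementation: it assumes the `openai` Python SDK v1 Assistants beta attribute paths (`client.beta.threads.…`), and the helper name `add_message_and_run` is my own.

```python
import time

def add_message_and_run(client, thread_id, assistant_id, text,
                        interval=0.5, timeout=60.0):
    """Append a message to an existing thread, run the assistant,
    and return the newest message once the run finishes."""
    # Step 1: append the user message to the existing thread.
    client.beta.threads.messages.create(
        thread_id=thread_id, role="user", content=text)
    # Step 2: start a run with the assistant.
    run = client.beta.threads.runs.create(
        thread_id=thread_id, assistant_id=assistant_id)
    # Step 3: poll until the run reaches a terminal state.
    deadline = time.monotonic() + timeout
    while run.status not in ("completed", "failed", "cancelled", "expired"):
        if time.monotonic() > deadline:
            raise TimeoutError("run did not finish in time")
        time.sleep(interval)
        run = client.beta.threads.runs.retrieve(
            run_id=run.id, thread_id=thread_id)
    # Step 4: the message list is newest-first, so the first entry
    # is the assistant's reply.
    return client.beta.threads.messages.list(thread_id=thread_id).data[0]
```

The three round trips (create message, create run, list messages) plus the polling in between are exactly the overhead the feature request above is about.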

It would significantly streamline the workflow if the assistant’s response could be included directly in the run endpoint’s response. This feature would eliminate the need for additional calls to fetch the latest message, thereby enhancing efficiency and user experience.

I’m curious if others in the community have similar experiences or suggestions. Any insights on this would be greatly appreciated.


This is unlikely to happen. Inference takes time, and keeping connections open for anything beyond the minimum necessary period is bad practice and exposes additional attack vectors. Polling for a completed status is the standard best practice for lengthy operations.

What would you say is the most efficient way to retrieve the generated message? Previously I had my application check every second, but that felt way too slow, so I’ve reduced the check loop to every 0.5 seconds. I don’t want this to dramatically increase my API usage. Is there a rate limit or associated cost for calling the message list endpoint?
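One common compromise between responsiveness and request volume is exponential backoff: start with a short interval and grow it toward a cap. Here is a minimal, generic sketch; the function name `wait_for_status` and its parameters are my own, and `fetch_status` stands in for whatever call retrieves the run's status.

```python
import time

def wait_for_status(fetch_status, initial=0.5, factor=1.5, cap=5.0, timeout=120.0):
    """Poll fetch_status() until it returns a terminal state,
    backing off exponentially between checks up to `cap` seconds."""
    terminal = {"completed", "failed", "cancelled", "expired"}
    delay = initial
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status()
        if status in terminal:
            return status
        time.sleep(delay)
        delay = min(delay * factor, cap)  # grow the interval, but cap it
    raise TimeoutError("status did not become terminal before the timeout")
```

With these defaults, fast runs are noticed within about half a second, while long runs settle to one request every few seconds instead of two per second.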

Polling wouldn’t increase your API cost.
You can see how I do it in this Langroid code:

I have a “wait_for_run” and an async version of it.
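For the async case, the same polling pattern can be sketched with `asyncio`. This is an illustration of the general idea, not Langroid's actual implementation; `fetch_status` is assumed to be an async callable returning the run's current status.

```python
import asyncio

async def wait_for_run_async(fetch_status, interval=0.5, timeout=60.0):
    """Await fetch_status() repeatedly until a terminal state,
    enforcing an overall timeout via asyncio.wait_for."""
    terminal = {"completed", "failed", "cancelled", "expired"}

    async def _poll():
        while True:
            status = await fetch_status()
            if status in terminal:
                return status
            await asyncio.sleep(interval)  # yield to the event loop between checks

    return await asyncio.wait_for(_poll(), timeout)
```

The advantage of the async version is that the sleep between checks does not block the event loop, so other requests can be served while waiting.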
