Function calling through the Assistants API: incessant polling

Hello,

I’ve been building an application using the Assistants API and I’m currently trying to optimize how quickly we can get a response back to the user - unfortunately I’m regularly losing 15s+ polling the API after submitting the tool response.

I’ve followed the instructions here exactly (https://platform.openai.com/docs/assistants/tools/function-calling), except that I define the assistants in the OpenAI dashboard instead of in my code.

Here’s an example; the POST request is the submission of my function’s output:

2024-11-24 19:16:20,729 INFO     HTTP Request: POST https://api.openai.com/v1/threads/thread_KBxCEydBEwzgAu7WamCT1qK3/runs/run_N7JSwDFjFP7oJMgtAZZ7teiE/submit_tool_outputs "HTTP/2 200 OK"
2024-11-24 19:16:20,966 INFO     HTTP Request: GET https://api.openai.com/v1/threads/thread_KBxCEydBEwzgAu7WamCT1qK3/runs/run_N7JSwDFjFP7oJMgtAZZ7teiE "HTTP/2 200 OK"
2024-11-24 19:16:21,767 INFO     HTTP Request: GET https://api.openai.com/v1/threads/thread_KBxCEydBEwzgAu7WamCT1qK3/runs/run_N7JSwDFjFP7oJMgtAZZ7teiE "HTTP/2 200 OK"
2024-11-24 19:16:22,493 INFO     HTTP Request: GET https://api.openai.com/v1/threads/thread_KBxCEydBEwzgAu7WamCT1qK3/runs/run_N7JSwDFjFP7oJMgtAZZ7teiE "HTTP/2 200 OK"
2024-11-24 19:16:23,273 INFO     HTTP Request: GET https://api.openai.com/v1/threads/thread_KBxCEydBEwzgAu7WamCT1qK3/runs/run_N7JSwDFjFP7oJMgtAZZ7teiE "HTTP/2 200 OK"
2024-11-24 19:16:24,062 INFO     HTTP Request: GET https://api.openai.com/v1/threads/thread_KBxCEydBEwzgAu7WamCT1qK3/runs/run_N7JSwDFjFP7oJMgtAZZ7teiE "HTTP/2 200 OK"
2024-11-24 19:16:24,911 INFO     HTTP Request: GET https://api.openai.com/v1/threads/thread_KBxCEydBEwzgAu7WamCT1qK3/runs/run_N7JSwDFjFP7oJMgtAZZ7teiE "HTTP/2 200 OK"
2024-11-24 19:16:25,657 INFO     HTTP Request: GET https://api.openai.com/v1/threads/thread_KBxCEydBEwzgAu7WamCT1qK3/runs/run_N7JSwDFjFP7oJMgtAZZ7teiE "HTTP/2 200 OK"
2024-11-24 19:16:26,449 INFO     HTTP Request: GET https://api.openai.com/v1/threads/thread_KBxCEydBEwzgAu7WamCT1qK3/runs/run_N7JSwDFjFP7oJMgtAZZ7teiE "HTTP/2 200 OK"
2024-11-24 19:16:27,267 INFO     HTTP Request: GET https://api.openai.com/v1/threads/thread_KBxCEydBEwzgAu7WamCT1qK3/runs/run_N7JSwDFjFP7oJMgtAZZ7teiE "HTTP/2 200 OK"
2024-11-24 19:16:28,087 INFO     HTTP Request: GET https://api.openai.com/v1/threads/thread_KBxCEydBEwzgAu7WamCT1qK3/runs/run_N7JSwDFjFP7oJMgtAZZ7teiE "HTTP/2 200 OK"
2024-11-24 19:16:29,315 INFO     HTTP Request: GET https://api.openai.com/v1/threads/thread_KBxCEydBEwzgAu7WamCT1qK3/runs/run_N7JSwDFjFP7oJMgtAZZ7teiE "HTTP/2 200 OK"
2024-11-24 19:16:30,135 INFO     HTTP Request: GET https://api.openai.com/v1/threads/thread_KBxCEydBEwzgAu7WamCT1qK3/runs/run_N7JSwDFjFP7oJMgtAZZ7teiE "HTTP/2 200 OK"
2024-11-24 19:16:30,953 INFO     HTTP Request: GET https://api.openai.com/v1/threads/thread_KBxCEydBEwzgAu7WamCT1qK3/runs/run_N7JSwDFjFP7oJMgtAZZ7teiE "HTTP/2 200 OK"
2024-11-24 19:16:31,725 INFO     HTTP Request: GET https://api.openai.com/v1/threads/thread_KBxCEydBEwzgAu7WamCT1qK3/runs/run_N7JSwDFjFP7oJMgtAZZ7teiE "HTTP/2 200 OK"
2024-11-24 19:16:32,900 INFO     HTTP Request: GET https://api.openai.com/v1/threads/thread_KBxCEydBEwzgAu7WamCT1qK3/runs/run_N7JSwDFjFP7oJMgtAZZ7teiE "HTTP/2 200 OK"
2024-11-24 19:16:33,718 INFO     HTTP Request: GET https://api.openai.com/v1/threads/thread_KBxCEydBEwzgAu7WamCT1qK3/runs/run_N7JSwDFjFP7oJMgtAZZ7teiE "HTTP/2 200 OK"

That’s a 13s wait just for the API to acknowledge receiving the output and add it to the assistant’s context, I assume.

Is there any way I can speed this up? Would using the Chat Completions API be a better option for me at this point, while the Assistants API is still in beta?

Thanks!

That is likely the time for the AI to generate the response.

Add instructions like: “Your maximum response length is a single short sentence.”
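
Since the assistants are defined in the dashboard, that constraint can also be attached per run rather than editing the assistant itself. A minimal sketch with a sync client, assuming thread_id and assistant_id are already known (illustrative names):

from openai import OpenAI

client = OpenAI()

run = client.beta.threads.runs.create(
    thread_id=thread_id,
    assistant_id=assistant_id,
    # Appended to the assistant's dashboard instructions for this run only.
    additional_instructions="Your maximum response length is a single short sentence.",
)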

Instead of polling, you can use streaming and receive the output as it is being generated, which makes the app feel more responsive.
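
For the tool-output step in your logs specifically, the Python SDK has a streaming variant of the submission call, so the SDK doesn’t have to issue that GET loop. A minimal sync sketch, assuming thread_id, run_id, tool_call_id, and output_json are already in hand (those names are illustrative):

from openai import OpenAI

client = OpenAI()

# Stream the remainder of the run instead of polling GET /runs/{run_id}.
with client.beta.threads.runs.submit_tool_outputs_stream(
    thread_id=thread_id,
    run_id=run_id,
    tool_outputs=[{"tool_call_id": tool_call_id, "output": output_json}],
) as stream:
    for text in stream.text_deltas:
        print(text, end="", flush=True)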

There is no response for it to generate; it only needs to return exactly what my function sent it.
I am using streaming:

async with client.beta.threads.runs.stream(
    thread_id=thread_id,
    assistant_id=frontman_id,
    event_handler=event_handler,
) as stream:
    await stream.until_done()  # wait for the streamed run to finish
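
One thing worth checking: runs.stream only covers the initial run. If the event handler then submits the tool output via a polling helper such as submit_tool_outputs_and_poll, the SDK will produce exactly the GET loop in the logs above. A sketch of handling requires_action inside the handler with the streaming submission instead, assuming an AsyncOpenAI client and with my_function standing in for your tool (illustrative):

from openai import AsyncOpenAI, AsyncAssistantEventHandler

client = AsyncOpenAI()

class EventHandler(AsyncAssistantEventHandler):
    async def on_event(self, event) -> None:
        if event.event == "thread.run.requires_action":
            calls = event.data.required_action.submit_tool_outputs.tool_calls
            outputs = [{"tool_call_id": c.id, "output": my_function(c)} for c in calls]
            # Submit the outputs as a stream rather than a poll loop.
            async with client.beta.threads.runs.submit_tool_outputs_stream(
                thread_id=self.current_run.thread_id,
                run_id=event.data.id,
                tool_outputs=outputs,
                event_handler=EventHandler(),
            ) as stream:
                await stream.until_done()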

@edwinarbus Hi Edwin - can someone from OpenAI provide some feedback here please?