For my no-code app builder I will have a bunch of different runs going against different assistants (each with specialized prompting), running in parallel, all within the same overarching project.
The API seems to allow different ways to do messages and runs (create a message on a thread first and then run it, or create a run with its own additional messages, etc.). Not sure that's relevant to my question, which is:
Can I have multiple runs concurrently active on the same thread? The way I'm currently using the API, I get an error that I can't add messages to a thread while a run is active. Is this a blanket one-run-at-a-time-per-thread rule, or is it possible to have multiple parallel runs if I use the API differently?
Apologies if this is unclear; I'm trying to wrap my head around this, and the API keeps growing in terms of options.
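For reference, the usual workaround for the one-active-run-per-thread restriction is to give each parallel run its own thread object. A minimal sketch is below; the `start_run` callable is a stand-in (an assumption, not from this thread) for the real sequence of API calls (create thread, add message, create run, poll to completion), so the concurrency structure can be shown without network access:

```python
from concurrent.futures import ThreadPoolExecutor

def run_assistant(start_run, assistant_id, prompt):
    # Each parallel job gets its own (fresh) API thread object, so the
    # "can't add messages while a run is active" rule is never hit.
    return start_run(assistant_id, prompt)

def run_all_in_parallel(start_run, jobs, max_workers=4):
    # jobs: list of (assistant_id, prompt) pairs, one run per assistant.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_assistant, start_run, a, p) for a, p in jobs]
        # Results come back in submission order, one per job.
        return [f.result() for f in futures]

# Demo with a fake backend standing in for the real OpenAI client:
fake_backend = lambda assistant_id, prompt: f"{assistant_id}:{prompt}"
results = run_all_in_parallel(fake_backend, [("asst_1", "hi"), ("asst_2", "yo")])
```

In real use, `start_run` would wrap the SDK's thread/message/run calls; the point is only that parallelism happens across threads, never within one.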
You’d most likely be better off using ChatCompletions
Oh my (slaps forehead, considering the sunk cost) – I had presumed assistants/threads/runs to be the better option; now I see Chat Completions does tool calls (and otherwise seems as featureful), so perhaps I went the wrong direction. If so, a tangent question might be: which use cases would prefer assistants/threads/runs?
Thanks for the pointer. Any guess whether Chat Completions might be faster? My round trip (add message, run it, answer the tool call, wait for run completion) can take anywhere from 2 to 15 seconds, and I was getting concerned about latency.
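That round trip is essentially a poll-until-terminal loop. A sketch of its shape follows; `get_status` and `submit_tool_outputs` are stand-ins (assumed names, not real SDK calls) for the run-retrieve and tool-output-submission requests, so the loop itself is runnable as-is:

```python
import time

# Run states that end the loop; "requires_action" means a tool call
# is pending and must be answered before the run can proceed.
TERMINAL = {"completed", "failed", "cancelled", "expired"}

def wait_for_run(get_status, submit_tool_outputs, poll_interval=0.5):
    while True:
        status = get_status()
        if status == "requires_action":
            submit_tool_outputs()      # answer the pending tool call
        elif status in TERMINAL:
            return status              # done (successfully or not)
        else:
            time.sleep(poll_interval)  # queued / in_progress: keep polling

# Demo with a scripted status sequence instead of live API calls:
statuses = iter(["queued", "requires_action", "completed"])
answered = []
final = wait_for_run(lambda: next(statuses),
                     lambda: answered.append("tool_call"),
                     poll_interval=0)
```

Each `time.sleep` in that loop adds latency on top of model time, which is one reason the 2-15 s spread appears; Chat Completions removes the polling but not the model time.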
ChatCompletions is a component used by Assistants. With Assistants, OpenAI maintains persistent state for the assistant, along with the conversations, and manages context length and RAG for you.
Assistants are great for bootstrapping. If they're sufficient, they're sufficient. If you find you need greater control, then moving to ChatCompletions may be ideal.
I use a combination of both.
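For a sense of what the Chat Completions side looks like: with it you own the message history and append tool results yourself, instead of the API managing a thread for you. A rough sketch of the request shape (the `get_weather` function and its schema are made up for illustration):

```python
import json

# Tool schema in the Chat Completions format; get_weather is a
# made-up example function, not something from this thread.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# You keep this list yourself and pass it on every request,
# e.g. client.chat.completions.create(model=..., messages=messages, tools=tools)
messages = [{"role": "user", "content": "What's the weather in Paris?"}]

def append_tool_result(messages, tool_call_id, name, result):
    # After the model returns a tool call, you append the tool's output
    # as a "tool" role message and send the whole history back.
    messages.append({
        "role": "tool",
        "tool_call_id": tool_call_id,
        "name": name,
        "content": json.dumps(result),
    })
    return messages
```

The trade-off the posts above describe is visible here: more bookkeeping (history, context length, retrieval are yours to manage), but full control over every step.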
Yes, since it's just a single component of many. That said, speed in my opinion shouldn't be the dominating factor; Assistants can prevent a lot of headaches.