Completions API v/s Threads

In case of scenarios like processing 1000 records with OpenAI summary with chat completion. What is the best practice? running in different python process threads with “Completion API” v/s 2. create assistants/threads parallelly and threads will run in the background. Summarize the all the by collection thread.run messages after thread completion…Can you please provide insights here.Thank you

you mean chat completions api or assistants api, right? completions api is the legacy api. do you need to get the result right away? if not, you can use batch api. check the limits if it can suit your need.

without batch api…can we use assistant with muliple threads tro execute ? will assistant use less tokens when compared to completions API ?

your main problem is going over your tier limits. depending on the model, check your request per day, request per minute, tokens per minute. you can certainly use assistants api. it will not necessarily use less tokens. it actually has the potential to increase your token usage if you use file search. though there are token control properties that you can use.

1 Like

Assistants has a much lower API call per minute limit, starting at 60, with some reporting that they’ve been adjusted to 200 or so, perhaps an unwritten tier level. The mention of this limit has also been stricken from documentation.

That makes this beta product unsuitable for the much greater level of usage your tier level might otherwise suggest.

I would consider Chat Completions as not “legacy”, but “essential”. API calls to it is probably what Assistants is outputting exactly. There is no point in the overhead of a thread and multiple API calls to get a single response if you are not building a user chat and tool use history. Chat Completions is also the method by which you would batch process such a text transformation task overnight.

In Assistants, you should be able to approach the same token cost and same token input and output if using none of the built-in tool features for which you would use Assistants, and also immediately discard the thread.