Assistant functions: when to use, and how to chain?

Welcome to the community!

I’m a little rusty on assistants specifically because they don’t solve any use case of mine, but I can give you some general advice :slight_smile:

  1. IIRC, you add your messages to a thread; the run is just the execution of that thread. A completed run is what you want — once a run completes is when you'd add additional messages. So that seems OK.

  2. I think you can just leave it and make a new thread. There are discussions on how to delete threads (List and delete all threads), but a completed thread shouldn't incur additional costs just by sitting there.

Regarding stuff I do know:

  1. Continuing on 1.: I would recommend against putting multiple different things into the same thread, for three reasons:

    1. Costs explode: every time you generate a new response, you pay for all messages already in the thread. If you have a gigantic thread, you'll pay for all of those tokens even when they do nothing for you.
    2. Security: if multiple users share your service, it would be possible to extract previous users' queries and responses from the shared context. Probably not a good idea.
    3. Confusion: I always advocate keeping context as short and clean as possible (though there are other approaches, e.g. multi-shot prompting), because anything in your context can be a source of confusion that gets integrated into your response and manifests as a kind of hallucination.
  2. I would go for plain chat completions. That said, if the assistant framework feels ergonomic to you and helps you iterate faster, there's absolutely nothing wrong with using it. Under the hood (apart from document search and the Python interpreter), the Assistants API uses, and is billed exactly like, the Chat Completions API.
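To make the cost point concrete, here's a toy calculation (the token counts are made up for illustration; real billing depends on the model's tokenizer and pricing). Because each run re-sends the entire thread as input, the input tokens you pay for grow roughly quadratically with thread length:

```python
# Illustrative only: rough input-token math for a growing thread,
# assuming every run re-sends the full message history.

def cumulative_input_tokens(message_tokens: list[int]) -> int:
    """Total input tokens billed across all runs.

    message_tokens[i] is the (hypothetical) token count of the
    message added before run i.
    """
    total = 0
    history = 0
    for tokens in message_tokens:
        history += tokens  # the new message joins the context
        total += history   # the whole history is billed on this run
    return total

# Ten runs of 500-token messages:
print(cumulative_input_tokens([500] * 10))  # 27500
```

So ten 500-token exchanges in one thread bill ~27,500 input tokens, versus 5,000 if each query ran in a fresh, minimal context.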

I am using an assistant for this, so I don’t have to re-send the instructions about how to construct the query and how to evaluate the results each time

If you're hoping this will save you money, it won't: as mentioned above, you're billed for those instructions with every run anyway.

In terms of programming convenience, I can see it. But on the other hand, a system prompt is just a string — you can prepend it to your messages like everything else in the context.

I can see how it’s easier to think about in terms of a conversation - but you’re not really having conversations with a model. It’s all just fake and simulated.

I hope that answers your questions!