Hi @info1158! You've got plenty of good advice here already. I'd just add, from experience building chatbots: it's usually best to start with the simplest/cheapest model until you hit a corner case that turns out to be important, and then either replace the model or evaluate a more complex model-routing configuration.
For example, it's not uncommon to have a first layer (usually implemented with the cheaper model) whose sole purpose is to decide whether the request only needs a simple response (which is then sent to the user) or should be routed to a more complex model. A third option is to route to a human. One example of routing to the more complex model: if the first layer determines that there is insufficient information, a more complex model (such as a reasoning one) might be more effective at disambiguating what the user means, or at replying with some options and asking the user to clarify.
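To make the routing idea a bit more concrete, here is a minimal Python sketch of that first layer. Everything in it is an assumption for illustration: `Route`, `Triage`, `handle`, and the `stub_*` functions are hypothetical names, and the stubs use keyword rules in place of real model calls, so you would swap in your own cheap model, complex model, and human hand-off.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Route(Enum):
    SIMPLE = "simple"    # the cheap model's own answer is good enough
    COMPLEX = "complex"  # hand over to a stronger (e.g. reasoning) model
    HUMAN = "human"      # escalate to a person


@dataclass
class Triage:
    route: Route
    reply: str = ""  # only meaningful when route is SIMPLE


def handle(
    message: str,
    triage: Callable[[str], Triage],
    complex_model: Callable[[str], str],
    escalate: Callable[[str], str],
) -> str:
    """First layer: let the cheap triage step decide where the message goes."""
    decision = triage(message)
    if decision.route is Route.SIMPLE:
        return decision.reply
    if decision.route is Route.HUMAN:
        return escalate(message)
    return complex_model(message)


# --- Hypothetical stand-ins; replace each with a real model call. ---

def stub_triage(message: str) -> Triage:
    text = message.lower()
    if "human" in text or "agent" in text:
        return Triage(Route.HUMAN)
    if len(text.split()) < 3:
        # Too little information: let the stronger model disambiguate.
        return Triage(Route.COMPLEX)
    return Triage(Route.SIMPLE, reply="We are open 9:00-17:00 on weekdays.")


def stub_complex_model(message: str) -> str:
    return f"Could you clarify what you mean by '{message}'? Here are some options..."


def stub_escalate(message: str) -> str:
    return "Connecting you with a member of our team."


if __name__ == "__main__":
    for msg in ["What are your opening hours today?", "refund?", "I want a human"]:
        print(msg, "->", handle(msg, stub_triage, stub_complex_model, stub_escalate))
```

Keeping the triage step behind a plain function signature means you can start with only the cheap model (always returning `Route.SIMPLE`) and add the other routes later, once an important corner case actually shows up.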