What is the right way to configure an assistant?

I would like to configure an assistant that would help users with one specific task:
Our service has over a hundred document templates and the assistant should help the user choose the right template according to their use case. That is, the user describes their case, and the assistant selects a template for them.

I tried adding the template files, with their text and some metadata, to “file_search”, and in general this approach works great: in most cases the answers recommend the right template. The problem is that each request consumes a huge number of tokens (about 20K), which costs about $0.15 per request. Unfortunately, that is too high a price considering the load we expect.
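To put the numbers in perspective, here is a rough back-of-the-envelope calculation. The 20K tokens and $0.15 per request are the approximate figures from my tests; the daily request volumes are purely hypothetical placeholders, not our real load.

```python
# Approximate figures observed in my tests.
TOKENS_PER_REQUEST = 20_000
COST_PER_REQUEST_USD = 0.15


def implied_price_per_million_tokens(tokens: int, cost_usd: float) -> float:
    """Blended price per 1M tokens implied by one request's size and cost."""
    return cost_usd / tokens * 1_000_000


def monthly_cost(requests_per_day: int, cost_per_request: float, days: int = 30) -> float:
    """Projected spend over `days` days at a constant daily request volume."""
    return requests_per_day * cost_per_request * days


if __name__ == "__main__":
    price = implied_price_per_million_tokens(TOKENS_PER_REQUEST, COST_PER_REQUEST_USD)
    print(f"Implied blended price: ${price:.2f} per 1M tokens")

    # Hypothetical daily volumes, only to show how the cost scales.
    for requests_per_day in (100, 1_000, 10_000):
        total = monthly_cost(requests_per_day, COST_PER_REQUEST_USD)
        print(f"{requests_per_day:>6} requests/day -> ${total:,.2f} per 30 days")
```

Since the cost scales linearly with tokens per request, cutting the context sent per request should reduce the bill by roughly the same factor.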

As far as I understand, I could fine-tune the model, but that does not seem like the right fit for this case, since I would need to generate many question-answer pairs to do it.

What are the options to solve my problem?
Is there a way to make file search less expensive?
Or is there a completely different approach to building such an assistant?
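To make the last question more concrete, this is the kind of alternative I have in mind: shortlist a few candidate templates locally, then send the model only their short descriptions instead of the full template text. The sketch below is just an illustration under my own assumptions. The template ids, descriptions, stopword list, and function names are all hypothetical, and plain keyword overlap is only a stand-in for whatever retrieval step would really be used (embeddings, for example). It does not call any API; it only builds the smaller prompt.

```python
import re

# Hypothetical catalogue: template id -> short description (placeholder data).
TEMPLATES = {
    "nda_mutual": "mutual non-disclosure agreement between two companies",
    "lease_residential": "residential lease agreement between landlord and tenant",
    "invoice_basic": "basic invoice for goods or services sold to a customer",
}

# Tiny illustrative stopword list so that filler words do not drive the ranking.
STOPWORDS = {"a", "an", "and", "the", "to", "for", "of", "or", "in", "my", "i", "between"}


def keywords(text: str) -> set[str]:
    """Lowercase the text and return its distinct non-stopword tokens."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def shortlist(query: str, templates: dict[str, str], k: int = 3) -> list[str]:
    """Return up to k template ids ranked by keyword overlap with the query."""
    query_words = keywords(query)
    scored = [(len(query_words & keywords(desc)), tid) for tid, desc in templates.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best overlap first, then by id
    return [tid for overlap, tid in scored[:k] if overlap > 0]


def build_prompt(query: str, templates: dict[str, str], candidates: list[str]) -> str:
    """Build a compact prompt containing only the shortlisted descriptions."""
    listing = "\n".join(f"- {tid}: {templates[tid]}" for tid in candidates)
    return (
        f"User case: {query}\n"
        f"Candidate templates:\n{listing}\n"
        "Reply with the id of the best matching template."
    )


if __name__ == "__main__":
    case = "I need to rent my apartment to a tenant"
    picked = shortlist(case, TEMPLATES)
    print(build_prompt(case, TEMPLATES, picked))
```

With something like this, the model would see a few hundred tokens per request rather than about 20K, but I do not know whether the recommendation quality would hold up compared to file search.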