You can train a classifier that labels each question as “0” = Simple, “1” = Open, or “2” = Decision, and then branch in the back end based on that label.
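A minimal sketch of what the label-based routing could look like. `classify_question` here is a stand-in for whatever trained classifier you end up with (it's just a trivial keyword heuristic so the dispatch is runnable); the handler names are hypothetical.

```python
def classify_question(question: str) -> int:
    # Placeholder for a real trained classifier; swap in your model here.
    q = question.lower()
    if "should" in q or "which" in q:
        return 2  # Decision
    if len(q.split()) > 12:
        return 1  # Open
    return 0      # Simple

# Hypothetical back-end handlers, one per question type.
def handle_simple(q):   return f"simple path: {q}"
def handle_open(q):     return f"open path: {q}"
def handle_decision(q): return f"decision path: {q}"

ROUTES = {0: handle_simple, 1: handle_open, 2: handle_decision}

def route(question: str) -> str:
    # Classify, then dispatch to the matching back-end path.
    return ROUTES[classify_question(question)](question)
```

The point is just the dispatch table: the classifier's output picks the path, and each path can use a different prompt, retriever, or model.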
Another approach is to build your own embedding-based classifier: embed all of these questions along with their 0/1/2 labels, and when a new question has high cosine similarity to an already-labeled question, send it down the corresponding path. This has the advantage of being tunable on the fly, since it is data-driven, unlike the lock-in you get from a fine-tune (see below).
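A sketch of that nearest-neighbor lookup, assuming you already have embeddings as vectors (the embedding model itself is out of scope here); the threshold value is an arbitrary placeholder you would tune.

```python
import numpy as np

def classify_by_similarity(q_emb, labeled_embs, labels, threshold=0.8):
    # Cosine similarity between the new question and every labeled question.
    a = q_emb / np.linalg.norm(q_emb)
    b = labeled_embs / np.linalg.norm(labeled_embs, axis=1, keepdims=True)
    sims = b @ a
    best = int(np.argmax(sims))
    if sims[best] >= threshold:
        return labels[best]      # confident match: reuse its label
    return None                  # no close neighbor: fall back to a default path
```

Because the decision comes from the labeled set at query time, retuning is just adding, removing, or relabeling rows in that set.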
The fine-tune locks you in and is hard to “untrain” if something changes. A dynamic, data-driven setup (database correlation → prompt) can be changed on the fly.
Your diversity in search is coming from MMR (maximal marginal relevance). That could be a problem too if you only have 50k words total, since it will put you in the weeds quickly. If your prompts are not diverse enough (e.g., you hit the context-window limit), make multiple API calls with the various prompt variations and consolidate on the back end.
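For reference, the standard MMR selection loop looks roughly like this: each pick trades off relevance to the query against redundancy with what was already picked, with `lam` controlling the balance (all names and the toy vectors here are illustrative).

```python
import numpy as np

def mmr(query_emb, doc_embs, k=3, lam=0.5):
    # Maximal Marginal Relevance: greedily pick k docs, balancing
    # query relevance against similarity to already-selected docs.
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    selected, candidates = [], list(range(len(doc_embs)))
    while candidates and len(selected) < k:
        def score(i):
            rel = cos(query_emb, doc_embs[i])
            red = max((cos(doc_embs[i], doc_embs[j]) for j in selected),
                      default=0.0)
            return lam * rel - (1 - lam) * red
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

Lower `lam` pushes harder for diversity, which on a small corpus is exactly how you end up in the weeds: after the first couple of picks, the “diverse” remainder is mostly off-topic.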
Or go with a keyword search as well to get more diversity in search.
But more diversity can lead to unstable answers, so you need to average and consolidate (more work).
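One cheap way to consolidate, sketched below as a majority vote over normalized answer strings; for free-text answers you would more likely cluster by embedding similarity first, but the shape of the step is the same.

```python
from collections import Counter

def consolidate(answers):
    # Majority vote over normalized answers; ties resolve to the
    # earliest answer, since Counter preserves insertion order.
    counts = Counter(a.strip().lower() for a in answers)
    top, _ = counts.most_common(1)[0]
    for a in answers:
        if a.strip().lower() == top:
            return a
```

This is the “more work” part: every extra API call buys diversity but adds another answer you have to reconcile.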
So less diversity on such a small corpus might be better? Maybe? You’d have to try.