Hi @jonah54,
I think your project will be difficult with the current specifications. There is (still) no way to fine-tune GPT-3.5 Turbo. Accordingly, you have to send every question you want to match against your stored answers to the model with each request. I made the mistake in the past of not accounting for this in my conceptual planning, so I hope sharing my earlier mistakes can help you.
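To make the token problem concrete, here is a minimal sketch assuming the `openai` Python package (v0.x `ChatCompletion` API); the `FAQ` list and the `naive_answer` helper are made up for illustration:

```python
import openai  # the v0.x client reads OPENAI_API_KEY from the environment

# Hypothetical stored Q&A pairs. Without fine-tuning, all of them must
# travel inside the prompt on EVERY request, so token cost grows with
# the size of your knowledge base.
FAQ = [
    ("How do I reset my password?", "Use the 'Forgot password' link on the login page."),
    ("What are your opening hours?", "Mon-Fri, 9:00-17:00 CET."),
]

def naive_answer(user_question: str) -> str:
    # Flatten the whole knowledge base into the system prompt.
    faq_text = "\n".join(f"Q: {q}\nA: {a}" for q, a in FAQ)
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system",
             "content": "Answer the user using only these Q&A pairs:\n" + faq_text},
            {"role": "user", "content": user_question},
        ],
    )
    return response.choices[0].message.content
```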
If you are looking for a low-token solution, I would suggest a two-step backend procedure (with the added advantage that you do not have to build separate solutions outside of the API); a sketch of both steps follows after the list:
1. You use a cheap, low-token model to match the user's question against your stored answers. The result of this step is the matched answer, which is NOT displayed to the user.
2. You send the user's query together with the matched answer from step 1 to a more capable (and more expensive) model, instructing it to answer the user's question using only that information and to refer any parts of the question it cannot answer to your service desk.
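Here is a minimal sketch of both steps, again assuming the `openai` v0.x Python package; the model names, the matching prompt, and the `FAQ` list are placeholders for illustration, not a definitive implementation:

```python
import openai

# Same hypothetical FAQ as in the sketch above.
FAQ = [
    ("How do I reset my password?", "Use the 'Forgot password' link on the login page."),
    ("What are your opening hours?", "Mon-Fri, 9:00-17:00 CET."),
]

def match_answer(user_question: str) -> str:
    """Step 1: cheap matching call. The stored answer it returns is never shown to the user."""
    numbered = "\n".join(f"{i}: {q}" for i, (q, _) in enumerate(FAQ))
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",   # assumed low-cost model for matching
        temperature=0,
        messages=[
            {"role": "system",
             "content": ("Reply with ONLY the number of the stored question that best "
                         "matches the user's question, or -1 if none match:\n" + numbered)},
            {"role": "user", "content": user_question},
        ],
    )
    try:
        idx = int(response.choices[0].message.content.strip())
    except ValueError:
        idx = -1  # the model replied with something other than a number
    return FAQ[idx][1] if 0 <= idx < len(FAQ) else ""

def compose_reply(user_question: str, matched_answer: str) -> str:
    """Step 2: a more capable (more expensive) model writes the user-facing reply."""
    context = matched_answer or "(no matching answer was found)"
    response = openai.ChatCompletion.create(
        model="gpt-4",           # assumed higher-cost model
        messages=[
            {"role": "system",
             "content": ("Answer the user's question using only this information: "
                         f"{context}. For anything it does not cover, refer the user "
                         "to the service desk.")},
            {"role": "user", "content": user_question},
        ],
    )
    return response.choices[0].message.content

question = "How do I reset my password?"
print(compose_reply(question, match_answer(question)))
```

The point of the split is that the bulky matching prompt only ever hits the cheap model, while the expensive model sees just one question and one answer.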
I hope this helps you on your journey, and best of luck with your undertaking,
Linus