Hey! I’m creating a voice ChatBot for an educational event using OpenAI API with Whisper and Chat GPT 3.5 but I’m struggling with some issues refering to this task I which I’ll explain above:
In the event, the user will interact with one of three pre-selected youtube videos about education and after that will chat with the ChatBot for some time and talk about 5 to 10 questions with it.
For this purpose the chat has to know the context of the videos and it’s details, which is something around 3.000 tokens for all videos and adittional context. The chat will play a guessing game with the User about what country the user’s choosen video was situated.
In this case, I am in doubt about the best strategy for building this model (I’m using Python now). Is it better to use fine-tuning, or to pass the context with more tokens, or to load a .txt file with some information about everything and use a LangChain TextLoader or something like that…
What do you think? Would appreciate any help