I need to train the GPT model on my University courses for the students. The data is in the form of handouts, slides, videos, assignments, quizes etc. Please guide about the best practical successfully approach to acheive this task with minimum budget, reliable response and control.
I have explore these option but still confused about the right directions to acheive the ultimate goal.
- Create/Maintain/Build our own vector database (i.e. PineCone) of training material and use the open-source frameworks to push data (e.g., PDF/text) into the database and query the data using the OpenAI GPT model.
- Fine-tune OpenAI model, which is using JSONL data for training (OPenAI Own Vector Database Backend).
- Utilize the OpenAI Assistant API beta, which accepts files (OPenAI Own Vector Database Backend).
Please share your valuable suggestions.