I am working on a project for my final year of university. The aim is for ChatGPT to generate questions and answers based on information provided by a Computer Science textbook. Should I be using embeddings to train the model on this new textbook data and then fine-tune the model to provide questions and answers in a suitable format?
If the book in question was published before September 2021 (gpt-3.5-turbo's training cutoff), the model likely has some knowledge of it already and should be able to generate Q&A pairs about the subjects it covers.
However, models tend to hallucinate, so it would be a good idea to use embeddings to check each generated answer against the relevant textbook passages. Note that embeddings don't "train" the model; they give you a semantic-search index over the text that you can use to ground and verify the output.
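As a rough illustration, here's a minimal sketch of that verification step, assuming the `openai` Python package (pre-1.0 interface), the `text-embedding-ada-002` model, and a `passages` list you've already extracted from the book. The threshold and function names are placeholders, not recommendations:

```python
import numpy as np
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

def embed(text: str) -> np.ndarray:
    """Return the ada-002 embedding for a piece of text."""
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=text)
    return np.array(resp["data"][0]["embedding"])

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def answer_is_grounded(answer: str, passages: list[str], threshold: float = 0.85) -> bool:
    """Treat an answer as grounded if it is close to at least one textbook passage.

    0.85 is an illustrative starting point; tune it on your own data.
    """
    answer_vec = embed(answer)
    return any(cosine(answer_vec, embed(p)) >= threshold for p in passages)
```

In practice you'd embed the passages once and cache them rather than re-embedding on every check, but the idea is the same.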
I'd also recommend supplying specific topics to generate Q&A about, rather than simply prompting the model to come up with some on its own, along these lines:
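A sketch of topic-driven generation with gpt-3.5-turbo, again assuming the pre-1.0 `openai` package; the prompt wording and topic list are purely illustrative:

```python
import openai

def generate_qna(topic: str) -> str:
    """Ask the model for one question/answer pair on a given topic."""
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You write exam-style questions with answers."},
            {"role": "user", "content": f"Write one question and its answer about: {topic}"},
        ],
        temperature=0.7,
    )
    return resp["choices"][0]["message"]["content"]

for topic in ["binary search trees", "TCP congestion control"]:
    print(generate_qna(topic))
```

Driving generation from a topic list you control also makes it easy to cover the whole syllabus evenly instead of whatever the model happens to pick.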
Fine-tuning is recommended when you want to reduce prompt size or need a specific style of response from the model at high volume; OpenAI's fine-tuning guide lists the common use cases.
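If you do go that route, chat-model fine-tuning expects a JSONL file of chat-formatted examples. A sketch of preparing one in Python; the example pair is made up:

```python
import json

# Each training record is a full conversation showing the style you want.
examples = [
    {
        "messages": [
            {"role": "system", "content": "You write exam-style questions with answers."},
            {"role": "user", "content": "Topic: stacks"},
            {"role": "assistant", "content": "Q: What ordering does a stack enforce? A: Last in, first out (LIFO)."},
        ]
    },
]

with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```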