Hi everyone,
I want to build a chatbot that answers customer-support questions. I have a dataset of 30,000 Q&A pairs, so I guess I should use embeddings to retrieve the relevant knowledge and feed it to the GPT model.
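In case it helps frame the question, here is a minimal sketch of the embedding-and-retrieval idea: embed every stored question once, then at query time rank pairs by cosine similarity and pass the top matches to the model as context. The `embed()` below is a toy bag-of-words stand-in, not a real embedding model; in practice it would be replaced by an embeddings API call.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding" -- placeholder for a real
    # embedding model; only here so the sketch runs end to end.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# A tiny stand-in for the 30,000-pair dataset.
qa_pairs = [
    ("How do I reset my password?", "Use the 'Forgot password' link."),
    ("What is your refund policy?", "Refunds are issued within 14 days."),
]

# Embed each stored question once, up front.
index = [(embed(q), q, a) for q, a in qa_pairs]

def retrieve(query, k=1):
    # Rank stored pairs by similarity to the query; return the top k.
    qv = embed(query)
    ranked = sorted(index, key=lambda e: cosine(qv, e[0]), reverse=True)
    return [(q, a) for _, q, a in ranked[:k]]

print(retrieve("how can I reset my password"))
```

The retrieved pairs would then be inserted into the prompt sent to the model, rather than "inserted into" the model itself.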
I also need the answers to be in a specific language and tone, and I want to prevent the bot from answering irrelevant queries, so I guess I should use fine-tuning too. Still, the question is: should I fine-tune the model on all 30,000 examples, or just a sample?
Another question: if I have to add a system prompt to every new query anyway, is fine-tuning necessary, or are embeddings sufficient?
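To make the system-prompt option concrete, here is a sketch of the prompt-only approach I'm asking about: a fixed system message carrying the language/tone/refusal rules, plus retrieved Q&A context, prepended to each user query. No fine-tuning involved. The function and prompt text are illustrative assumptions, not from any SDK.

```python
# Illustrative system prompt carrying tone, language, and refusal rules.
SYSTEM_PROMPT = (
    "You are a customer-support assistant. Answer only support "
    "questions, in formal English. If the query is off-topic, politely refuse."
)

def build_messages(query, retrieved_pairs):
    # Assemble the chat-style message list: one system message holding
    # the rules plus retrieved Q&A context, then the user's query.
    context = "\n".join(f"Q: {q}\nA: {a}" for q, a in retrieved_pairs)
    return [
        {"role": "system", "content": SYSTEM_PROMPT + "\n\nContext:\n" + context},
        {"role": "user", "content": query},
    ]

msgs = build_messages(
    "How do I reset my password?",
    [("How do I reset my password?", "Use the 'Forgot password' link.")],
)
print(msgs[0]["role"], len(msgs))
```

This message list would be sent with every request, which is the per-query overhead the question is weighing against a one-time fine-tune.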