How can I add "knowledge" about a specific topic to a fine-tuned model?

I want to use GPT-3 with very specific data, for example about “falafel”. I want to submit hundreds of pages about the history of falafel, recipes, … and then generate content based on the input I gave. I also want the option to create more content from my submitted data later on. Is this possible?


Yes, this is possible, but your results might vary with just completion.

Here are some links on fine-tuning…

You might also look into the embeddings endpoint…

Hope this helps.


What I don’t understand about either example is how to get specialized knowledge into the fine-tuned model. After all, the examples only show how to pair a sample query with an ideal result. Where does the system get the “basic knowledge” about the subject area in the first place, from which it can then generate or query specific content?

A concrete example: I want to develop quiz questions about a city. I have documents about the city’s history, inhabitants, companies, notable people, … but the city is not prominent enough for the standard models to have information about it.

So as an example:

Generate 2 Quiz-Questions with 1 correct and 5 wrong answers about Berlin.

  1. What is the capital of Germany?
    A. Berlin
    B. Munich
    C. Cologne
    D. Hamburg
    E. Frankfurt

  2. In which year did Berlin become the capital of Germany?
    A. 1871
    B. 1789
    C. 1990
    D. 1945
    E. 1949

is great, but

Generate 2 Quiz-Questions with 1 correct and 5 wrong answers about the village Breuningsweiler.

Question 1: What is the population of Breuningsweiler?

A. 10.000
B. 5.000
C. 1.000
D. 20.000
E. 50.000
F. 500

Correct Answer: B. 5.000

Question 2: How many schools are in Breuningsweiler?

A. 4
B. 2
C. 10
D. 6
E. 8
F. 0

Correct Answer: B. 2

is completely wrong, because GPT-3 won’t have data about a village of fewer than 1,000 residents in southern Germany.

Would I have to create a prompt for each piece of information about the city, or is there a way to provide a lot of plain text (for example from the Wikipedia page about Breuningsweiler, the tourist information website, …) to give the system an information base?

You can fine-tune without a prompt, but GPT-3 is known to hallucinate even at this number of parameters, so it isn’t fully reliable as an expert on a specific domain.
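To make "fine-tune without a prompt" a bit more concrete, here is a minimal sketch of turning plain-text documents into training records with an empty prompt. It assumes the legacy prompt/completion JSONL layout for fine-tuning data; the helper names (`chunk_text`, `to_finetune_records`, `to_jsonl`), the 200-word chunk size and the leading space in each completion are my own illustrative choices, not something confirmed in this thread.

```python
import json


def chunk_text(text, max_words=200):
    """Split plain text into chunks of at most max_words words."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def to_finetune_records(documents, max_words=200):
    """Build one empty-prompt training record per text chunk."""
    records = []
    for doc in documents:
        for chunk in chunk_text(doc, max_words):
            # Assumption: empty prompt, completion starting with a space.
            records.append({"prompt": "", "completion": " " + chunk})
    return records


def to_jsonl(records):
    """Serialize the records as JSON Lines, one record per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)
```

You would pass your own documents (the Wikipedia text, the tourist information pages, …) to `to_finetune_records` and write the `to_jsonl` output to a file. Note that this only shapes the data; it does not by itself stop the model from hallucinating.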

Have you read up on fine-tuning and embeddings? There are a few posts on the forum here that might be helpful too.
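As a rough illustration of the embeddings idea: you split your documents into chunks, embed each chunk, pick the chunks most similar to the request, and paste them into the prompt so the model can work from your facts instead of guessing. The sketch below is only that, a sketch. The `embed` function is a toy bag-of-words stand-in for a real embeddings endpoint, and `cosine`, `top_chunks`, `build_prompt` and the prompt wording are hypothetical names and choices of mine.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embeddings endpoint: a bag-of-words count vector."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_chunks(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, chunks, k=2):
    """Put the most relevant chunks in front of the task as context."""
    context = "\n".join("- " + c for c in top_chunks(question, chunks, k))
    return "Use only the facts below.\n\nFacts:\n" + context + "\n\nTask: " + question
```

The string returned by `build_prompt` is what you would send to the completion endpoint. With a real embedding model in place of `embed`, the ranking matches on meaning rather than on shared words, but the overall flow stays the same.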

Good luck.
