I was surprised to see how well GPT-3 worked in Danish, a small language spoken by just 6 million people. Not flawlessly, though: It will switch to Norwegian or even English without proper prompt engineering, and only really performs well with the DaVinci model. It generally falls short of equivalent prompts in English, which makes sense.
Commercial tools like copy.ai and Jasper boast support for smaller languages as well, but the results are very similar to the vanilla GPT-3 experience.
My instinct has been to create a fine-tuned version of DaVinci. My previous attempts (with Curie) have been disappointing. In these cases, I followed the fine-tuning guidelines on "internal company jargon", which recommend empty prompts and long completions:
{"prompt":"", "completion":" <legal document>"}
{"prompt":"", "completion":" <company product catalogue>"}
The result was mostly gibberish, as if you had taken a sledgehammer to Curie. It seems like this person had a similar experience.
My intention is to use a much larger data set than before, but before I break the bank on this I'd like to do my homework properly, hence this post.
I've followed the advice in this post and have had success with better prompt engineering, but am still optimistic that a fine-tuned model trained on Danish data would be an improvement.
As I understand it, fine-tuning is mostly relevant to specific tasks that you want consistently executed (for lower cost and latency). Throwing 10,000 examples of Danish at it would be like retraining GPT-3, but that's not necessary. It knows Danish, and it knows it very well. Just not consistently well.
My current hypothesis is: Take chunks of text, break them in half, and serve the first part as the prompt and the second as the completion. I don't know if it's best practice to keep the prompt length the same as the completion length. Or how many tokens to use in each? Or whether to standardize it (e.g. "Always use 1,500 characters in the prompt and 1,500 in the completion") or make it random:
{"prompt":"<start of paragraph in Danish>", "completion":" <end of paragraph in Danish>"}
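For illustration, the splitting idea could be sketched in Python roughly like this. It is only a sketch under assumptions: the names `split_chunk` and `to_jsonl`, splitting on whitespace rather than tokens, and the 0.3 to 0.7 range for the random split point are all hypothetical choices, not something taken from the fine-tuning guidelines. The leading space in the completion simply mirrors the examples above.

```python
import json
import random


def split_chunk(text, ratio=0.5):
    """Split a chunk of text into a prompt/completion pair at a word boundary.

    `ratio` is the (approximate) share of words that go into the prompt.
    """
    words = text.split()
    if len(words) < 2:
        raise ValueError("need at least two words to split")
    # Clamp the cut so that both halves contain at least one word.
    cut = min(max(1, int(len(words) * ratio)), len(words) - 1)
    prompt = " ".join(words[:cut])
    # Completion starts with a space, as in the examples above.
    completion = " " + " ".join(words[cut:])
    return {"prompt": prompt, "completion": completion}


def to_jsonl(chunks, randomize=False, seed=0):
    """Turn a list of text chunks into JSONL, one training example per line.

    With randomize=True the split point varies per chunk (assumed range
    0.3 to 0.7); otherwise every chunk is split in half.
    """
    rng = random.Random(seed)
    lines = []
    for chunk in chunks:
        ratio = rng.uniform(0.3, 0.7) if randomize else 0.5
        # ensure_ascii=False keeps Danish letters (æ, ø, å) readable.
        lines.append(json.dumps(split_chunk(chunk, ratio), ensure_ascii=False))
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_jsonl(["Der var engang en konge som boede i et slot"]))
```

Whether a fixed or a random split works better is exactly the open question; the `randomize` flag just makes it cheap to produce both variants of the data set and compare.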
If anyone has tried something similar (or can make sense of my ramblings), I'm hoping to hear from you and pick up any lessons you have to share.