Fine-tuning davinci: wrong output

Hey guys, I’m fine-tuning davinci and I don’t know how to teach it a particular thing. I’m trying something very simple. Given: `Peter Park | England`, I want the completion `Peter Park lives in London, 6` (where London is the capital of the input country and 6 is the length of the capital’s name).
It’s working correctly with the capital output, but it never gets the length right. The training dataset has 840 examples, yet it still guesses random numbers. How would I train it to understand that the final number is the length of the capital’s name?

Dataset example: {"prompt":"John Smith | Armenia ->","completion":" John Smith lives in Yerevan, 7 #END"}


Welcome to the community.

Fine-tuning is super-finicky and because it’s the older Davinci, it’s even more troublesome at times. Add to that LLMs not being great at counting (without help), and it can be difficult.

In your case, I might try something like…

England (7 characters)
Peter Park lives in London (6 characters)

etc., in your dataset examples.

Or maybe…

England (7 letters)
Peter Park lives in London (6 letters)

etc. You’re asking it to infer that the number is tied to the city name, but you’re not giving it enough clues to make that connection. Try giving it a bit more data in your prompt/examples.
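A minimal sketch of generating training lines in that style, with the character count spelled out as an explicit clue. The `CAPITALS` lookup and `make_example` helper are hypothetical stand-ins for however you actually build your dataset:

```python
import json

# Hypothetical lookup -- replace with your real country -> capital data.
CAPITALS = {"England": "London", "Armenia": "Yerevan"}

def make_example(name: str, country: str) -> str:
    """Build one JSONL training line with the length spelled out as a clue."""
    capital = CAPITALS[country]
    prompt = f"{name} | {country} ->"
    # Embed the character count next to the city name so the model sees
    # the link between the two, instead of having to infer it.
    completion = f" {name} lives in {capital} ({len(capital)} characters) #END"
    return json.dumps({"prompt": prompt, "completion": completion})

print(make_example("Peter Park", "England"))
```

Since the count is computed by `len()` at dataset-generation time, every training example is guaranteed to be consistent, which is the property you want the model to pick up on.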

LLMs are bad at counting characters in general, because they “think” in tokens, not characters.

If you ask ChatGPT 4 to limit length to a certain number of characters, it will also fail at this, so the problem is not fine-tuning specific.

You would be better off getting the completions in a format where you can simply use a character counting function on it afterward, in your programming language of choice. So you want your completion to return something structured like JSON, e.g.:

   {"capital": "London"}

Then extract the value using a JSON library and count its characters.
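For example, in Python that post-processing step is a couple of lines (the completion string here is illustrative):

```python
import json

# Suppose the model's completion is a small JSON object like this.
completion = '{"capital": "London"}'

data = json.loads(completion)
capital = data["capital"]
# Count the characters deterministically instead of asking the model to.
length = len(capital)
print(f"{capital}, {length}")  # -> London, 6
```

This way the number is always correct by construction, no matter what the model was (or wasn't) trained on.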

Try to use LLMs for what they are best at, and traditional methods for everything else :slightly_smiling_face: