Fine tuning for comment rating


I’m trying a little for-fun project where I want to fine-tune a model that guesses the ratings of comments on a Danish scientific forum. I’ve just finished fine-tuning curie on 500 comments for 4 epochs; however, the inference results are pretty garbage.
My data has the format: `{"prompt": "COMMENT\n\n###\n\n", "completion": " LIKES - DISLIKES"}`.
Two examples of completions after fine-tuning are “1 - 2 - 0 - 0 - 0 - 2 - 0 - 2” and “1 - 15 - 0 - 0 - 9 - 0 +”, where the correct ratings are " 0 - 9" and " 10 - 1".
I’m wondering if the lack of a stop token or natural text in my completions is causing the issues, or if I just need more fine-tuning.
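For what it’s worth, here is a minimal sketch of how the training records could be serialized with a fixed stop sequence appended to every completion, so the model learns where to stop. The separator and the `" END"` stop string are assumptions for illustration, not something from the original data:

```python
import json

SEPARATOR = "\n\n###\n\n"   # marks the end of the prompt (assumed convention)
STOP = " END"               # assumed stop sequence appended to every completion

def make_example(comment: str, likes: int, dislikes: int) -> str:
    """Serialize one comment into a JSONL training record.

    The completion starts with a leading space (token-boundary convention)
    and ends with the stop sequence so generation can be cut off cleanly.
    """
    record = {
        "prompt": comment + SEPARATOR,
        "completion": f" {likes} - {dislikes}{STOP}",
    }
    return json.dumps(record, ensure_ascii=False)

# One JSONL line per training example:
line = make_example("Spændende artikel om kvantefysik!", 10, 1)
```

At inference time you would then pass the same stop sequence (e.g. `stop=[" END"]` in the completion request) so the model doesn’t keep emitting extra numbers.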
Also, does anyone know how capable the other models are in other languages? ChatGPT is pretty good at Danish.

Thanks a lot

Your prompt and completion must be meaningful when you read them; the more sensible they are, the better fine-tuning will work. GPT models work by predicting what is most likely to come next. Use chain of thought, like this:

Comment: Quantum entanglement, a phenomenon in quantum physics, allows particles to be connected in such a way that the state of one can instantaneously affect the state of the other, regardless of the distance between them.

Completion: Newton wouldn’t like such a claim.

Hi @Snanu. Welcome to the forum. In general, tasks like these are better suited to classification models than to an LLM, which, even after fine-tuning, will gather little understanding of why a particular number of likes or dislikes has been assigned to a comment.

In terms of prompting, a more verbose prompt would be of better use, as it will help the fine-tuned model get a bit more context about the problem.
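As a sketch of what “more verbose” could mean here, a template that states the task and the expected output shape before the comment might look like this (the wording and the `{comment}` placeholder are assumptions for illustration):

```python
# Hypothetical verbose prompt template; the instruction text is an assumption,
# only the trailing "\n\n###\n\n" separator comes from the original format.
PROMPT_TEMPLATE = (
    "Rate the following comment from a Danish science forum.\n"
    "Predict its rating in the form 'LIKES - DISLIKES'.\n\n"
    "Comment: {comment}\n\n###\n\n"
)

prompt = PROMPT_TEMPLATE.format(comment="Spændende pointe om relativitet.")
```

Whatever template is used for fine-tuning must be reproduced exactly at inference time, separator included.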

Thanks for the replies

Yes, this is more of a classification/sentiment-analysis problem. I just looked at the fine-tuning documentation, where an example categorizes e-mails and the output is only a number.
As far as I can see, they still use an LLM, and they do not add any natural text or end token to the completion.

Also, the ratings come from other users.

I suggest following these best practices; they’re written by OpenAI.