I like what you’re working towards.
For something as simple as classifying whether something is cute or real aggression, you would have much better success with Davinci, or with one of the other models and your own fine-tuning. These obstacles are there on purpose in cGPT.