I am trying to classify social media posts into ads and non-ads, in principle a basic task at which GPT models excel.
Using a neutral prompt, the bigger models (davinci and curie) strongly prefer judging False/no-ad whenever a case is not perfectly clear-cut. Similarly, the small models strongly prefer classifying True/is-ad. In both cases I really have to bend the models to come somewhat close to 50:50, e.g. by adding “if you are uncertain, always prefer True” (or False, for the small models).
With the ChatGPT API this has become even more severe: ChatGPT (behaving like davinci) basically refuses to classify posts as ads.
This is what my prompt for ChatGPT looks like right now:
If you notice the slightest indication that there might be any chance it could contain a (potentially non-obvious) promotion of a product, service, partnership, even if the promotion is not commercial or not tied to a specific vendor, let humans have a look at by returning “True”. Return “True” even if you are not certain. Always return “True” if business accounts are linked or products/services mentioned, even if there is not indication of a partnership. If you think it is very unlikely that it contains (potentially non-obvious) promotion of a product, service, partnership or anything the like and there are no business or specific products mentioned and there is no need for a human to crosscheck, predict “False”. If you are uncertain, err strongly towards “True”.
On a dataset of 50:50 ads and non-ads, this prompt (which in my opinion is worded very extremely towards getting “True”) still has a slight preference for returning “False” over “True”, although with this extreme prompt the preference is only small. Minor changes that make the wording less biased towards “True” result in 80/20 False/True ratios. ChatGPT is even more strongly biased here than davinci and curie, but it follows the same pattern. The same applies in reverse (albeit less extremely) to the small models, which always predict True, even for non-ads.
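To make the ratios above concrete, here is a minimal sketch of how such a skew could be quantified once the model outputs have been collected. It is not my actual pipeline: the function name `prediction_skew` and the toy data are made up for illustration, and no API call is involved.

```python
from collections import Counter

def prediction_skew(labels, predictions):
    """Compare the share of True in the labels with the share of True in the predictions.

    Hypothetical helper for illustration; both arguments are lists of booleans.
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        raise ValueError("need at least one example")

    n = len(labels)
    # Count each (label, prediction) pair; missing pairs count as 0.
    counts = Counter(zip(labels, predictions))
    tp = counts[(True, True)]    # ads flagged as ads
    fn = counts[(True, False)]   # ads the model missed
    fp = counts[(False, True)]   # non-ads flagged as ads

    return {
        "true_share_in_labels": (tp + fn) / n,
        "true_share_in_predictions": (tp + fp) / n,
        "missed_ads": fn,
        "false_alarms": fp,
    }

# Made-up toy data: a balanced set on which the model mostly answers False.
labels = [True] * 5 + [False] * 5
predictions = [True, False, False, False, False] + [False, False, False, False, True]

print(prediction_skew(labels, predictions))
```

On this toy data the labels are balanced while only 20% of the predictions are True, which is the kind of 80/20 False/True split described above. Reporting missed ads and false alarms separately shows whether a prompt change actually fixes the misclassifications or merely shifts the overall ratio.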
Does anyone have experience/thoughts on this?