Trying to make the switch from text-moderation-latest to omni-moderation-latest. Having big issues with how sensitive it is: it's flagging a huge number of false positives. We normally run a few thousand items a day through this, with custom thresholds to make it a little less sensitive. Since testing with the new model, it's marking almost half of our items as true: where before we would average around 10 flags a day, this new model is flagging about 1,000.
I have tried to increase the thresholds to make it less sensitive, but it's still marking a ton as true. Anyone else running into this, or any suggestions on how to make it operate more like it did before?
The scores are all rescaled for this model, so you'll have to re-tune any of your own thresholds. To start, you could flag only when the API's "flagged" AND your own threshold both say flag; that way you won't over-moderate, though you could potentially let some flagged stuff through.
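Something like this rough sketch, using the standard OpenAI Python SDK; the category names and threshold values are just placeholders you'd tune against your own data:

```python
# Rough sketch using the OpenAI Python SDK. Category names and
# threshold values below are placeholders you would tune yourself.
from openai import OpenAI

client = OpenAI()

# Hypothetical per-category thresholds -- tune against your own data.
CUSTOM_THRESHOLDS = {
    "harassment": 0.7,
    "hate": 0.6,
    "sexual": 0.5,
    "violence": 0.7,
}

def should_flag(text: str) -> bool:
    result = client.moderations.create(
        model="omni-moderation-latest",
        input=text,
    ).results[0]

    scores = result.category_scores.model_dump()
    over_my_threshold = any(
        scores.get(cat, 0.0) >= thr for cat, thr in CUSTOM_THRESHOLDS.items()
    )
    # Flag only when the API's own boolean AND one of your thresholds agree:
    # a conservative start that avoids over-moderating, at the cost of letting
    # some API-flagged items through.
    return result.flagged and over_my_threshold
```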
This model, now released over a year ago, is supposed to return scores closer to calibrated probabilities, with roughly 0.5 = flag, i.e. the probability the input is bad. It would be nice if OpenAI simply published the threshold value it uses for every category; then you could even know how far beyond the flag point a score is. But they likely change the model and its thresholds silently, just like they do with other models.
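If the scores really are roughly calibrated around 0.5 (which, again, OpenAI doesn't document per category and may shift silently), you could at least log how far each score sits from that assumed line:

```python
# Sketch only: 0.5 as the flag point is an assumption, not a published
# per-category threshold, and it may change without notice.
def score_margins(category_scores: dict[str, float], cutoff: float = 0.5) -> dict[str, float]:
    """Positive margin = how far beyond the assumed flag point a score sits."""
    return {cat: round(score - cutoff, 3) for cat, score in category_scores.items()}

# Usage: score_margins(result.category_scores.model_dump())
# -> e.g. {"harassment": -0.42, "violence": 0.13, ...}
```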
You can send more or less context as a "retry" and see if it is the particular slicing the model does internally that was flagged, and whether you'd still want to send the input if an alternate technique gets it to pass. Also consider that OpenAI runs that same moderation and more against your API account as a whole…
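A rough way to probe that, again with the standard SDK; the chunk size here is an arbitrary guess, since the model's internal slicing isn't documented:

```python
# Sketch: re-run the same item at different granularities to see which
# slice trips the flag. Chunk size is an arbitrary guess; the model's
# internal slicing is not documented.
from openai import OpenAI

client = OpenAI()

def moderate(text: str):
    return client.moderations.create(
        model="omni-moderation-latest",
        input=text,
    ).results[0]

def probe_slices(text: str, chunk_chars: int = 500) -> None:
    views = {"full": text}
    for i in range(0, len(text), chunk_chars):
        views[f"chunk_{i // chunk_chars}"] = text[i : i + chunk_chars]

    for name, view in views.items():
        r = moderate(view)
        top = {k: round(v, 2) for k, v in r.category_scores.model_dump().items() if v > 0.2}
        print(name, r.flagged, top)
```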
So they're supposed to all be closer to 0.5 for true now? I haven't seen that, but I can try. I've run multiple tests and keep raising the thresholds, but so far items keep showing as true where before they weren't.