Spam checking illegal content?

Hi im using the api on my bot in discord to detect spam messages and other harmful content.

recently a server of my friend was flooded with those message like “xxx leaks here mommy, …, … ,…” etc etc im sure everyone on discord has seen these illegal offerings.

so when i tried to detect using the api it straight up refused the entire response and canceled everything, which i can understand because its likely to prevent abuse to chatgpt, but it also kinda kills the possibility to try and check if a message is subject to those illegal things, making me wonder on how i can detect them to moderate discord servers.

tl;dr
the api blocks the entire request, making detecting these spam message impossible and therefore preventing me from improving the automod features

2 Likes

Welcome to the community!

Are you using the moderation endpoint API? Or just sending to the API?

1 Like

Hi right now im using this endpoint: ‘/v1/chat/completions’

2 Likes

That’s likely why. You need to send to the moderation endpoint…

https://platform.openai.com/docs/guides/moderation/overview

5 Likes

If it’s just to know if the message can be safely processed by GPT (if gpt based on your input will produce harmful output, the request will be blocked and flagged, risking the account penalties and/or suspension) then you can (must) use the moderation endpoint to see if the received message can be submitted to gpt…

Now if the whole goal of your GPT request is to moderate the message, then I guess (please somebody confirm) using the moderation endpoint might be seen as highjacking the convenient feature. I may be wrong on this. So do your own search.

To skip me the headache with guesses, when I built AI Comment Moderator plugin for WordPress: https://www.techspokes.store

I also used the chat completion API endpoint, except that my prompt is build not to produce anything other than a single digit indicating if the input string must be flagged as inappropriate or can be published. This way no matter what I send to the API endpoint, it never produces anything that doesn’t pass the security filter.

So far works well for me.

3 Likes

Since the plugin launch (about a year and a half) only the sites we manage produced about 10k comments…

2 Likes
if api_error:
    ban(user.id)

lol, but yeah, you probably want to use the moderations endpoint like paul said, there is more info on it on the api documentation

exactly im inputting the text and want to get back a score on how bad it is. I got it to work now finally.

I think the chat completions are better because you can customize stuff

Exactly, plus no need thinking about whether it’s ok using it that way.