Spam checking illegal content?

lyrikerus4 · September 23, 2024, 6:51pm

Hi im using the api on my bot in discord to detect spam messages and other harmful content.

recently a server of my friend was flooded with those message like “xxx leaks here mommy, …, … ,…” etc etc im sure everyone on discord has seen these illegal offerings.

so when i tried to detect using the api it straight up refused the entire response and canceled everything, which i can understand because its likely to prevent abuse to chatgpt, but it also kinda kills the possibility to try and check if a message is subject to those illegal things, making me wonder on how i can detect them to moderate discord servers.

tl;dr
the api blocks the entire request, making detecting these spam message impossible and therefore preventing me from improving the automod features

PaulBellow · September 23, 2024, 6:54pm

Welcome to the community!

Are you using the moderation endpoint API? Or just sending to the API?

lyrikerus4 · September 23, 2024, 7:15pm

Hi right now im using this endpoint: ‘/v1/chat/completions’

PaulBellow · September 23, 2024, 7:22pm

That’s likely why. You need to send to the moderation endpoint…

https://platform.openai.com/docs/guides/moderation/overview

sergeliatko · September 23, 2024, 10:29pm

If it’s just to know if the message can be safely processed by GPT (if gpt based on your input will produce harmful output, the request will be blocked and flagged, risking the account penalties and/or suspension) then you can (must) use the moderation endpoint to see if the received message can be submitted to gpt…

Now if the whole goal of your GPT request is to moderate the message, then I guess (please somebody confirm) using the moderation endpoint might be seen as highjacking the convenient feature. I may be wrong on this. So do your own search.

To skip me the headache with guesses, when I built AI Comment Moderator plugin for WordPress: https://www.techspokes.store

I also used the chat completion API endpoint, except that my prompt is build not to produce anything other than a single digit indicating if the input string must be flagged as inappropriate or can be published. This way no matter what I send to the API endpoint, it never produces anything that doesn’t pass the security filter.

So far works well for me.

sergeliatko · September 23, 2024, 10:35pm

Since the plugin launch (about a year and a half) only the sites we manage produced about 10k comments…

anon25271712 · September 24, 2024, 1:47am

if api_error:
    ban(user.id)

lol, but yeah, you probably want to use the moderations endpoint like paul said, there is more info on it on the api documentation

lyrikerus4 · October 2, 2024, 7:37pm

exactly im inputting the text and want to get back a score on how bad it is. I got it to work now finally.

I think the chat completions are better because you can customize stuff

sergeliatko · October 3, 2024, 11:31am

Exactly, plus no need thinking about whether it’s ok using it that way.

Topic		Replies	Views
Prevent illegal activities? API chatgpt , api	5	985	December 20, 2023
How to avoid being blocked when trying to filter potentially harmful content? API chatgpt , content-warning	0	108	March 18, 2025
User Content Review and Analysis API gpt-4	4	619	February 7, 2024
Question about moderation for API usage API gpt-4 , api , moderation	2	1423	October 20, 2023
GPT-3 API concerned users may get me banned Community	3	2693	December 20, 2023

Spam checking illegal content?

Related topics