Possible to provide context to moderation input?

Evening all,

Following up on my moderation endpoint testing: I’m able to get a list of results based on the input, and it does a pretty good job, I must say.

The only issue I’m having is this: I can of course provide input for it to check, but there’s simply no context for it to tell whether the response is justified.

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment
responses = client.moderations.create(
	input="You lost me at 'the man who plays rose' I'm glad you're leaving the fandom, homophobe"
)
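Reading the result back is straightforward; a minimal sketch, assuming the current openai Python SDK, where each entry in results carries flagged, categories, and category_scores:

result = responses.results[0]
print(result.flagged)                      # overall boolean flag
print(result.categories.harassment)       # per-category boolean
print(result.category_scores.harassment)  # per-category score between 0 and 1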

This in particular is a response comment to another user who is purposefully misgendering a trans individual. In this case the second user’s response is perfectly justified, but I’m wondering how I can provide context to the moderation check (ideally the parent comment).

I can see I’m able to provide an array of inputs, but I imagine this simply means it’ll return a Moderation result for each item, treating them as separate inputs.
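For illustration, roughly what I mean, with placeholder parent and child comment strings:

parent = "He keeps calling her 'the man who plays Rose'"  # placeholder parent comment
child = "You lost me at 'the man who plays rose' I'm glad you're leaving the fandom, homophobe"

responses = client.moderations.create(input=[parent, child])

# one result per input, in the same order, each presumably scored on its own
for text, result in zip([parent, child], responses.results):
	print(result.flagged, text[:40])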

Has anyone had experience with this?

Hi,

The moderation model doesn’t have enough reasoning power to make nuanced determinations like that, and the models that could should not be used with unmoderated input. So I understand your situation, and I think that’s an area being looked into with newer, more advanced moderation methods.

Aaaah right, I’m with you.

Well, thank you for clarifying. I suppose this’ll be something I keep an eye on for the foreseeable future!


The moderations endpoint is not for moderating your forum.

It is for filtering the inputs to AI models and for flagging the outputs from them.

Consider that OpenAI uses similar policies to detect misuse, so you don’t want to submit something like that to an AI model on its own. Indeed, wrapping it in other text could lower the score, while chunking it into parts could increase it.
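As a rough sketch of that intended use (the model name is a placeholder; substitute your own), moderation gates the user’s input before it reaches the model and can flag the model’s output afterwards:

from openai import OpenAI

client = OpenAI()

def moderated_reply(user_text: str) -> str:
	# check the user's input first; refuse before it ever reaches the model
	check = client.moderations.create(input=user_text)
	if check.results[0].flagged:
		return "Input rejected by moderation."

	completion = client.chat.completions.create(
		model="gpt-4o-mini",  # placeholder model name
		messages=[{"role": "user", "content": user_text}],
	)
	answer = completion.choices[0].message.content

	# optionally flag the model's output as well
	if client.moderations.create(input=answer).results[0].flagged:
		return "Output withheld by moderation."
	return answer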