Choose the latest Moderation model

Public Service Announcement

When you wire the moderation endpoint into your user-initiated prompt flow, specify the latest model explicitly; don’t leave it to the default, which is what the API docs currently show.

:x:

    const moderation = await openai.moderations.create({
        input: QUERY
    });

:white_check_mark:

    const moderation = await openai.moderations.create({
        model: "omni-moderation-latest",
        input: QUERY
    });

Example prompt that should be flagged:

Which artist could help with a murder?

:x: • Response when not specifying the latest model:

    {
      "id": "modr-APznl6mrXfY7FuNF6UhvsvbV1gzrK",
      "model": "text-moderation-007",
      "results": [
        {
          "flagged": false,
          "categories": {
            "sexual": false,
            "hate": false,
            "harassment": false,
            "self-harm": false,
            "sexual/minors": false,
            "hate/threatening": false,
            "violence/graphic": false,
            "self-harm/intent": false,
            "self-harm/instructions": false,
            "harassment/threatening": false,
            "violence": false
          },
          "category_scores": {
            "sexual": 0.00003359637412359007,
            "hate": 0.00008688968955539167,
            "harassment": 0.0005568754859268665,
            "self-harm": 0.00039510903297923505,
            "sexual/minors": 0.00002062366002064664,
            "hate/threatening": 0.00008389381400775164,
            "violence/graphic": 0.0005850853049196303,
            "self-harm/intent": 0.00029021009686402977,
            "self-harm/instructions": 0.00026303710183128715,
            "harassment/threatening": 0.0005778163322247565,
            "violence": 0.0655878558754921
          }
        }
      ]
    }

:white_check_mark: • Response when specifying the latest model:

    {
      "id": "modr-67fb784ae9a4085bb1a464b8ed0166e0",
      "model": "omni-moderation-latest",
      "results": [
        {
          "flagged": true,
          "categories": {
            "harassment": false,
            "harassment/threatening": false,
            "sexual": false,
            "hate": false,
            "hate/threatening": false,
            "illicit": true,
            "illicit/violent": true,
            "self-harm/intent": false,
            "self-harm/instructions": false,
            "self-harm": false,
            "sexual/minors": false,
            "violence": true,
            "violence/graphic": false
          },
          "category_scores": {
            "harassment": 0.0007178082510429669,
            "harassment/threatening": 0.0011162942808227092,
            "sexual": 0.00009253848866495987,
            "hate": 0.00010322310367548195,
            "hate/threatening": 0.000031999824407395835,
            "illicit": 0.45215145358042874,
            "illicit/violent": 0.41818105143776346,
            "self-harm/intent": 0.0002720324670909305,
            "self-harm/instructions": 0.0002753492651497752,
            "self-harm": 0.0005787375304387498,
            "sexual/minors": 0.00002737216731838081,
            "violence": 0.40014318138460625,
            "violence/graphic": 0.04955517785498571
          },
          "category_applied_input_types": {
            "harassment": [
              "text"
            ],
            "harassment/threatening": [
              "text"
            ],
            "sexual": [
              "text"
            ],
            "hate": [
              "text"
            ],
            "hate/threatening": [
              "text"
            ],
            "illicit": [
              "text"
            ],
            "illicit/violent": [
              "text"
            ],
            "self-harm/intent": [
              "text"
            ],
            "self-harm/instructions": [
              "text"
            ],
            "self-harm": [
              "text"
            ],
            "sexual/minors": [
              "text"
            ],
            "violence": [
              "text"
            ],
            "violence/graphic": [
              "text"
            ]
          }
        }
      ]
    }
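
Since the omni model does flag the query, here’s a minimal sketch of gating a user-initiated flow on that result. The `guardUserQuery` helper and the error wording are my own invention, not part of the SDK:

    import OpenAI from "openai";

    const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

    // Hypothetical guard: throws before the query ever reaches your prompt flow
    async function guardUserQuery(query) {
        const moderation = await openai.moderations.create({
            model: "omni-moderation-latest",
            input: query,
        });
        const result = moderation.results[0];
        if (result.flagged) {
            // Collect the categories that tripped, e.g.
            // ["illicit", "illicit/violent", "violence"] for the prompt above
            const reasons = Object.entries(result.categories)
                .filter(([, hit]) => hit)
                .map(([name]) => name);
            throw new Error(`Blocked by moderation: ${reasons.join(", ")}`);
        }
        return query;
    }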

I suspect the SDK simply omits the model field from the request when you don’t set one, and the endpoint fills in the default server-side, so regardless of which SDK you use, you should probably do this.
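
If you want to check that assumption yourself, you can hit the endpoint directly with no model field and look at the model echoed back in the response. A quick throwaway sketch (Node 18+ for the global fetch):

    // No "model" in the body; whatever comes back in the response's
    // "model" field is what the server chose as the default.
    const res = await fetch("https://api.openai.com/v1/moderations", {
        method: "POST",
        headers: {
            Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
            "Content-Type": "application/json",
        },
        body: JSON.stringify({ input: "Which artist could help with a murder?" }),
    });
    const { model } = await res.json();
    console.log(model); // "text-moderation-007" as of the responses above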
