Three flavors of System Messages is a bit too much, can the API normalize?

sam.saffron · February 24, 2025, 6:25am

lib/completions/dialects/chat_gpt.rb

84e791a94


      
          # developer messages are preferred on recent reasoning models
          def supports_developer_messages?
            !legacy_reasoning_model? && llm_model.provider == "open_ai" &&
              (llm_model.name.start_with?("o1") || llm_model.name.start_with?("o3"))
          end
          
          def legacy_reasoning_model?
            llm_model.provider == "open_ai" &&
              (llm_model.name.start_with?("o1-preview") || llm_model.name.start_with?("o1-mini"))
          end
          
          def system_msg(msg)
            content = msg[:content]
            if disable_native_tools? && tools_dialect.instructions.present?
              content = content + "\n\n" + tools_dialect.instructions
            end
          
            if supports_developer_messages?
              { role: "developer", content: content }
            elsif legacy_reasoning_model?

This file has been truncated.

Currently:

o1, o3-mini - support developer messages
o1-preview, o1-mini - support neither system messages nor developer messages
all other models - expect system messages
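The three cases in the list above can be sketched as a single lookup. This is only an illustration, not the code from the truncated file: `instruction_role_for` and both constants are hypothetical names, the provider check from the original is left out (the model is assumed to be an OpenAI one), and `nil` simply means "no instruction role available"; what the caller does in that case is not shown here.

```ruby
# Hypothetical helper, not part of the quoted file: maps a model name to the
# instruction role described in the list above.
LEGACY_REASONING_PREFIXES = %w[o1-preview o1-mini].freeze
REASONING_PREFIXES = %w[o1 o3].freeze

def instruction_role_for(model_name)
  # Legacy prefixes must be checked first, because "o1-mini" also starts with "o1".
  return nil if model_name.start_with?(*LEGACY_REASONING_PREFIXES)
  return "developer" if model_name.start_with?(*REASONING_PREFIXES)

  # Every other model is assumed to want a classic system message.
  "system"
end
```

Checking the legacy prefixes first mirrors the `!legacy_reasoning_model?` guard in the original method.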

I wonder if there is any way for OpenAI to roll out role: "developer" consistently across all models, to avoid this hoop jumping?

curt.kennedy · February 24, 2025, 6:30am

They talk about GPT-5 being like the Borg and assimilating all the models … maybe then?

sam.saffron · February 24, 2025, 6:42am

I did a double take when I named o1-mini and o1-preview “legacy”, but this is the way the world is rolling now: everything is legacy within 3-6 months.

356 · February 24, 2025, 10:20am

I can’t help but think of this.

_j · February 24, 2025, 11:19am

The actual “currently” is:

Send either a “system” or a “developer” role message.
It is translated to the compatible “authority” message type
(unless the model is o1-preview or o1-mini, which support neither).

If you are curious and want to exercise the inputs, try interspersing “developer” messages to reasoning models with others, and see if breaking the expected pattern of one main initial message works and has any use for you.

Another curiosity is that max_completion_tokens can be sent as a substitute for max_tokens on previous models – unless you are also using the “prediction” API parameter.

sam.saffron · February 25, 2025, 6:07am

I do not think this is accurate… there are so many flavors out there … in extra fun today:

Message (616 copies reported)

DiscourseAi::Completions::Endpoints::OpenAi: status: 400 - body: {
  "error": {
    "message": "Unrecognized request argument supplied: max_completion_tokens",
    "type": "invalid_request_error",
    "param": null,
    "code": null
  }
}

So it turns out that on Azure, gpt-4o-2024-05-13 does not support max_completion_tokens, so now we need a special case that adds it only for reasoning models.
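A minimal sketch of that special case, assuming reasoning models can be recognized by the same "o1"/"o3" name prefixes used in the dialect code earlier in the thread. `reasoning_model?` and `token_limit_params` are hypothetical names, and an Azure deployment may not expose the model name in this form.

```ruby
# Hypothetical helpers for illustration; not the actual discourse-ai code.
def reasoning_model?(model_name)
  # Same prefix test as in the quoted dialect file.
  model_name.start_with?("o1", "o3")
end

def token_limit_params(model_name, limit)
  # Only reasoning models get max_completion_tokens; everything else keeps
  # max_tokens, which avoids the 400 error reported above.
  key = reasoning_model?(model_name) ? :max_completion_tokens : :max_tokens
  { key => limit }
end
```

The returned hash can be merged into the request payload, for example `payload.merge(token_limit_params("o3-mini", 1000))`.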

_j · February 25, 2025, 8:03am

You open up delicious worms. Like the deployment ID, geography, and deployment date, along with Azure API version selected, and what untold translation layers exist.

    "ChatRole": {
      "type": "string",
      "description": "A description of the intended purpose of a message within a chat completions interaction.",
      "enum": [
        "system",
        "assistant",
        "user",
        "function",
        "tool",
        "developer"
      ],
      "x-ms-enum": {
        "name": "ChatRole",
        "modelAsString": true,
        "values": [
          {
            "name": "system",
            "value": "system",
            "description": "The role that instructs or sets the behavior of the assistant."
          },
          {
            "name": "assistant",
            "value": "assistant",
            "description": "The role that provides responses to system-instructed, user-prompted input."
          },
          {
            "name": "user",
            "value": "user",
            "description": "The role that provides input for chat completions."
          },
          {
            "name": "function",
            "value": "function",
            "description": "The role that provides function results for chat completions."
          },
          {
            "name": "tool",
            "value": "tool",
            "description": "The role that represents extension tool activity within a chat completions operation."
          },
          {
            "name": "developer",
            "value": "developer",
            "description": "The role that provides instructions that the model should follow"
          }
        ]
      }
    },
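For anyone who wants to check roles on the client side before sending a request, the enum in the excerpt above can be mirrored in a small lookup. This is a sketch only: `CHAT_ROLES` and `valid_chat_role?` are hypothetical names, the list is copied from this one excerpt, and a role being listed in the schema does not guarantee that a given deployment or model accepts it.

```ruby
# Role names copied from the ChatRole enum in the schema excerpt above.
CHAT_ROLES = %w[system assistant user function tool developer].freeze

# Hypothetical client-side check; accepts strings or symbols.
def valid_chat_role?(role)
  CHAT_ROLES.include?(role.to_s)
end
```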
