Seeking methods for enforcing strict response rules in the OpenAI API – Any Tips?

I’m developing a chatbot using OpenAI’s API and need to enforce strict rules, such as preventing the use of specific words or phrases in responses. Despite repeating the instructions in system prompts and then trying JSON and XML structured outputs, I’m still running into issues with consistency and adherence.

Are there any advanced features or techniques within OpenAI’s API that can help enforce these rules more reliably? Any advice on achieving stricter compliance beyond traditional system prompts and structured outputs would be greatly appreciated. I’m happy to work in any programming language for the API calls.

Let’s say, for example, we need responses that never use ‘not just / but’ correlative constructions.
Thanks in advance.
P.S. I’m aware of post-processing as a solution, but it seems very wasteful of tokens, time, and compute.

What you’re asking to do is break the training of the model. Any hacky solution will result in a higher chance of hallucinations and lower-quality output.

To change the behavior of the model you should introduce fine-tuning. You will need to be very analytical with your dataset. It’s not as simple as “instructing” a model not to say something. You need to implicitly train the model on examples that deviate from the pattern you’re trying to eliminate.

This, in my opinion, is the only solution.

8 Likes

I agree with @RonaldGRuckus. You could try and see if you can get by with a fine-tuned gpt-4o-mini. It’s good timing to try now, as training is free for up to 2M training tokens daily until the end of September, and it would still be comparatively cheaper in use than the regular gpt-4o.
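
For reference, once you have a JSONL file of training examples, starting the fine-tune is only a couple of calls with the official openai Python SDK. A minimal sketch (the file name is a placeholder and the gpt-4o-mini snapshot name may differ, so check the fine-tuning guide for the current one):

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Upload the JSONL training data (placeholder file name)
training_file = client.files.create(
    file=open("no_correlatives_train.jsonl", "rb"),
    purpose="fine-tune",
)

# Start the fine-tuning job against a gpt-4o-mini snapshot
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",
)

print(job.id, job.status)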

5 Likes

Thanks, I will try this. Do you know how I would use JSONL to teach it to ‘not say something’? Training seems to involve telling it ‘what to say’, not the reverse. Do you or anyone else have any tips on this?

Sticking with the example, would this work to stop something from happening?

{
  "messages": [
    {"role": "system", "content": "You are a helpful assistant that avoids using over-complicated sentence structures like 'not just/but also'."},
    {"role": "user", "content": "Can you tell me the benefits of regular exercise?"},
    {"role": "assistant", "content": "Regular exercise not just helps you stay fit but also improves your mood.", "weight": 0},
    {"role": "user", "content": "Can you rephrase that without using 'not just/but also'?"},
    {"role": "assistant", "content": "Regular exercise helps you stay fit and improves your mood.", "weight": 1}
  ]
}

Why not just use those as few-shot examples (in-context learning) and make some minor tweaks to the system instructions?

messages = [
    {
        "role": "system",
        "content": "You are a language assistant focused on clear, concise communication. Avoid using complex sentence structures with correlative conjunctions as the use of these will confuse your users. Instead, use simple and direct phrasing.",
    },
    {"role": "user", "content": "Can you tell me the benefits of regular exercise?"},
    {
        "role": "assistant",
        "content": "Regular exercise not only helps you stay fit but also improves your mood and energy levels.",
    },
    {
        "role": "user",
        "content": "Can you rephrase that to avoid using correlative conjunctions?",
    },
    {
        "role": "assistant",
        "content": "Regular exercise helps you stay fit, improves your mood, and boosts your energy levels.",
    },
    {
        "role": "user",
        "content": "Please generate a summary of the key benefits of renewable energy",
    },
]
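
You then pass that list straight to the chat completions endpoint. A minimal sketch assuming the official openai Python SDK (the model name is just an example):

from openai import OpenAI

client = OpenAI()

# The few-shot turns above prime the style; the final user message is the real task
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
)

print(response.choices[0].message.content)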
1 Like

No need to include the behaviour you don’t want from the model. Simply show the model a wide variety of examples of how it should “behave” when given the task.
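
For instance, rather than pairing a weight-0 “bad” answer with a weight-1 correction, each training line can simply contain the answer written the way you want it. A rough sketch of building such a line (the file name and wording are only illustrative):

import json

# One positive-only training example: the assistant reply just never uses
# the 'not just / but also' construction
example = {
    "messages": [
        {"role": "system", "content": "You are a helpful assistant that uses simple, direct phrasing."},
        {"role": "user", "content": "Can you tell me the benefits of regular exercise?"},
        {"role": "assistant", "content": "Regular exercise helps you stay fit, improves your mood, and boosts your energy levels."},
    ]
}

with open("no_correlatives_train.jsonl", "a") as f:
    f.write(json.dumps(example) + "\n")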

2 Likes

I’ve faced the same challenge. As people have stated, it’s an uphill battle trying to get it to ignore its training. It doesn’t like to be told what not to do. You may already have tried this, but I’ve had success by combining telling it what not to do with telling it what to do instead. Essentially, “instead of doing x, do y”. Then I fine-tuned the model. Good luck!
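
For what it’s worth, the “instead of doing x, do y” phrasing looks something like this as a system message (just a sketch; the exact wording is whatever fits your use case):

system_message = {
    "role": "system",
    "content": (
        "Instead of correlative constructions like 'not just X but also Y', "
        "state the two points as separate clauses joined with 'and'."
    ),
}

# Prepend it to every conversation before calling the API
messages = [
    system_message,
    {"role": "user", "content": "Summarize the key benefits of renewable energy."},
]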