Regarding the “untrusted_text blocks” Described in the Model Spec

koujimatsuda11 · May 25, 2025, 2:57am

I would like to ask a question about the “untrusted_text blocks” described in the OpenAI Model Spec(April 11, 2025).

In practice, I tried using the following code, and it seemed to treat the user’s input as data rather than instructions when generating a meeting minutes summary. If I included explicit instructions in the input, the model refused to follow them and responded that it was not possible.

full_input = f"""
    ```untrusted_text
    {escaped_prompt}
    ```
"""

response = client.responses.create(
    model="gpt-3.5-turbo",
    instructions="Please create an honest meeting minutes summary based on the input below.",
    input=full_input,
)

Is this format of using “```untrusted_text” correct for indicating untrusted text blocks?

OnceAndTwice · May 25, 2025, 6:52am

You’re using gpt-3.5-turbo. This model predates untrusted_text, so your observation makes sense. I recommend using a more current model. https://platform.openai.com/docs/models

I tested with gpt-4.1 and observed that the model is less likely to obey bypasses when using these blocks.

koujimatsuda11 · May 26, 2025, 6:36am

Thanks for the clarification! I’ll try using a newer model version like gpt-4.1 and see how it behaves. Appreciate your help!

Topic		Replies	Views
Best Practices example not giving expected results Prompting gpt-35-turbo	2	832	September 11, 2023
Why are gpt-4-preview models giving me subpar performance? Please advise API gpt-4 , gpt-35-turbo , api	5	915	March 23, 2024
GPT-4.1 not respecting single line break between bold title and paragraph Prompting gpt-4 , api , prompt-engineering	2	115	May 16, 2025
Help me fix my prompt that interprets text as instructions Prompting api	10	410	January 1, 2025
Gpt-4-1106-preview Doesn't listen to instructions! Bugs gpt-4 , api	11	1359	January 29, 2024

Regarding the “untrusted_text blocks” Described in the Model Spec

Related topics