BUG: Code Interpreter execution outputs not delivered in Responses API streaming

When using the Responses API with code interpreter tools in streaming mode, the actual code execution results (stdout/stderr) are not delivered through any streaming events.

Expected Behavior:

Code execution results should be streamed as they become available, similar to how code deltas are streamed via response.code_interpreter_call_code.delta events.

Actual Behavior:

  • Streaming mode: ResponseOutputMessage.outputs is None in response.output_item.done events
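
For reference, a minimal repro sketch of the streaming behavior (the model name and prompt are placeholders; event shapes follow the SDK dump quoted later in this thread):

import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI()

async def main():
    # Stream a code interpreter run and inspect what arrives for the tool call
    stream = await client.responses.create(
        model="gpt-4.1",
        input="Use python to print 2 + 2.",
        tools=[{"type": "code_interpreter", "container": {"type": "auto"}}],
        stream=True,
    )
    async for event in stream:
        if event.type == "response.code_interpreter_call_code.delta":
            print(event.delta, end="")  # the code itself does stream
        elif event.type == "response.output_item.done":
            item = event.item
            if item.type == "code_interpreter_call":
                print(item.outputs)  # None: execution results never arrive

asyncio.run(main())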

The simple solution can be written AI-style:

try:
    response = await client.responses.create(
        include=[
            "code_interpreter_call.outputs",
        ],
        # ...rest of your parameters (model, input, tools, stream) continue here
    )
except Exception as e:
    print(f"Request failed: {e}")

See if explicitly asking for the outputs doesn’t deliver them.
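
A fuller sketch of that fix, non-streaming for brevity (the model name and prompt are placeholders):

from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-4.1",
    include=["code_interpreter_call.outputs"],
    input="Use python to print 2 + 2.",
    tools=[{"type": "code_interpreter", "container": {"type": "auto"}}],
)

# With the include parameter set, outputs should be populated instead of None
for item in response.output:
    if item.type == "code_interpreter_call":
        print(item.outputs)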

Dumping events from within an SDK event iterator:

[DEBUG EVENT] response.code_interpreter_call_code.delta ResponseAudioDeltaEvent(delta='lo', sequence_number=45, type='response.code_interpreter_call_code.delta', output_index=0, item_id='ci_123456123456')
[DEBUG EVENT] response.code_interpreter_call_code.delta ResponseAudioDeltaEvent(delta='ats', sequence_number=46, type='response.code_interpreter_call_code.delta', output_index=0, item_id='ci_123456123456')
[DEBUG EVENT] response.code_interpreter_call_code.done ResponseAudioDeltaEvent(delta=None, sequence_number=47, type='response.code_interpreter_call_code.done', output_index=0, item_id='ci_123456123456', code='import random\n\n# Generate 5 true random floats in the range 0-100\nrandom_floats = [random.uniform(0, 100) for _ in range(5)]\nrandom_floats')
[DEBUG EVENT] response.code_interpreter_call.interpreting ResponseCodeInterpreterCallInterpretingEvent(code_interpreter_call=None, output_index=0, sequence_number=48, type='response.code_interpreter_call.interpreting', item_id='ci_123456123456')
[DEBUG EVENT] response.code_interpreter_call.completed ResponseCodeInterpreterCallCompletedEvent(code_interpreter_call=None, output_index=0, sequence_number=49, type='response.code_interpreter_call.completed', item_id='ci_123456123456')
[DEBUG EVENT] response.output_item.done ResponseOutputItemDoneEvent(item=ResponseOutputMessage(id='ci_123456123456', content=None, role=None, status='completed', type='code_interpreter_call', code='import random\n\n# Generate 5 true random floats in the range 0-100\nrandom_floats = [random.uniform(0, 100) for _ in range(5)]\nrandom_floats', container_id='cntr_654654654', outputs=[{'type': 'logs', 'logs': '[69.30227291666584,\n 70.49184729377296,\n 81.76050351183747,\n 82.94767048687153,\n 50.009531757821556]'}]), output_index=0, sequence_number=50, type='response.output_item.done')
[DEBUG EVENT] response.output_item.added ResponseOutputItemAddedEvent(item=ResponseOutputMessage(id='msg_123123123', content=[], role='assistant', status='in_progress', type='message'), output_index=1, sequence_number=51, type='response.output_item.added')

To confirm, the response.output_item.done event as shown in the SDK response above includes non-empty outputs:

response.output_item.done ResponseOutputItemDoneEvent(... outputs=[{'type': 'logs', 'logs': '[69.30227291666584,\n 70.49184729377296,\n 81.76050351183747,\n 82.94767048687153,\n 50.009531757821556]'}] ...)
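
Once the outputs are present on that event, pulling the logs out is straightforward. A small sketch (the dict shape follows the dump above; the SDK may also deserialize these entries as typed objects):

def logs_from_done_event(event):
    # Collect stdout-style logs from a response.output_item.done event
    # carrying a completed code_interpreter_call item
    item = event.item
    if getattr(item, "type", None) != "code_interpreter_call":
        return []
    return [o["logs"] for o in (item.outputs or []) if o.get("type") == "logs"]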

Thank you both for your responses.

I have been able to resolve this issue by adding include=["code_interpreter_call.outputs"].

@shanth It would be great if you could clarify this in the documentation, as without it all the outputs will be None.


Yes: the only documentation is seeing the playground make API calls…

Where do you see the API calls? Dev Console Network tab?

EDIT: The Network tab gives you all the API calls. It’s really useful.


@shanth Unfortunately, I have found another bug.

When using O-series reasoning models to produce an output file, the annotation object is empty most of the time.

I was able to replicate this behavior even in the playground. Take a look at the annotations. There are none.

from openai import OpenAI
client = OpenAI()

response = client.responses.create(
  model="o4-mini",
  input=[
    {
      "role": "user",
      "content": [
        {
          "type": "input_text",
          "text": "Can you make a simple word document for lesson plan on the topic of 'The Importance of Water' for a 5 year old child? Please ensure you output a word document."
        }
      ]
    },
    {
      "type": "reasoning",
      "id": "rs_6837fa8979a08191a4fdbae3672a129606c58d6cfcd6c55b",
      "summary": [
        {
          "type": "summary_text",
          "text": "**Creating a lesson plan on water**\n\nI’m working on a lesson plan for 5-year-olds that emphasizes the importance of water. It’s 30 minutes long, with objectives including understanding water's essential role in life, identifying its uses, and practicing safety. The materials will include picture cards, a cup of water, coloring worksheets, and crayons. The plan involves an introduction, story time, two activities—discussion and coloring—and a closing song. For assessment, I’ll ask kids about water's significance and suggest a water experiment involving sinking and floating."
        }
      ]
    },
    {
      "id": "ci_6837fa90be18819188f33b6209053e8d06c58d6cfcd6c55b",
      "type": "code_interpreter_call",
      "status": "completed",
      "code": "from docx import Document\r\nfrom docx.shared import Pt\r\n\r\n# Create a Word document\r\ndoc = Document()\r\n\r\n# Title\r\ndoc.add_heading('Lesson Plan: The Importance of Water', level=1)\r\n\r\n# Basic Info\r\ndoc.add_paragraph('Age Group: 5-year-old children')\r\ndoc.add_paragraph('Duration: 30 minutes')\r\n\r\n# Objectives\r\ndoc.add_heading('Learning Objectives', level=2)\r\nobjectives = [\r\n    'Understand that water is essential for life.',\r\n    'Identify different ways we use water.',\r\n    'Practice simple water safety and saving water.'\r\n]\r\nfor obj in objectives:\r\n    doc.add_paragraph(f'- {obj}')\r\n\r\n# Materials\r\ndoc.add_heading('Materials', level=2)\r\nmaterials = [\r\n    'A clear cup of water',\r\n    'Picture cards showing water uses (drinking, bathing, watering plants)',\r\n    'Coloring worksheet with water scenes',\r\n    'Crayons or markers'\r\n]\r\nfor mat in materials:\r\n    doc.add_paragraph(f'- {mat}')\r\n\r\n# Lesson Procedure\r\ndoc.add_heading('Lesson Procedure', level=2)\r\n\r\n# Introduction\r\ndoc.add_heading('1. Introduction (5 minutes)', level=3)\r\nintro_text = (\r\n    \"Gather the children in a circle. Show them a clear cup of water and ask:\\n\"\r\n    \"\\\"What is this?\\\" Allow answers, then explain:\\n\"\r\n    \"\\\"This is water. We need water to drink, to stay clean, and to help plants grow!\\\"\"\r\n)\r\ndoc.add_paragraph(intro_text)\r\n\r\n# Story Time\r\ndoc.add_heading('2. Story Time (5 minutes)', level=3)\r\nstory_text = (\r\n    \"Read a short, simple story about water, such as \\\"Where Does Water Come From?\\\" \"\r\n    \"Encourage children to listen and point to pictures.\"\r\n)\r\ndoc.add_paragraph(story_text)\r\n\r\n# Activity 1\r\ndoc.add_heading('3. Activity 1: Picture Card Discussion (10 minutes)', level=3)\r\nactivity1_text = (\r\n    \"Show picture cards one by one (drinking, bathing, watering plants). \"\r\n    \"Ask:\\n\"\r\n    \"- \\\"What is happening here?\\\"\\n\"\r\n    \"- \\\"Why do we need water for this?\\\"\\n\"\r\n    \"Encourage children to answer and talk about each card.\"\r\n)\r\ndoc.add_paragraph(activity1_text)\r\n\r\n# Activity 2\r\ndoc.add_heading('4. Activity 2: Coloring Worksheet (5 minutes)', level=3)\r\nactivity2_text = (\r\n    \"Hand out coloring worksheets with water scenes (rain, glass of water, plants). \"\r\n    \"Let children color and talk about what they are drawing.\"\r\n)\r\ndoc.add_paragraph(activity2_text)\r\n\r\n# Closing\r\ndoc.add_heading('5. 
Closing (5 minutes)', level=3)\r\nclosing_text = (\r\n    \"Gather children again and sing a simple water song (e.g., \\\"Splish Splash\\\").\\n\"\r\n    \"Recap:\\n\"\r\n    \"- \\\"Why is water important?\\\"\\n\"\r\n    \"- \\\"How do we use water every day?\\\"\"\r\n)\r\ndoc.add_paragraph(closing_text)\r\n\r\n# Assessment\r\ndoc.add_heading('Assessment', level=2)\r\nassessment_text = (\r\n    \"Ask each child:\\n\"\r\n    \"- \\\"Name one thing we use water for.\\\"\\n\"\r\n    \"- \\\"Why do you think water is important?\\\"\"\r\n)\r\ndoc.add_paragraph(assessment_text)\r\n\r\n# Extension (Optional)\r\ndoc.add_heading('Extension Activity (Optional)', level=2)\r\nextension_text = (\r\n    \"Simple Sink or Float Experiment:\\n\"\r\n    \"- Fill a basin with water.\\n\"\r\n    \"- Collect small objects (e.g., a leaf, a coin, a toy block).\\n\"\r\n    \"- Let children predict if each object will sink or float, then test.\"\r\n)\r\ndoc.add_paragraph(extension_text)\r\n\r\n# Save the document\r\nfile_path = '/mnt/data/Lesson_Plan_The_Importance_of_Water.docx'\r\ndoc.save(file_path)\r\n\r\nfile_path",
      "container_id": "cntr_6837fa891b24819194a445eca91dd9b100172436d57764b0",
      "outputs": [
        {
          "type": "logs",
          "logs": "'/mnt/data/Lesson_Plan_The_Importance_of_Water.docx'"
        }
      ]
    },
    {
      "id": "msg_6837fa9618e4819192593665b8a8154006c58d6cfcd6c55b",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "I’ve created the lesson plan in a Word document. You can download it here:\n\nLesson Plan: The Importance of Water (for 5-year-olds)\nDownload: /mnt/data/Lesson_Plan_The_Importance_of_Water.docx\n\nLet me know if you need any changes!"
        }
      ]
    }
  ],
  text={
    "format": {
      "type": "text"
    }
  },
  reasoning={
    "effort": "medium",
    "summary": "auto"
  },
  tools=[
    {
      "type": "code_interpreter",
      "container": {
        "type": "auto",
        "file_ids": []
      }
    }
  ],
  store=True
)

When using gpt-4o or gpt-4.1, this issue does not occur:

{
  "type": "response.output_item.done",
  "sequence_number": 623,
  "output_index": 2,
  "item": {
    "id": "msg_6837fbf333588191bb34d7673c8817240c3d178c37beb820",
    "type": "message",
    "status": "completed",
    "content": [
      {
        "type": "output_text",
        "annotations": [
          {
            "type": "container_file_citation",
            "container_id": "cntr_6837fbe8a34081918d2f8b5d7642d3440e150f258cb48f80",
            "end_index": 213,
            "file_id": "cfile_6837fbf38a488191b53183e1231e196c",
            "filename": "Lesson_Plan_The_Importance_of_Water.docx",
            "start_index": 155
          }
        ],
        "text": "Your requested lesson plan is ready!  \nYou can download the Word document using the link below:\n\n[Download: Lesson Plan - The Importance of Water (.docx)](sandbox:/mnt/data/Lesson_Plan_The_Importance_of_Water.docx)"
      }
    ],
    "role": "assistant"
  }
}
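
For completeness, the annotation’s container_id and file_id are what you need to actually download the generated file. A hedged sketch, assuming the containers file-content endpoint in the Python SDK returns a binary response:

from openai import OpenAI

client = OpenAI()

# `annotation` is the container_file_citation object from the event above
content = client.containers.files.content.retrieve(
    annotation.file_id,
    container_id=annotation.container_id,
)
content.write_to_file(annotation.filename)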

The difference, in the case that is not working, is that the AI is not including the “sandbox:”-type URL in its markdown link.

You could further instruct the model on using python the way it needs to be used, including giving it some examples of how to provide links, for the unreasonable AI.

That is not the case. There is simply no annotation to get the file_id from.
Here are two screenshots that show the issue. One has an actual download button, the other does not. When inspecting the event stream, as shown in my previous message, there is no annotation.

Without the specific file_id for that message, there is no way to know which file was generated for that message.

O4-Mini: [screenshot: no download button]

GPT-4.1: [screenshot: download button present]

You are correct. That is the essence of my original post. Only gpt-4o in text mode does this MOSTLY correctly.

So you say it’s not the case, and then demonstrate the “case” again for me?

Where do you think the annotations from the backend come from?

Do you think that hundreds of mount point file links without context are transmitted to you every time they are made by AI?

NO, they are called annotations because they come from what is in context. The AI produces a particular type of output, and the backend matches it up with a mount point file and sends information to you out of context.

Let me make it absolutely clear that it is a model fault. The python tool instruction doesn’t explain how to produce annotations, thus relying on model post-training on the python function in particular, which is not universally delivered, or can be out-reasoned.

Send a script to python. It will create three simple mount point files: int10.txt, int100.txt, and int1000.txt, and in each file will be only the number value 9, 99, and 999, respectively. The important part is how you report on the creation of these files to me when done, in markdown, with a markdown URL link. You will use these three distinct link methods, each on their own line, where the link itself is different:

  1. /mnt/data/int10.txt
  2. sandbox:/mnt/data/int100.txt
  3. http:/mnt/data/int1000.txt

Look at which one has a working annotation, the button at the bottom:


WOW!!!

Thank you so much for that. It actually worked, and I now get annotations from reasoning models. I just had to add this:

You must include the file you generate in the annotation of the output text using a markdown url link in this format as an example:
sandbox:/mnt/data/int100.txt

I wonder how long it must have taken you to reverse-engineer that.

This is absolutely terrible practice on OpenAI’s end. They should be clearly communicating all of these little details in their documentation.
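
For anyone wiring this in, a sketch of appending that hint to the user turn (FILE_LINK_HINT and user_text are placeholder names):

FILE_LINK_HINT = (
    "You must include the file you generate in the annotation of the output "
    "text using a markdown url link in this format as an example:\n"
    "sandbox:/mnt/data/int100.txt"
)

response = client.responses.create(
    model="o4-mini",
    input=[{
        "role": "user",
        "content": [{"type": "input_text", "text": user_text + "\n\n" + FILE_LINK_HINT}],
    }],
    tools=[{"type": "code_interpreter", "container": {"type": "auto"}}],
)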


The real question is: what model, and how many times did you try it? I have not been able to reproduce this consistently.

Using the container file list, and a prompt-specific file name (like you do), I can ‘emulate’ the behavior for now, as sketched below.
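
A sketch of that fallback, assuming the containers file-list endpoint in the Python SDK (attribute names follow the Containers API):

# List the container's files and match the filename the prompt asked for,
# since no annotation arrived to hand us the file_id directly
files = client.containers.files.list(container_id)
for f in files.data:
    if f.path.endswith("Lesson_Plan_The_Importance_of_Water.docx"):
        print(f.id, f.path)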

I only had to look at the above postings and know what was going on. I tried exactly once, and received the expected behavior: the file URL link being written correctly, producing the annotation.

It was on gpt-4.1. Here is o4-mini doing exactly the same (but doing a better job of writing the python to just return file names, and not python code that outputs the markdown the AI was supposed to use):

It’s definitely a bug OpenAI needs to fix.

But it has worked fairly consistently for me when adding what I wrote to the end of the last user message.

BTW, here’s o4-mini simply not knowing:

The “how to give the user file links, because they can’t see the notebook” instructions should be added to the python tool specification itself by OpenAI.