Code interpreter generated files path bug

Hey,

I am using the Responses API with the Code Interpreter tool via the Agents SDK (although I think this is a general OpenAI API problem, not an SDK issue).

I can see that the agent runs code that generates and saves multiple files (please see the attached screenshot). When I then use the Containers API to list the files in the container, I see a single file whose path property is a concatenation of multiple file paths. I am also unable to fetch this file using https://api.openai.com/v1/containers/{container_id}/files/{file_id}/content, as it says the file doesn’t exist:

{
  "error": {
    "message": "File cfile_[REDACTED] not found on container cntr_[REDACTED],
    "type": "invalid_request_error",
    "param": null,
    "code": null
  }
}

Here is the output of listing files in container:

[
  {
    "id": "cfile_[REDACTED]",
    "bytes": null,
    "container_id": "cntr_[REDACTED]",
    "created_at": 1752504626,
    "object": "container.file",
    "path": "/mnt/data/precipitation_bar_chart.png  /mnt/data/temperature_line_chart.png",
    "source": "assistant"
  }
]
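For reference, the listing and content fetch above can be reproduced with a minimal sketch like this (plain HTTP against the documented Containers endpoints; `OPENAI_API_KEY` is assumed to be set in the environment):

```python
import os

API_BASE = "https://api.openai.com/v1/containers"

def container_file_content_url(container_id: str, file_id: str) -> str:
    # Build the documented content endpoint for a container file.
    return f"{API_BASE}/{container_id}/files/{file_id}/content"

def fetch_container_file(container_id: str, file_id: str) -> bytes:
    # Returns the raw file bytes; raises on the 404 shown above.
    import requests  # third-party: pip install requests
    resp = requests.get(
        container_file_content_url(container_id, file_id),
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    )
    resp.raise_for_status()
    return resp.content
```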

This seems to be a bug. Has anyone had a similar experience and found a way to work around it?

Another strange output:

[
  {
    "id": "cfile_[REDACTED]",
    "bytes": null,
    "container_id": "cntr_[REDACTED]",
    "created_at": 1752581354,
    "object": "container.file",
    "path": "/mnt/data/pairplot.png",
    "source": "assistant"
  },
  {
    "id": "cfile_[REDACTED]",
    "bytes": null,
    "container_id": "cntr_[REDACTED]",
    "created_at": 1752581299,
    "object": "container.file",
    "path": "ls: cannot access '/mnt/data/*': No such file or directory",
    "source": "assistant"
  }
]

How is it possible that the path property of a container file was set to "ls: cannot access '/mnt/data/*': No such file or directory"?

The Responses sandbox is locked up.

It doesn’t matter that you can see the generated code or the file listing.

The only way you can get to output files is by having the AI first write a sandbox annotation link. You have to describe to the AI how to do this properly, as sandbox:/mnt/data links in markdown output, since the model’s pretraining and the python tool description are not enough.

Then the notebook and its creations expire quickly. You need your own database for retrieving files right after API calls and storing them, complexity only one step below making your own “python” function and serving persistent instances with a Kubernetes hub or what-have-you for free. The AI is not informed that the notebook state (the code written so far and the files) is wiped after expiration, and then it fails in follow-up turns. A bad implementation overall.

Thanks for your help a little while back with the same issue. FYI, appending this to the end of the user’s message has entirely fixed the issue for me:
“If the user asks you to output a file, you must include the file you generate in the annotation of the output text using a markdown URL link in this format, for example: sandbox:/mnt/data/int100.txt”


Hey @_j - thanks for your reply.

I think what you are describing is the idea of accessing generated files via the annotations property of the output. I am aware that this property is sometimes empty, and that I should instruct the model to generate markdown output links (I saw your response in the “O3 / o4-mini unable to output files with Code Interpreter tool” thread).
The problem arises when the agent has access to additional tools that are called immediately after the code interpreter tool. Then the annotation list will be empty anyway.

Because of that, I’ve decided to equip the agent with a share_file_with_user tool which accepts file_name as input. It works like this:

  1. The agent calls share_file_with_user, providing file_name (/mnt/data/xxxx)
  2. The tool uses the Containers API to list the files in the container
  3. It then compares the provided name with the container files' paths
  4. If the file is found (the name matches a path), https://api.openai.com/v1/containers/{container_id}/files/{file_id}/content is called and the file content is saved in my external storage + added as an attachment to the agent response. Otherwise the agent is informed that the file couldn’t be found, and it responds appropriately.

Now, this solution works very well in the sense that it is independent of the annotations property being populated or empty (I simply ignore this property). After using the code_interpreter tool, the agent knows it should use share_file_with_user, and it provides file names in the same form they were used in the executed Python code. The problem is that sometimes container files have a corrupted path property, like in the examples I’ve provided. Then the tool can neither find the files nor download their content.

Example:
(corrupted path) "path": "ls: cannot access '/mnt/data/*': No such file or directory",
or
(multiple paths concatenated as one) "path": "/mnt/data/precipitation_bar_chart.png /mnt/data/temperature_line_chart.png",

Another example:
container files list after run:

[
  {
    "id": "cfile_687754c6ac108191816966120a7da3c4",
    "bytes": null,
    "container_id": "cntr_687754ae04688191b2467b9622f496d801f90aa75b18f351",
    "created_at": 1752650950,
    "object": "container.file",
    "path": "/mnt/data/forecast_Lublin.png  /mnt/data/forecast_Wrocław.png",
    "source": "assistant"
  },
  {
    "id": "cfile_687754c6abf88191936785bf88d4c6c5",
    "bytes": null,
    "container_id": "cntr_687754ae04688191b2467b9622f496d801f90aa75b18f351",
    "created_at": 1752650950,
    "object": "container.file",
    "path": "/mnt/data/forecast_Kraków.png  /mnt/data/forecast_Warsaw.png",
    "source": "assistant"
  },
  {
    "id": "cfile_687754c6abd88191a579b353355c5d0a",
    "bytes": null,
    "container_id": "cntr_687754ae04688191b2467b9622f496d801f90aa75b18f351",
    "created_at": 1752650950,
    "object": "container.file",
    "path": "/mnt/data/forecast_Gdańsk.png  /mnt/data/forecast_Poznań.png",
    "source": "assistant"
  }
]

calling https://api.openai.com/v1/containers/{container_id}/files/{file_id}/content
with container_id=cntr_687754ae04688191b2467b9622f496d801f90aa75b18f351 and file_id=cfile_687754c6abf88191936785bf88d4c6c5 got the response:
"File cfile_687754c6abf88191936785bf88d4c6c5 not found on container cntr_687754ae04688191b2467b9622f496d801f90aa75b18f351." This is invalid, as you can see this file ID being returned by the container list-files operation.

Do you know what could be the cause of this issue?
I think your response targets the issue of annotations not being returned in the API response, while my current issue is container files being created in a corrupted form. Maybe they are connected, although without knowing how container files are created and added to the container, we cannot find a way to improve this via prompting.

A container listing that gives two file names for one file ID seems like a significant bug.

It should not be possible for a developer or their AI to create such an issue by what it generates.

One expects that every sandbox filename the AI emits will not be some ‘virtualization’, ‘instance’ or ‘snapshot’, but will be a valid link to blob storage decoded correctly, or better, would reach into the container and get the current state of the link between ID and the file name of the current revision (if the AI continues revising one file and linking to it).

Reasoning models give you no top_p control to mandate quality recitation of data. You can only trust or hope they work and play back to the user container file names and URLs accurately (and for URLs, commonly don’t).

I would find the simplest pattern of AI usage that can generate these damaged container listings, because replication is essential, and report a bug to OpenAI directly via “help” on the platform site (and if a bot wastes your time without escalation, we can report directly up the chain).

The whole thing is screwy enough that the code interpreter python is only useful for internal thinking, not deliverables, and user input can easily fail if they aren’t themselves Python users ready to debug what the AI has done wrong.

Confirming this issue.

FileListResponse(id='cfile_123', bytes=None, container_id='cntr_abc', created_at=1755337610, object='container.file', path='/mnt/data/hello.csv /mnt/data/hello_plot.png', source='assistant')

So I cannot download it. Moreover, Code Interpreter does not provide annotations if structured output is turned on.

These look like critical bugs. Please resolve them.

It has been over a month and I am still seeing this error. @OpenAI_Support, can you tell us whether you are aware of this and how it can be avoided/fixed?

Code Interpreter code - you can see it is saving a valid path, but later in the container we can see a concatenated one:

ā€œcontainer_filesā€: [
{
ā€œidā€: ā€œcfile_68a6c98e46408191b5718d224f698e9bā€,
ā€œbytesā€: null,
ā€œcontainer_idā€: ā€œcntr_68a6c977b69881918d8fa92863db15890ecfeb9cc546a1acā€,
ā€œcreated_atā€: 1755761038,
ā€œobjectā€: ā€œcontainer.fileā€,
ā€œpathā€: ā€œ/mnt/data/691cee58ad040fe8558fc64078d998f2-crn_9.csv /mnt/data/crn_9.docxā€,
ā€œsourceā€: ā€œassistantā€
},
{
ā€œidā€: ā€œcfile_68a6c97fab0c8191aacd26e0398c6f53ā€,
ā€œbytesā€: 12306,
ā€œcontainer_idā€: ā€œcntr_68a6c977b69881918d8fa92863db15890ecfeb9cc546a1acā€,
ā€œcreated_atā€: 1755761023,
ā€œobjectā€: ā€œcontainer.fileā€,
ā€œpathā€: ā€œ/mnt/data/691cee58ad040fe8558fc64078d998f2-crn_9.csvā€,
ā€œsourceā€: ā€œuserā€
}
],

Error retrieving file content: HTTP 404 - {
  "error": {
    "message": "File cfile_68a6c98e46408191b5718d224f698e9b not found on container cntr_68a6c977b69881918d8fa92863db15890ecfeb9cc546a1ac.",
    "type": "invalid_request_error",
    "param": null,
    "code": null
  }
}

Unfortunately that doesn’t work for my case, which is probably different than yours. I’m calling the responses API with web search and code gen tools, using GPT-5-mini.

Yeah it is ridiculous that this has not been fixed. I first encountered this issue over 3 months ago.

I have not tested this using the new GPT 5 models, however it has worked quite well for me using O3 and O4-mini in the responses API.

Note that even ChatGPT suffers from this problem, where it cannot reliably reference the files it creates.

Just tried it with o4-mini. Still didn’t work for me. The output text says it created the PDF I asked for and provides a download link: sandbox:/mnt/data/states_sas_pop.pdf

However, as usual, the download link is dead. (And no file information was provided in the text output annotations.)

Works quite reliably for me. There is a very intricate series of events sent through the responses API that you need to keep track of.

Note: You must download the file separately using the file_id. This is a raw HTTP (cURL) endpoint, not available through the OpenAI library.

Here is the message output that I get after parsing through the right events:

from reportlab.lib.pagesizes import letter
from reportlab.pdfgen import canvas
from reportlab.lib.units import inch

# File path
file_path = "/mnt/data/ai_workforce.pdf"

# Create canvas
c = canvas.Canvas(file_path, pagesize=letter)
width, height = letter

# Title
c.setFont("Helvetica-Bold", 20)
c.drawString(1*inch, height - 1*inch, "The Role of AI in the Workforce")

# Intro
text = c.beginText(1*inch, height - 1.5*inch)
text.setFont("Helvetica", 12)
text.textLines("""\
Introduction:
AI technologies are transforming workplaces by automating tasks and augmenting human capabilities, leading to increased productivity and innovation.
""")

# Key Impacts
text.moveCursor(0, 10)
text.setFont("Helvetica-Bold", 14)
text.textLine("Key Impacts:")
text.setFont("Helvetica", 12)
text.textLines("""\
• Automation of repetitive tasks reduces costs and minimizes errors.
• Enhanced data analysis supports informed decision-making.
• New AI-driven roles emerge, requiring advanced technical skills.
""")

# Opportunities
text.moveCursor(0, 10)
text.setFont("Helvetica-Bold", 14)
text.textLine("Opportunities:")
text.setFont("Helvetica", 12)
text.textLines("""\
• Upskilling workforce through AI literacy programs.
• Collaboration between humans and AI for creative problem-solving.
• Growth in sectors like healthcare, finance, and manufacturing.
""")

# Challenges
text.moveCursor(0, 10)
text.setFont("Helvetica-Bold", 14)
text.textLine("Challenges:")
text.setFont("Helvetica", 12)
text.textLines("""\
• Job displacement concerns and workforce displacement.
• Ethical considerations: bias, transparency, and accountability.
• Need for robust data privacy and security measures.
""")

# Draw text and save
c.drawText(text)
c.showPage()
c.save()

file_path

I’ve created a concise one-page PDF outlining the role of AI in the workforce, covering key impacts, opportunities, and challenges.

sandbox:/mnt/data/ai_workforce.pdf

<generated_file_id>file-7TmoqUSqXAZKuSzeh5PnQp</generated_file_id><container_id>cntr_68a7592597a48190bec61456cb4f94d6073af5b8f199a966</container_id>

The easiest way to reverse engineer this would be to look at the event stream in Chrome Dev Console on the playground website.

Thanks for trying to help me.

I know about downloading files based on container_id and file_id. I have a python function that does that.

However, I’m not understanding your response above. What is your workflow? (As stated in a previous reply, I’m calling the responses API with web search and code gen tools, using GPT-5-mini.) How did you get that message output? What do you mean by “parsing through the right events”?

My code looks at the “output” items in the “response” object returned by the responses API call.

There are 3 types of output items: reasoning, code interpretation, and text output. If the type of the output content is “output_text”, I look in the “annotations” for an annotation of type “container_file_citation”. If present, I get the IDs needed to download the file. (Of course, the problem is the required annotation is frequently NOT present.)

I also look at the output items of type ‘code_interpreter_call’. For each item I get the container_id, then call a Python function that uses the endpoint to list files in a container:

(https://api.openai.com/v1/containers/{container_id}/files)

The returned list object has a content.data file list object that includes ‘bytes’, ‘id’ and ‘path’. Frequently there are items with a correct-sounding (sandbox /mnt) path to a file and an id (file_id), but the bytes entry will be ‘None’, which means any attempt to download the file (using the appropriate endpoint) will fail. I’ve never been able to retrieve a desired output file this way.

I don’t know where else to look for file download info.

Happy to help.

I cannot share my direct code as it is part of a larger agent harness. However I can provide a few snippets.

The key difference is that I am using the responses API with stream=True. I am not sure how this would work with stream=False.

# ✅ Correct approach - catch files while they exist
import os
import requests
from openai import OpenAI

openai_api_key = os.environ["OPENAI_API_KEY"]
client = OpenAI()

def download_file_immediately(container_id, file_id, filename):
    response = requests.get(
        f"https://api.openai.com/v1/containers/{container_id}/files/{file_id}/content",
        headers={"Authorization": f"Bearer {openai_api_key}"},
    )
    return response.content  # This actually works when called immediately

stream = client.responses.create(..., stream=True)

for event in stream:
    if event.type == 'response.output_item.done':
        if (hasattr(event, 'item') and
                event.item.type == 'message' and
                hasattr(event.item, 'content')):
            for content_item in event.item.content:
                if hasattr(content_item, 'annotations'):
                    for annotation in content_item.annotations:
                        if annotation.type == 'container_file_citation':
                            # Download NOW, not later
                            download_file_immediately(
                                annotation.container_id,
                                annotation.file_id,
                                annotation.filename,
                            )

I can think of three reasons why your files have bytes: None:

  1. Still being written
  2. Already cleaned up
  3. Temporary artifacts

Don’t list container files. Catch citations during streaming and download immediately.

As far as I can tell, your code is functionally the same as mine for processing the annotated files, just a streaming equivalent. If an annotation is returned, my code never fails to download the file, so I don’t think timing factors can explain streaming vs. sync differences.

The issue about bytes: None doesn’t apply to the part of my code that handles annotated files, only when I look in the part of the response object that concerns the code_interpretation files, which shouldn’t even be necessary if annotations are properly returned in the text output section of response.output.

So I can only conclude that the API is not properly returning annotations every time it should. There are a number of posters here with the same complaint. See this thread:

https://community.openai.com/t/missing-file-annotation-for-subsequent-files-in-code-interpreter/1351340

In particular note all the thread links in the second comment (mine) of that thread.

I appreciate your help, but I suspect there is some difference between our software (e.g., agentic flow vs. single turn) that probably explains why you always get the annotations.

I chatted with the support bot today, and after a long drawn-out conversation it stated that the problem I (and others) are experiencing is a “limitation” of the Responses/Code Interpreter system.

If you still believe I’m just missing your point, please discuss further. My mind is open!

Yeah you may be right.

I just know that it works quite reliably for me.

The only thing I remember when I was debugging this is that, there are different types of annotations you receive, some of which are empty, duplicate or just straight up wrong. Another thing might be that you have two tools vs my one?

You may want to do a really simple test with a streaming version as that alone may solve your problem.

Either way, I hope it gets addressed by @OpenAI_Support.

Good luck!

Just an FYI: I seem to have found a workaround for the missing annotation cases. See here:

https://community.openai.com/t/missing-file-annotation-for-subsequent-files-in-code-interpreter/1351340/7