I’m using the Responses API and o3 / o4-mini with the Code Interpreter tool.
When I request document generation, the response says the file was generated and saved to /mnt/data/sample.docx, but the annotations array is empty, and calling the container files API to list files also returns nothing.
This happens about 99% of the time with o3 and o4-mini.
Only gpt-4.1 consistently outputs file IDs.
Is this a bug with o3 and o4-mini and code interpreter?
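For context, a minimal sketch of the kind of request involved (the model, prompt, and tool configuration here are assumptions, using the openai Python SDK):

```python
from openai import OpenAI

client = OpenAI()

# Ask for a .docx to be produced in the Code Interpreter sandbox.
response = client.responses.create(
    model="o4-mini",
    tools=[{"type": "code_interpreter", "container": {"type": "auto"}}],
    input="Create a sample Word document with headings, formatted text, "
          "lists, and a table, and give me the file to download.",
)

print(response.output_text)
```

The assistant message that comes back looks like this: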
```json
{
  "id": "msg_683b2e634f7481a09f23839cf597132207891414bf26b07d",
  "role": "assistant",
  "content": [
    {
      "type": "output_text",
      "text": "I’ve created a sample Word document containing headings, formatted text, lists, and a table. You can download it using the link below:\n\nDownload the sample Word document (sample.docx): /mnt/data/sample.docx"
    }
  ]
}
```
The python tool’s internal instructions do not tell the AI how to produce links that generate annotations.
The reasoning models have not had quality post-training on generating annotation-style links for files at the sandbox mount point.
Solution
If you are using only the Python tool, with no other tools or functions, you can use your first system message to “extend” the tool definition ahead of your own instructions, so that it can plausibly be read as part of the same tool:
### python tool usage notes
- python has hundreds of useful preinstalled modules;
- stdio, print, logs, .show() etc are all for AI consumption only;
- user can only receive *presented* generated file output as a deliverable with a markdown file link or markdown image link (URL sandbox:/mnt/data/...);
- use `python` freely for math, calculating, and tests, for reliable answering;
- state persistence: 20 minutes of user inactivity
---
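A sketch of how this might be wired in as the first system message (assumes the openai Python SDK and a single code_interpreter tool; the model, user prompt, and your own instructions are placeholders):

```python
from openai import OpenAI

client = OpenAI()

# The "extended tool definition" above, placed ahead of your own instructions.
PYTHON_TOOL_NOTES = """### python tool usage notes
- python has hundreds of useful preinstalled modules;
- stdio, print, logs, .show() etc are all for AI consumption only;
- user can only receive *presented* generated file output as a deliverable with a markdown file link or markdown image link (URL sandbox:/mnt/data/...);
- use `python` freely for math, calculating, and tests, for reliable answering;
- state persistence: 20 minutes of user inactivity
---
"""

MY_INSTRUCTIONS = "You produce documents for the user on request."  # placeholder

response = client.responses.create(
    model="o4-mini",
    tools=[{"type": "code_interpreter", "container": {"type": "auto"}}],
    input=[
        {"role": "system", "content": PYTHON_TOOL_NOTES + MY_INSTRUCTIONS},
        {"role": "user", "content": "Create a sample Word document and give me a link."},
    ],
)
```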
If you have a collection of tools and functions being passed to the AI so that “python” is not the last one listed, especially your own functions invoked with parallel multi_tool_use, you’ll need to phrase the tune-up in your own terms instead:
# Responses
## python
- when producing a response for a user after generating python jupyter notebook sandbox files for the user, you must provide a markdown file link to the user, using URL style `sandbox:/mnt/data/...`
You can adapt these ideas to what actually performs for you.
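As a sketch of the multi-tool case, folding that rule into your own instructions alongside code_interpreter and a function of your own (the function tool here is purely illustrative):

```python
from openai import OpenAI

client = OpenAI()

INSTRUCTIONS = """You are a helpful assistant.

# Responses
## python
- when producing a response for a user after generating python jupyter notebook sandbox files for the user, you must provide a markdown file link to the user, using URL style `sandbox:/mnt/data/...`
"""

tools = [
    {"type": "code_interpreter", "container": {"type": "auto"}},
    {   # an illustrative function tool of your own
        "type": "function",
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
]

response = client.responses.create(
    model="o4-mini",
    instructions=INSTRUCTIONS,
    tools=tools,
    input="Make me a sample .docx and link it for download.",
)
```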
What OpenAI must do is allow the developer to change the internal text of all tools in Assistants and Responses, because the built-in instructions are non-performative and general-purpose (and the injection of system messages with counter-intuitive instructions right after internal tool outputs also needs to stop).
I thought I’d add what “python” comes with as its instruction:
When you send a message containing Python code to python, it will be executed in a stateful Jupyter notebook environment. python will respond with the output of the execution or time out after 600 seconds. The drive at '/mnt/data' can be used to save and persist user files. Internet access for this session is disabled. Do not make external web requests or API calls as they will fail.
Users may also refer to this tool as code interpreter.
That’s like saying, “the corner bakery has donuts at $5 a dozen on Fridays.” You still have to figure out that delivering them to the office is expected.
The AI will know how to send code to the tool. It might not make good decisions about when python is useful for a task, or it might misinterpret your request for code as a request to run that code.
The files might have been created by running python code, but no deliverable was captured for you to supply to the user via the annotations return parameter.
A markdown link written the way the instructions describe will be scraped out and linked to the file ID within the container.
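When the link does get written that way, the citation appears as an annotation on the output text. A sketch of collecting the file IDs (attribute names follow the container_file_citation annotation; `response` is from a call like the ones above):

```python
# Pull container file citations out of the assistant message parts.
for item in response.output:
    if item.type == "message":
        for part in item.content:
            if part.type == "output_text":
                for ann in part.annotations:
                    if ann.type == "container_file_citation":
                        print(ann.container_id, ann.file_id, ann.filename)
```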
You could also list all container files while the container is still active (under 20 minutes), and even build a file browser for user downloads.
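A sketch of that second option, calling the container files endpoints directly while the container is still alive (endpoint paths follow the documented Containers API; the container ID placeholder would come from the response’s code interpreter call item or from an annotation, and the file field names are assumptions):

```python
import os
import requests

API_KEY = os.environ["OPENAI_API_KEY"]
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

container_id = "cntr_..."  # placeholder: taken from the response or an annotation

# List files currently present in the container (it expires after ~20 minutes idle).
files = requests.get(
    f"https://api.openai.com/v1/containers/{container_id}/files",
    headers=HEADERS,
).json()

for f in files.get("data", []):
    print(f["id"], f.get("path"))

# Download one file's bytes so you can serve it to the user yourself.
file_id = files["data"][0]["id"]
content = requests.get(
    f"https://api.openai.com/v1/containers/{container_id}/files/{file_id}/content",
    headers=HEADERS,
)
with open("sample.docx", "wb") as out:
    out.write(content.content)
```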