Reliably retrieving code interpreter files from the container?

Is there any way to directly access the files created at /mnt/data in the code interpreter sandbox or otherwise reliably retrieve files created by the code interpreter?

For example, it would be great to have the model create a plot, direct the LLM to save the CSV used to create that plot, and be able to retrieve both.

The docs claim that:

When running Code Interpreter, the model can create its own files. For example, if you ask it to construct a plot, or create a CSV, it creates these images directly on your container. When it does so, it cites these files in the annotations of its next message. Here’s an example:

{
  "id": "msg_682d514e268c8191a89c38ea318446200f2610a7ec781a4f",
  "content": [
    {
      "annotations": [
        {
          "file_id": "cfile_682d514b2e00819184b9b07e13557f82",
          "index": null,
          "type": "container_file_citation",
          "container_id": "cntr_682d513bb0c48191b10bd4f8b0b3312200e64562acc2e0af",
          "end_index": 0,
          "filename": "cfile_682d514b2e00819184b9b07e13557f82.png",
          "start_index": 0
        }
      ],
      "text": "Here is the histogram of the RGB channels for the uploaded image. Each curve represents the distribution of pixel intensities for the red, green, and blue channels. Peaks toward the high end of the intensity scale (right-hand side) suggest a lot of brightness and strong warm tones, matching the orange and light background in the image. If you want a different style of histogram (e.g., overall intensity, or quantized color groups), let me know!",
      "type": "output_text",
      "logprobs": []
    }
  ],
  "role": "assistant",
  "status": "completed",
  "type": "message"
}

You can download these constructed files by calling the get container file content method.
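For reference, here is roughly how I do that retrieval over raw HTTP — a minimal sketch, assuming the "get container file content" method maps onto GET /v1/containers/{container_id}/files/{file_id}/content and that the list endpoint returns the standard envelope with a data array (the field names on the file objects are my guess):

import os
import requests

API_KEY = os.environ["OPENAI_API_KEY"]
HEADERS = {"Authorization": f"Bearer {API_KEY}"}
BASE = "https://api.openai.com/v1"

def list_container_files(container_id):
    # List the files currently visible on the container.
    # Assumes the usual OpenAI list envelope with a "data" array.
    resp = requests.get(f"{BASE}/containers/{container_id}/files", headers=HEADERS)
    resp.raise_for_status()
    return resp.json()["data"]

def download_container_file(container_id, file_id, dest_path):
    # Assumption: the "get container file content" method is exposed at
    # /containers/{container_id}/files/{file_id}/content.
    url = f"{BASE}/containers/{container_id}/files/{file_id}/content"
    resp = requests.get(url, headers=HEADERS)
    resp.raise_for_status()
    with open(dest_path, "wb") as fh:
        fh.write(resp.content)

# Usage: grab everything the container currently exposes.
container_id = "cntr_..."  # placeholder
for f in list_container_files(container_id):
    download_container_file(container_id, f["id"], f["id"])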

However, I’ve found that whether these files are actually cited or otherwise made available after the run is a bit of a crapshoot.

If they’re never cited, they also don’t show up when I call GET https://api.openai.com/v1/containers/{container_id}/files to list the files on the active container.

So my confusion is about how the backend decides whether or not to “cite” a file, and whether there is anything I can do to make this happen more reliably. Is there a special location or format I should be directing the LLM to use when saving these files?
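When citations do show up, I pull them out of the raw response JSON like this (a sketch mirroring the annotation shape in the docs excerpt above) and then fetch each file with the container-file-content call:

def collect_cited_files(response_json):
    # Walk the output messages and collect every container_file_citation
    # as (container_id, file_id, filename) so the files can be downloaded.
    cited = []
    for item in response_json.get("output", []):
        if item.get("type") != "message":
            continue
        for content in item.get("content", []):
            for ann in content.get("annotations", []) or []:
                if ann.get("type") == "container_file_citation":
                    cited.append((ann["container_id"], ann["file_id"], ann["filename"]))
    return cited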

As an example:

import matplotlib.pyplot as plt

# `tortuosity` is a pandas DataFrame built earlier in the session
# (columns: 'md' and 'tortuosity_deg_per_30m').

# Save to CSV in sandbox
csv_path = '/mnt/data/tortuosity_synthetic_data.csv'
tortuosity.to_csv(csv_path, index=False)

# Plot
plt.figure(figsize=(10, 4))
plt.plot(tortuosity['md'], tortuosity['tortuosity_deg_per_30m'], color='blue', linewidth=1.2)
plt.title('Tortuosity Synthetic Data')
plt.xlabel('Measured Depth (m)')
plt.ylabel('Dogleg Severity (° / 30 m)')
plt.grid(True, which='both', linestyle='--', alpha=0.4)
plt.tight_layout()
plt.show()

print(f"CSV saved to: {csv_path}")

Outputs

/home/sandbox/.local/lib/python3.11/site-packages/pandas/core/internals/blocks.py:2323: RuntimeWarning: invalid value encountered in cast
  values = values.astype(str)

[image]

CSV saved to: /mnt/data/tortuosity_synthetic_data.csv

This run correctly cited the produced image, which I was able to retrieve. However, the saved tortuosity_synthetic_data.csv was never cited, nor did it show up when I curled https://api.openai.com/v1/containers/{container_id}/files; only the image produced with plt.show() was available.


+1 on this, facing the same issue. To me, any generated files should be available in the container regardless of whether the model annotates them in the response.

Related to this: I’d like to re-use generated files across different sessions. Passing the container_id when making a Responses API request (reference) allows a previously created container to be re-used. However, even IF files are present, the model is unaware of them. E.g., if you ask it to print out the first 5 records, it will ask you to upload a file first.

This seems inconsistent: if you include file_ids without specifically telling the model about them, it is aware of their presence and loads them without any issue.

Considering the above, it’s unclear to me what the benefit of reusing a previously created container by specifying its ID would be (for follow-up messages on the same thread, the auto setting reuses the container automatically, so there is no issue in that case).
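For concreteness, these are the two configurations I’m comparing — a sketch only, assuming the tool config shapes from the Responses API reference (a bare container ID vs. an auto container with file_ids):

from openai import OpenAI

client = OpenAI()

# Variant A: reuse an existing container by ID. The files are still there,
# but the model doesn't act as if it knows about them.
resp_a = client.responses.create(
    model="gpt-4.1",
    tools=[{"type": "code_interpreter", "container": "cntr_..."}],  # placeholder ID
    input="Print the first 5 records of the CSV in /mnt/data.",
)

# Variant B: let the API create a container and attach uploaded files by ID.
# Here the model picks the files up without being told about them.
resp_b = client.responses.create(
    model="gpt-4.1",
    tools=[{
        "type": "code_interpreter",
        "container": {"type": "auto", "file_ids": ["file_..."]},  # placeholder ID
    }],
    input="Print the first 5 records of the attached CSV.",
)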


Hi @Brett_AB, I am facing the exact same issue. You are using the Agents SDK, right? I find that when my agent uses the CodeInterpreterTool, it does not actually create files, and they show up neither in the annotations of the output nor in the list container files endpoint. However, I did try directly with the Responses API and it worked as intended. It’s kind of frustrating that it works for Responses and not the Agents SDK; I’d like to stay consistent in my codebase and not have to create a custom @function_tool that calls the Responses API.
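For what it’s worth, this is roughly the shape of my Agents SDK setup — a sketch only; the CodeInterpreterTool constructor (the tool_config keyword and its dict shape) is my assumption about the current openai-agents API, so check your installed version’s signature:

import asyncio
from agents import Agent, CodeInterpreterTool, Runner

agent = Agent(
    name="analyst",
    instructions="Use the code interpreter; save any files you create under /mnt/data.",
    tools=[
        # Assumption: tool_config takes the same dict the Responses API expects.
        CodeInterpreterTool(
            tool_config={"type": "code_interpreter", "container": {"type": "auto"}}
        )
    ],
)

async def main():
    result = await Runner.run(agent, "Generate a small synthetic CSV and save it.")
    print(result.final_output)

asyncio.run(main())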

+1 same frustration as @Brett_AB


The issue is that:

  • The tool is implemented and described poorly
  • You are blocked from improving the internal tool language
  • Models assumed to be “trained” are not trained
  • Notebook state is self-deleting
  • You have no persistent file or image ability at all
  • A “container” is an internal convention you cannot access
  • A file listing method is incorrectly described
  • Files are locked behind only being “input” or “output”
  • Files are now further locked behind never being available unless annotated by AI
  • Files are also ephemeral, blob storage
  • The way to have the AI generate annotations is never described to you or the AI
  • The user invisibility of the code itself is not made apparent to the AI
  • Modules are loaded in incompatible combinations, with methods that can never work
  • Libraries expose show() and display() methods that can never display, because there is no presentation layer
  • The AI has no information about library modules it can employ, and must write code it will never autonomously write just to find out about them.
  • …

I could go on; this is barely a crib sheet for “why every feature and every internal tool stinks by design, and you should give up on Responses”.

I don’t even need to remark on the stupidity of an AI that loops on writing a 2+2 script because it thinks the notebook needs testing, rather than that its code sucks.


The solution is that you have to “system prompt” the AI that a markdown web link must be created for every file produced for the user, and that the URL written must be of the form:

[file_name.txt](sandbox:/mnt/data/file_name.txt)

Thus the chat gets infected with undesired output at your expense, when, with a non-sucky code tool, you could have a UI that natively shows newly appearing files, even in a file-system browser.
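Concretely, that means bolting something like this onto your own instructions and scraping the links back out yourself — a sketch; the instruction text, regex, and helper name are mine:

import re

SYSTEM_ADDENDUM = (
    "Whenever you create a file for the user with the code interpreter, save it "
    "under /mnt/data and include a markdown link of the form "
    "[file_name.txt](sandbox:/mnt/data/file_name.txt) in your reply."
)

SANDBOX_LINK = re.compile(r"\[([^\]]+)\]\(sandbox:(/mnt/data/[^)\s]+)\)")

def extract_sandbox_paths(message_text):
    # Returns (link text, /mnt/data path) pairs, which can then be matched
    # against the container's file listing and downloaded.
    return SANDBOX_LINK.findall(message_text)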
