Reliably retrieving code interpreter files from the container?

Is there any way to directly access the files created at /mnt/data in the code interpreter sandbox or otherwise reliably retrieve files created by the code interpreter?

For example, it would be great to create a plot, direct the llm to save a csv used to create the plot, and be able to retrieve both.

The docs claim that:

When running Code Interpreter, the model can create its own files. For example, if you ask it to construct a plot, or create a CSV, it creates these images directly on your container. When it does so, it cites these files in the annotations of its next message. Here’s an example:

{ “id”: “msg_682d514e268c8191a89c38ea318446200f2610a7ec781a4f”, “content”: [ { “annotations”: [ { “file_id”: “cfile_682d514b2e00819184b9b07e13557f82”, “index”: null, “type”: “container_file_citation”, “container_id”: “cntr_682d513bb0c48191b10bd4f8b0b3312200e64562acc2e0af”, “end_index”: 0, “filename”: “cfile_682d514b2e00819184b9b07e13557f82.png”, “start_index”: 0 } ], “text”: “Here is the histogram of the RGB channels for the uploaded image. Each curve represents the distribution of pixel intensities for the red, green, and blue channels. Peaks toward the high end of the intensity scale (right-hand side) suggest a lot of brightness and strong warm tones, matching the orange and light background in the image. If you want a different style of histogram (e.g., overall intensity, or quantized color groups), let me know!”, “type”: “output_text”, “logprobs”: } ], “role”: “assistant”, “status”: “completed”, “type”: “message” }

You can download these constructed files by calling the get container file content method.

However I’ve found that whether or not these files are actually cited or are otherwise available after the run is a bit of a crapshoot.

If they’re never cited, they also don’t show up when I run https://api.openai.com/v1/containers/{container_id}/files to list the files on the active container.

So my confusion is in how the backend determines whether or not to “cite” the file and whether or not there is anything I can do to make this happen more reliably. Is there a special place or format I should be directing the llm to save these?

As an example:

# Save to CSV in sandbox
csv_path = '/mnt/data/tortuosity_synthetic_data.csv'
tortuosity.to_csv(csv_path, index=False)

# Plot
plt.figure(figsize=(10,4))
plt.plot(tortuosity['md'], tortuosity['tortuosity_deg_per_30m'], color='blue', linewidth=1.2)
plt.title('Tortuosity Synthetic Data')
plt.xlabel('Measured Depth (m)')
plt.ylabel('Dogleg Severity (° / 30 m)')
plt.grid(True, which='both', linestyle='--', alpha=0.4)
plt.tight_layout()
plt.show()

print(f"CSV saved to: {csv_path}")

Outputs

/home/sandbox/.local/lib/python3.11/site-packages/pandas/core/internals/blocks.py:2323: RuntimeWarning: invalid value encountered in cast
  values = values.astype(str)

[image]

CSV saved to: /mnt/data/tortuosity_synthetic_data.csv

This correctly cited the produced image, which I was able to retrieve, however the saved tortuosity_synthetic_data.csv was never cited, nor was it available through a curl https://api.openai.com/v1/containers/(randomcontainernumber)/files hit, only the image produced with plt.show() was available.

4 Likes

+1 on this, facing the same issue. To me, any generated files should be available in the container no matter if the model is annotating them in the response.

Related to this: I’d like to re-use generated files across different sessions. Adding in the container_id when making a Responses API request (reference) allows for the re-use of a previously created container. However, even IF files are present, the model is unaware of them. E.g. asking it to print out the first 5 records, it will ask to first upload a file.

This seems inconsistent, as if you include file_ids without specifically telling the model about them, it is aware of their presence and loads them in without any issue.

Considering the above, it’s unclear to me what the benefit would be of reusing a previously created container by specifying the ID (for follow-up messages on the same thread, the auto setting automatically reuses the container, so no issues for this case).

1 Like

Hi @Brett_AB , I am facing the exact same issue. You are using the agents sdk right? I find that when my agent uses the CodeInterpreterTool, it does not actually create files, and they are not shown neither in the annotations of the output or from the list container files endpoint. However, I did try directly with the responses api and it did work as intended. Its kind of frustrating that it would work for responses and not agents sdk, Id like to stay consistent in my codebase and not have to create a custom @function_tool that calls responses api.