What is the best practice for keeping containers alive?

dchu · July 22, 2025, 3:54am

Hi community

I am using OpenAI’s Agents SDK and creating a code_interpreter or file_search tool bound to a container like this:

# Create a container
container = await openai_client.async_openai.containers.create(**container_params)

tool_config = {
    "type": "code_interpreter",
    "container": container.id,
}
tool = CodeInterpreterTool(tool_config=tool_config)

agent = Agent(
    name=f"Agent-{assistant.name}",
    instructions=instructions,
    model=openai_model,
    tools=[tool],
    model_settings=ModelSettings(
        temperature=assistant.temperature,
        top_p=assistant.top_p,
        tool_choice="auto",
    ),
)
result = await Runner.run(agent, user_input)  # Perform agent operations (user_input is a placeholder)

To prevent the container from expiring (since we want to reuse the previous_response_id in subsequent calls), we attempt to keep it alive by periodically retrieving it:

# Every 5 minutes, ping the container to keep it alive
await openai_client.async_openai.containers.retrieve(container.id)
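
For readers who want to reproduce the ping, here is a minimal sketch of a background keep-alive loop. The names keep_alive, interval_seconds and max_pings are hypothetical, and the retrieve callable is assumed to be something like the async containers.retrieve method used above. It only illustrates the scheduling; it does not show that pinging extends a container's total lifetime.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def keep_alive(
    retrieve: Callable[[str], Awaitable[object]],
    container_id: str,
    interval_seconds: float = 300.0,
    max_pings: Optional[int] = None,
) -> int:
    """Ping at least once, then every interval_seconds; return the ping count.

    Runs until max_pings is reached, the task is cancelled, or retrieve raises.
    """
    pings = 0
    while True:
        # Assumption: any container operation refreshes last_active_at.
        await retrieve(container_id)
        pings += 1
        if max_pings is not None and pings >= max_pings:
            return pings
        await asyncio.sleep(interval_seconds)
```

It could be started alongside the agent with something like asyncio.create_task(keep_alive(openai_client.async_openai.containers.retrieve, container.id)) and cancelled when the session ends.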

However, it appears that even with periodic retrieve() calls, the container may still expire after a certain period (in some observed cases, possibly around 24 hours).

Is there a recommended way to extend the lifetime of a container?
Is periodic retrieve() sufficient to prevent timeout?
Any best practices for long-lived agents or stateful workflows?

Thank you!

_j · July 22, 2025, 5:48am

It would seem the most reliable way would be to make a separate API call with a cheap model that can still write code, set top_p to 0.0, reference the container ID in the tool parameter, and ask for a very simple zero-impact script, such as “Give me your Python notebook environment’s system time by immediately sending a script to the python tool”, every 10 minutes or so.
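
A rough sketch of what that keep-alive request could look like, built as plain keyword arguments for the Responses API. The helper name build_keepalive_request is hypothetical, the model name is only a placeholder for whichever cheap model supports the tool, and tool_choice="required" is my own assumption for forcing a tool call rather than a text reply.

```python
from typing import Any, Dict

KEEPALIVE_PROMPT = (
    "Give me your Python notebook environment's system time "
    "by immediately sending a script to the python tool."
)


def build_keepalive_request(container_id: str, model: str = "gpt-4.1-nano") -> Dict[str, Any]:
    """Build keyword arguments for a minimal Responses API call against a container."""
    return {
        "model": model,  # assumption: placeholder for a cheap model that can use the tool
        "input": KEEPALIVE_PROMPT,
        "top_p": 0.0,
        # Reference the existing container so the call runs inside it.
        "tools": [{"type": "code_interpreter", "container": container_id}],
        "tool_choice": "required",  # assumption: force a tool call
    }
```

With an AsyncOpenAI client this would be sent as await client.responses.create(**build_keepalive_request(container.id)), on whatever schedule you choose.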

Since billing is “$0.03” per container, this should be cheaper than losing the session container, not to mention losing the chat’s data and work and being unable to continue. This is probably a pattern OpenAI does not want, but oh well.

Under 500 “nano” tokens, $0.00005 (plus $0.03 to show you)

_j · July 22, 2025, 6:17am

Here’s an idea you can automate: prompt for the exact code, and use structured output to report a failure to maintain notebook state (although you should expect an API error for a missing container ID).

dchu · July 22, 2025, 7:06am

Thanks for the insights you have shared!

But I am not sure whether these approaches also make the container live longer. From the documentation I understand the disclaimer that containers are ephemeral, but I think what we need from OpenAI is more clarity around the actual lifecycle limits of a container. https://platform.openai.com/docs/guides/tools-code-interpreter

Expiration
We highly recommend you treat containers as ephemeral and store all data related to the use of this tool on your own systems.
A container expires if it is not used for 20 minutes. When this happens, using the container in v1/responses will fail. You’ll still be able to see a snapshot of the container’s metadata at its expiry, but all data associated with the container will be discarded from our systems and not recoverable.
Any container operation—like retrieving the container, or adding or deleting files—will automatically refresh the container’s last_active_at time.
You can’t move a container from an expired state back to active. Instead, you must create a new container and re-upload files. State in memory (e.g., Python objects) will be lost.
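
To make the 20-minute rule concrete, here is a small sketch that estimates how much idle time is left from a container's last_active_at value. It assumes last_active_at is a Unix timestamp in seconds; the helper name and the constant are hypothetical, and the result is only an estimate from the documented idle timeout, not a guarantee about any other lifetime limit.

```python
import time
from typing import Optional

IDLE_TIMEOUT_SECONDS = 20 * 60  # from the quoted documentation: 20 minutes unused


def seconds_until_idle_expiry(last_active_at: float, now: Optional[float] = None) -> float:
    """Return seconds left before the idle timeout; zero or negative means likely expired."""
    current = time.time() if now is None else now
    return (last_active_at + IDLE_TIMEOUT_SECONDS) - current
```

For example, a container last active 5 minutes ago would have roughly 900 seconds left before the idle timeout applies.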

So that is why I’m calling retrieve() every 5 minutes to keep the container active and prevent the 20-minute idle timeout. This seems to work in the short term. However, I’ve observed that even with regular activity, containers still seem to expire after a longer duration, possibly around 24 hours. That’s why I need some guidance.
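
Since an expired container cannot be revived, one fallback is to treat expiry as a recoverable event and recreate the container on demand. The sketch below assumes you map the API's expiry error to a marker exception yourself; ContainerExpired, run_with_fresh_container, operation and recreate are all hypothetical names, and recreate is expected to create a new container, re-upload files, and return the new ID. In-memory state from the old container is still lost.

```python
import asyncio
from typing import Awaitable, Callable, Tuple, TypeVar

T = TypeVar("T")


class ContainerExpired(Exception):
    """Hypothetical marker raised by the caller when the API reports an expired container."""


async def run_with_fresh_container(
    container_id: str,
    operation: Callable[[str], Awaitable[T]],
    recreate: Callable[[], Awaitable[str]],
) -> Tuple[T, str]:
    """Run operation(container_id); on expiry, recreate once and retry.

    Returns the operation result and the container ID that was actually used.
    """
    try:
        return await operation(container_id), container_id
    except ContainerExpired:
        # The old container cannot be reactivated, so build a replacement.
        new_id = await recreate()
        return await operation(new_id), new_id
```

The caller should store the returned container ID for later calls, since it may differ from the one passed in.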
