How to Reliably Use the Output from the Code Interpreter

I’m currently building an agent using the OpenAI Agents SDK, and the agent uses the code interpreter tool.

The problem is that I want to use the results from the code interpreter exactly as they are, but the agent seems to paraphrase or modify the output instead of using it directly.

Is there any way to ensure that the agent uses the raw output from the code interpreter without altering it?

In other words, is there a way to directly use the values computed by the code interpreter as-is?

Hi @dlfdltkatk

On the Python Agents SDK, you should be able to do that by examining new_items, the list of items generated during the agent run. These include new messages, tool calls, and tool outputs.

The key is to read the tool-call items (or their streaming events) rather than the assistant’s prose.

Here’s an example to get you started:

import asyncio

from agents import Agent, Runner, CodeInterpreterTool, ToolCallOutputItem


agent = Agent(
    name="Hello world",
    instructions="You are a helpful agent.",
    tools=[CodeInterpreterTool(tool_config={
        "type": "code_interpreter",
        "container": {"type": "auto"},
    })],
    model="gpt-4o"
)


async def main():
    result = await Runner.run(agent, input="Which element in the fibonacci series is the 11th prime number in the series")
    print(result.final_output)
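    # 1) All raw model responses
    print(result.raw_responses)   # every model response exactly as returned

    # 2) Just the tool-related items
    for item in result.new_items:                  # each is a RunItem
        # Hosted tools such as the code interpreter appear as "tool_call_item";
        # ToolCallOutputItem covers function-tool outputs.
        if isinstance(item, ToolCallOutputItem) or item.type == "tool_call_item":
            print(item.raw_item)                   # unmodified item from the API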


if __name__ == "__main__":
    asyncio.run(main())

Hope this helps.

Thank you for your response.
I attempted to retrieve the output using the following code:

for new_item in result.new_items:
    if (
        new_item.type == "tool_call_item" 
        and new_item.raw_item.type == "code_interpreter_call"
    ):
        ...

However, I found that it does not capture the actual output of the code executed by the code interpreter.
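
One thing that may be worth checking here: in the Responses API, a code_interpreter_call item carries the executed source in code and the execution results in an outputs list. As far as I understand, outputs is only populated when the request asks for it with include=["code_interpreter_call.outputs"]; in the Agents SDK that would presumably go through ModelSettings(response_include=[...]), which is an assumption to verify against the installed SDK version. The sketch below uses a hypothetical helper name, extract_interpreter_logs, and stand-in objects that mimic the assumed item shape so it runs without an API call.

```python
from types import SimpleNamespace


def _get(obj, name, default=None):
    """Read a field from either a dict key or an object attribute."""
    if isinstance(obj, dict):
        return obj.get(name, default)
    return getattr(obj, name, default)


def extract_interpreter_logs(new_items):
    """Return (code, [log strings]) for each code interpreter call.

    Assumption: run items expose `type` and `raw_item`, and the raw item
    follows the Responses API `code_interpreter_call` shape.
    """
    results = []
    for item in new_items:
        if _get(item, "type") != "tool_call_item":
            continue
        raw = _get(item, "raw_item")
        if _get(raw, "type") != "code_interpreter_call":
            continue
        # `outputs` may be None when outputs were not requested.
        outputs = _get(raw, "outputs") or []
        logs = [_get(o, "logs") for o in outputs if _get(o, "type") == "logs"]
        results.append((_get(raw, "code"), logs))
    return results


# Stand-in items (not real SDK objects) so the sketch runs offline.
fake_items = [
    SimpleNamespace(
        type="tool_call_item",
        raw_item=SimpleNamespace(
            type="code_interpreter_call",
            code="print(6 * 7)",
            outputs=[SimpleNamespace(type="logs", logs="42\n")],
        ),
    ),
    SimpleNamespace(type="message_output_item", raw_item=SimpleNamespace(type="message")),
]

for code, logs in extract_interpreter_logs(fake_items):
    print(code, logs)
```

With a real run, the same helper would be called as extract_interpreter_logs(result.new_items). An empty log list for every call is a hint that the outputs were not included in the response.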

To work around this, I’m currently having the code interpreter explicitly write results to a file, and then I try to read that file afterward.
But I’ve been running into frequent issues where the files created by the code interpreter are not retrievable via client.containers.files.list.

Do you happen to know any potential solutions or recommended approaches for handling this?
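
A possible angle on the file problem, offered as a sketch rather than a confirmed fix: when the code interpreter writes a file and the model mentions it, the assistant message usually carries container_file_citation annotations with container_id, file_id and filename. Reading those directly avoids guessing which container to list. Containers created with "auto" are also temporary as far as I know, so a listing made long after the run can come back empty. The helper name collect_container_files is hypothetical, and the stand-in objects only mimic the assumed annotation shape.

```python
from types import SimpleNamespace


def collect_container_files(new_items):
    """Return (container_id, file_id, filename) tuples cited in assistant messages.

    Assumption: message items have type "message_output_item" and their
    raw_item.content parts expose an `annotations` list.
    """
    found = []
    for item in new_items:
        if getattr(item, "type", None) != "message_output_item":
            continue
        raw = getattr(item, "raw_item", None)
        for part in getattr(raw, "content", None) or []:
            for ann in getattr(part, "annotations", None) or []:
                if getattr(ann, "type", None) == "container_file_citation":
                    found.append((ann.container_id, ann.file_id, ann.filename))
    return found


# Stand-in message item (not a real SDK object) so the sketch runs offline.
fake_message = SimpleNamespace(
    type="message_output_item",
    raw_item=SimpleNamespace(
        content=[
            SimpleNamespace(
                type="output_text",
                text="Saved the table to result.csv",
                annotations=[
                    SimpleNamespace(
                        type="container_file_citation",
                        container_id="cntr_example",
                        file_id="cfile_example",
                        filename="result.csv",
                    )
                ],
            )
        ]
    ),
)

for container_id, file_id, filename in collect_container_files([fake_message]):
    # Assumed download call, to be checked against the installed openai client:
    # client.containers.files.content.retrieve(file_id, container_id=container_id)
    print(container_id, file_id, filename)
```

Downloading right after the run finishes, using the IDs from the annotations, is probably more reliable than listing the container later.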