Python Backend File Operations Failing Intermittently

The Python backend used inside ChatGPT is intermittently failing to execute file operations, even though the code itself contains no syntax errors. Functionality that previously worked reliably, including reading and writing JSON, creating backups, and performing atomic file replacement, now regularly fails with no clear error message.

Symptoms

  • Python tool frequently returns the message:
    “The Python code did not successfully execute”

  • The backend fails to write files, copy files, or perform os.replace(...) even though identical code worked for months.

  • There is no actual Python traceback, leaving the user with no actionable debugging information.

  • Failures occur even with very small and simple file operations (JSON dump, text write, backup copy, etc.).

What Used to Work

Until recently, the Python sandbox handled file operations reliably, including:

  • Writing JSON files

  • Creating temporary files and atomically replacing originals

  • Copying files using shutil.copy2

  • Using file paths under /mnt/data/ without errors

  • Running multi-step operations within a single Python exec

All of these operated consistently without intermittent failures.
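The atomic-replacement pattern referred to above looks roughly like this. A minimal sketch, using a temporary directory instead of /mnt/data so it runs anywhere; the filenames are illustrative:

```python
import json
import os
import shutil
import tempfile

def atomic_write_json(path, data):
    """Write JSON to a temp file in the same directory, then
    atomically replace the target so readers never see a partial file."""
    dirname = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=dirname, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
        os.replace(tmp, path)  # atomic on POSIX within one filesystem
    finally:
        if os.path.exists(tmp):  # only left behind if something failed
            os.remove(tmp)

workdir = tempfile.mkdtemp()  # stand-in for /mnt/data
target = os.path.join(workdir, "config.json")
atomic_write_json(target, {"ok": True})

# Backup copy with metadata preserved, as described above
shutil.copy2(target, target + ".bak")
```

The temp file must live in the same directory as the target, since os.replace is only atomic within a single filesystem.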

What Happens Now

Even simple operations like:

import json

with open("/mnt/data/test.json", "w") as f:
    json.dump({"ok": True}, f)

can fail with the generic message above, and no file is created.

Likewise, common tasks such as:

  • shutil.copy2(...)

  • os.replace(...)

  • Writing temp files (/mnt/data/*.tmp)

  • Reading immediately after writing

may fail silently or produce the “did not successfully execute” message.
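Because the tool returns no traceback, one workaround is to wrap each step so any exception is printed in full. A minimal sketch exercising the operations listed above; it uses a temporary directory rather than /mnt/data so it is portable:

```python
import json
import os
import shutil
import tempfile
import traceback

workdir = tempfile.mkdtemp()  # stand-in for /mnt/data
path = os.path.join(workdir, "test.json")
results = {}

def attempt(name, fn):
    """Run one step; record 'ok' or the full traceback, instead of
    the tool's opaque 'did not successfully execute' message."""
    try:
        fn()
        results[name] = "ok"
    except Exception:
        results[name] = traceback.format_exc()

def json_write():
    with open(path, "w") as f:
        json.dump({"ok": True}, f)

def backup_copy():
    shutil.copy2(path, path + ".bak")

def temp_then_replace():
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write("{}")
    os.replace(tmp, path)

def read_after_write():
    with open(path) as f:
        json.load(f)  # fails if the replace above left a partial file

for step in (json_write, backup_copy, temp_then_replace, read_after_write):
    attempt(step.__name__, step)

print(results)
```

On a healthy sandbox every entry comes back "ok"; on a broken one this at least shows which step died and why.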

Anyone else have GPTs not working since the 5.1 push?

Not specifically GPTs, but it is the worst this AI has been in a long time. In ChatGPT, only the Python tool is turned on, to minimize distraction, yet even that still costs kilotokens of context.

See if this is too hard a job to understand:

(image)
For the attached product brochure image, I need an accurate translation to English. It is from 1974, at the dawn of digital audio technology, so some of the terms may need careful reading and interpretation to be normalized to today’s standard language about sampling and digitization. Produce:

  1. A German-to-English map of the upper-left diagram
  2. An English translation of the full descriptive text from the left column
  3. The specifications in English

You can use all techniques necessary; the first will be to slice the document to half-width in Python and return those images for high-quality vision. Do not transcribe with code; use your own vision skill.

The lack of newlines is from ChatGPT’s broken copy-paste, even to itself on this retry.

What do I need? An English translation.
What do I not need? Tool calling and made-up filenames in mono font.

But GPT-5.1 only pretends to deliver; there are no links and no completed job:

(I inspected the </> mark: there was some Python code, but it produced a bad sandbox link, and obviously no vision work was done.)

You can see it in the thinking: the input is fuzzy from OpenAI’s vision downsizing, yet the AI won’t slice the image or understand it.

I pushed the AI further to do its job and switched the next reply to 5.1 Pro to see if anything would get done. Did I say I wanted a summary? GD disobedient AI, with its own agenda.

Absolutely insulting time-wasting AI. And yes, a more “thinking” ChatGPT AI can slice and dice images and return zoom-ins to itself.
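For what it’s worth, the half-width slicing the prompt asks for is a few lines of code. A sketch that just computes the two crop boxes; the overlap value and the Pillow usage shown in the comment are illustrative assumptions, not anything the model produced:

```python
def half_width_boxes(width, height, overlap=40):
    """Split an image's area into left/right halves as (x0, y0, x1, y1)
    boxes, with a small horizontal overlap so text at the seam stays
    readable in both halves."""
    mid = width // 2
    left = (0, 0, min(mid + overlap, width), height)
    right = (max(mid - overlap, 0), 0, width, height)
    return left, right

# With Pillow installed, the boxes plug straight into Image.crop
# (filenames are illustrative):
#   from PIL import Image
#   img = Image.open("brochure.png")
#   for i, box in enumerate(half_width_boxes(*img.size)):
#       img.crop(box).save(f"half_{i}.png")
```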


Wind the clock back two years. Chat Completions on the API. gpt-4-turbo with vision. Job done, instant tokens; I don’t ask for code-based image slicing because there is no code interpreter (and thus nothing to deny use of and disobey, as seen in ChatGPT’s internal thinking).

GPT-4-turbo also can do its job:


(API gpt-4-turbo-2024-04-09)

Unlike ChatGPT, which craps all over the output with commentary, formatting, and input-language leaks, making it useless:


(ChatGPT GPT-5 Pro)

So that’s a “this model is garbage” verdict, on a single unchallenging everyday task (besides more ridiculousness I’ve noted elsewhere).

Likely underlying fault: ChatGPT can’t figure out whether to use its “python” tool (the analysis channel: thinking-time code, which is likely where images can be returned to the model) or its “python_user_visible” tool (the commentary channel: code whose output you can observe as a deliverable).