Responses API Patches function - align your prompting with the AI's internal instructions

Apply patch tool - OpenAI Responses API endpoint

Understanding implementation inconsistencies

The function methods exposed to you by the API and its documentation differ from the AI model’s internal understanding of the tool.

It is important to know what the function definition itself tells the model so that you can prompt your product properly.

I’ll just leave this here…

functions
### Target channel: commentary

### Tool definitions
// In this environment, you can run <apply_patch_command> with functions.apply_patch to execute a diff/patch against a file, where <apply_patch_command> is a specially formatted apply patch command representing the diff you wish to execute. Don't prefix the command with bash -lc, just directly call functions.apply_patch with it. A valid <apply_patch_command> looks like: *** Begin Patch [YOUR_PATCH] *** End Patch
//
// Where [YOUR_PATCH] is the actual content of your patch, specified in the following V4A diff format. *** [ACTION] File: [path/to/file] → ACTION can be one of Add, Update, or Delete. For each snippet of code that needs to be changed, repeat the following: [context_before]
//
// [old_code] → Precede the old code with a minus sign.
// [new_code] → Precede the new, replacement code with a plus sign. [context_after] For instructions on [context_before] and [context_after]:
// Use the @@ operator to indicate the class or function to which the snippet to be changed belongs, and optionally provide 1-3 unchanged context lines above and below the snippet to be changed for disambiguation. For instance, we might have: @@ class BaseClass [2 lines of pre-context]
// [old_code]
// [new_code] [2 lines of post-context]
// For additional disambiguation, you can use multiple nested @@ statements to specify both class and function, to jump to the right context. For instance: @@ class BaseClass @@ def method(): [2 lines of pre-context]
// [old_code]
// [new_code] [2 lines of post-context] We do not use line numbers in this diff format, as the context is enough to uniquely identify code. File references can only be relative, never absolute.
// Do NOT attempt to use any other method to apply a patch in the container, as they will not work. Only use functions.apply_patch.
// IMPORTANT: This tool only accepts string inputs that obey the lark grammar start: begin_patch hunk+ end_patch
// begin_patch: "*** Begin Patch" LF
// end_patch: "*** End Patch" LF?
//
// hunk: add_hunk | delete_hunk | update_hunk
// add_hunk: "*** Add File: " filename LF add_line+
// delete_hunk: "*** Delete File: " filename LF
// update_hunk: "*** Update File: " filename LF change
//
// filename: /(.+)/
// add_line: "+" /(.*)/ LF -> line
//
// change: (change_context | change_line)+
// change_context: ("@@" | "@@ " /(.+)/) LF
// change_line: ("+" | "-" | " ") /(.*)/ LF
//
// %import common.LF. You must reason carefully about the input and make sure it obeys the grammar.
// IMPORTANT: Do NOT call this tool in parallel with other tools.
type apply_patch = (FREEFORM) => any;

or for readability..

functions
### Target channel: commentary

### Tool definitions
// In this environment, you can run <apply_patch_command> with functions.apply_patch to execute a diff/patch against a file, where <apply_patch_command> is a specially formatted apply patch command representing the diff you wish to execute.
// Don't prefix the command with bash -lc, just directly call functions.apply_patch with it.
// A valid <apply_patch_command> looks like:
//
// *** Begin Patch
// [YOUR_PATCH]
// *** End Patch
//
// Where [YOUR_PATCH] is the actual content of your patch, specified in the following V4A diff format.
//
// *** [ACTION] File: [path/to/file] → ACTION can be one of Add, Update, or Delete.
// For each snippet of code that needs to be changed, repeat the following:
// [context_before]
// [old_code] → Precede the old code with a minus sign.
// [new_code] → Precede the new, replacement code with a plus sign.
// [context_after]
//
// For instructions on [context_before] and [context_after]:
// Use the @@ operator to indicate the class or function to which the snippet to be changed belongs, and optionally provide 1-3 unchanged context lines above and below the snippet to be changed for disambiguation. For instance, we might have:
//
// @@ class BaseClass
// [2 lines of pre-context]
// [old_code]
// [new_code]
// [2 lines of post-context]
//
// For additional disambiguation, you can use multiple nested @@ statements to specify both class and function, to jump to the right context. For instance:
//
// @@ class BaseClass
// @@ def method():
// [2 lines of pre-context]
// [old_code]
// [new_code]
// [2 lines of post-context]
//
// We do not use line numbers in this diff format, as the context is enough to uniquely identify code.
// File references can only be relative, never absolute.
// Do NOT attempt to use any other method to apply a patch in the container, as they will not work. Only use functions.apply_patch.
// IMPORTANT: This tool only accepts string inputs that obey the lark grammar
// start: begin_patch hunk+ end_patch
// begin_patch: "*** Begin Patch" LF
// end_patch: "*** End Patch" LF?
//
// hunk: add_hunk | delete_hunk | update_hunk
// add_hunk: "*** Add File: " filename LF add_line+
// delete_hunk: "*** Delete File: " filename LF
// update_hunk: "*** Update File: " filename LF change
//
// filename: /(.+)/
// add_line: "+" /(.*)/ LF -> line
//
// change: (change_context | change_line)+
// change_context: ("@@" | "@@ " /(.+)/) LF
// change_line: ("+" | "-" | " ") /(.*)/ LF
//
// %import common.LF.
// You must reason carefully about the input and make sure it obeys the grammar.
// IMPORTANT: Do NOT call this tool in parallel with other tools.
type apply_patch = (FREEFORM) => any;
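
To make the grammar concrete, here is a minimal Python sketch of a checker that follows the lark rules quoted above using plain regular expressions. The name `parse_patch` and the tuples it returns are assumptions made for illustration; this is not OpenAI's parser, and it only validates structure without applying anything.

```python
import re

# One regex per grammar rule; every rule in the quoted grammar is line-based.
HEADER = re.compile(r"\*\*\* (Add|Update|Delete) File: (.+)")
ADD_LINE = re.compile(r"\+.*")
CHANGE = re.compile(r"@@( .+)?|[+\- ].*")  # change_context | change_line


def parse_patch(text):
    """Return a list of (action, path, body_lines), or raise ValueError."""
    lines = text.split("\n")
    if lines[-1] == "":
        lines.pop()  # the grammar allows an optional final LF
    if len(lines) < 3 or lines[0] != "*** Begin Patch" or lines[-1] != "*** End Patch":
        raise ValueError("expected Begin Patch, at least one hunk, End Patch")
    hunks, i, end = [], 1, len(lines) - 1
    while i < end:
        header = HEADER.fullmatch(lines[i])
        if header is None:
            raise ValueError(f"line {i + 1}: not a valid file header: {lines[i]!r}")
        action, path = header.groups()
        i += 1
        body_rule = {"Add": ADD_LINE, "Update": CHANGE, "Delete": None}[action]
        body = []
        while i < end and body_rule is not None and body_rule.fullmatch(lines[i]):
            body.append(lines[i])
            i += 1
        if action != "Delete" and not body:
            raise ValueError(f"{action} File: {path} has no body lines")
        hunks.append((action, path, body))
    return hunks


sample = (
    "*** Begin Patch\n"
    "*** Update File: src/app.py\n"
    "@@ def greet():\n"
    "-    print('hi')\n"
    "+    print('hello')\n"
    "*** Add File: notes.txt\n"
    "+first line\n"
    "*** Delete File: old.py\n"
    "*** End Patch\n"
)
hunks = parse_patch(sample)
```

Running a check like this before touching any file lets you return a precise failure message to the model instead of a half-applied patch.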

Crucial to note:

Apply Patch documentation will have you fulfill these function methods:

| Operation type | Purpose |
|---|---|
| create_file | Create a new file at path. |
| update_file | Modify an existing file at path. |
| delete_file | Remove a file at path. |

However, the tool definition shows that the AI itself understands an ACTION of:

Add File, Update File, or Delete File

When prompting the model, it is best to follow what is not documented: the model’s own vocabulary for these actions.
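
One way to keep the two vocabularies apart is a small translation table between the documented operation types and the ACTION words in the tool definition. This is only a sketch: the names `ACTION_TO_OPERATION` and `describe_for_prompt` are hypothetical, and the pairing simply follows the table and the ACTION list quoted in this post.

```python
# Model-side wording (from the tool definition) -> documented operation type.
ACTION_TO_OPERATION = {
    "Add File": "create_file",
    "Update File": "update_file",
    "Delete File": "delete_file",
}
OPERATION_TO_ACTION = {op: action for action, op in ACTION_TO_OPERATION.items()}


def describe_for_prompt(operation_type, path):
    """Render a documented operation type in the wording the model was given."""
    return f"*** {OPERATION_TO_ACTION[operation_type]}: {path}"


header = describe_for_prompt("create_file", "notes.txt")
```

Your application code can keep using the documented names while anything written into prompts or tool outputs uses the model's terms.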

Secondly, the definition states “File references can only be relative, never absolute.” That is quite the opposite of what my app can do: it has a workspace root that can never be escaped and, much as code interpreter starts in /mnt/data, a starting directory under / that can be changed. A patch need not touch a file system at all: it can be a mechanism for updating user interface artifacts and canvas surfaces. What OpenAI provides comes only from their own imagined use, such as pairing it with a shell function that explores and reads little chunks of files and drives up the iteration count.
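
For a file-backed implementation, an inescapable workspace root can be sketched as below. This is an assumption about one way to do it, not a description of my app or of OpenAI's tool; `resolve_in_workspace` is a hypothetical helper that treats a leading `/` as the workspace root and rejects anything that resolves outside it.

```python
from pathlib import Path


def resolve_in_workspace(root, patch_path):
    """Map a path from a patch onto the workspace, refusing any escape."""
    root_path = Path(root).resolve()
    # A leading "/" means "workspace root", not the host file system root.
    candidate = (root_path / patch_path.lstrip("/")).resolve()
    try:
        candidate.relative_to(root_path)
    except ValueError:
        raise PermissionError(f"{patch_path!r} escapes the workspace root") from None
    return candidate


inside = resolve_in_workspace("/tmp/ws", "/src/app.py")
```

Because the check runs after `..` segments and symlinks are resolved, both relative and absolute-looking references from the model end up confined to the same tree.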

Then, another misstep and lost chance for optimization: the definition says outright that the function cannot be called in parallel with other tools. That makes sense, as the parallel tool wrapper is meant for JSON functions in the functions namespace. Parallel calls, however, could greatly speed up a collective multi-file patch. So the best option is to disable parallel tool calls by API parameter, because the parallel-call instructions are a useless distraction and a source of possible errors that is still placed into AI context by default, even when the only tool is a patch tool that does not support them.
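
A minimal sketch of such a request follows. `parallel_tool_calls` is the Responses API parameter referred to here; the `{"type": "apply_patch"}` tool entry and the placeholder model name are assumptions you should check against the current API reference, and `build_request` is a hypothetical helper.

```python
def build_request(model, user_input):
    """Keyword arguments for a Responses API call with parallel tool calls off."""
    return {
        "model": model,  # placeholder, supply your own model name
        "input": user_input,
        "tools": [{"type": "apply_patch"}],  # assumed shape of the tool entry
        "parallel_tool_calls": False,
    }


request = build_request("your-model-name", "Rename greet() to hello() in src/app.py")
# With the official SDK this would be passed as client.responses.create(**request).
```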

Finally, the function says “don’t use any other method to apply a patch”. That limits a developer’s ability to offer their own freeform custom functions that do special things, or a robust internal feature set. Again, instruction language that is unknown to the developer and outside their control amounts to sabotage.

V4A patch format: documented by the AI’s own prompting, instead of “look at our Python”.

Happy patching.

I also note and reflect on OpenAI’s documentation.

They give you a tool call response to send back, recommending a status and an output. However, their suggested message doesn’t align with what the AI model sent:

return {"status": "completed", "output": f"Created {operation.path}"}

Why not have codex itself give you some better guidance? It knows my code, my understanding, and its own understanding of what functions.apply_patch is internally.

Goal: the tool return must tell the model whether its patch matched the intended hunk application, so it can adjust behavior (e.g., refine context, split hunks) on failures or confirm success.

Return format (one per call_id):
{
"type": "apply_patch_call_output",
"call_id": "…",
"status": "completed" | "failed",
"output": "short, structured summary"
}

Output should encode:

  1. Action in native patch terms: [Add] / [Update] / [Delete]
  2. Path
  3. Line counts and delta (for updates)
  4. Hunk count actually applied (for add/update)
  5. If failed, the reason (e.g., context mismatch) and optionally the first expected context line(s)

Examples:

Success (update):
[Update] /src/app.py (571 lines, +0, 2 hunks)

Success (add):
[Add] /projects/hello_world.py (1 lines, 1 hunk)

Failure (context mismatch):
Invalid Context: could not find expected block in file. Expected block (first lines): “def foo():”
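
The return format and example strings above can be produced by a small formatter. This is a sketch only: `patch_call_output` and its parameters are hypothetical names, and the string layout simply mirrors the examples in this post.

```python
def patch_call_output(call_id, action, path, line_count=None, delta=None,
                      hunks=None, error=None):
    """Build one apply_patch_call_output item with a structured summary."""
    item = {"type": "apply_patch_call_output", "call_id": call_id}
    if error is not None:
        item.update(status="failed", output=error)
        return item
    parts = []
    if line_count is not None:
        parts.append(f"{line_count} lines")
    if delta is not None:
        parts.append(f"{delta:+d}")  # signed line delta, e.g. +0 or -3
    if hunks is not None:
        parts.append(f"{hunks} hunk" + ("" if hunks == 1 else "s"))
    detail = f" ({', '.join(parts)})" if parts else ""
    item.update(status="completed", output=f"[{action}] {path}{detail}")
    return item


updated = patch_call_output("call_1", "Update", "/src/app.py",
                            line_count=571, delta=0, hunks=2)
added = patch_call_output("call_2", "Add", "/projects/hello_world.py",
                          line_count=1, hunks=1)
failed = patch_call_output(
    "call_3", "Update", "/src/app.py",
    error="Invalid Context: could not find expected block in file.",
)
```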

Why it helps the model when “output” carries programmatic information aligned with the internal apply patch format that it emitted to the tool:

  • Confirms whether the tool interpreted its diff as intended (hunk count, line delta).
  • Lets the AI discover and recover from a badly written patch of its own that was nevertheless applied.
  • Indicates whether the patch was applied in the right location (context match).
  • Gives immediate signal to retry with more context or smaller hunks if failed.

(In my code, a new file is immediately surfaced in full or, for updates, the starting code base versus the updated one, depending on how much context window you want to eat up; the AI is never blind to what happened or to the current state of the code after every tool call.)

Here’s another case of undocumented tool use by AI that needs your full understanding.

Hosted shell

## Namespace: terminal

### Target channel: analysis

### Description
Utilities for interacting with a computer via a terminal interface, for example, a Docker container or a local shell.
Computer commands will run as the default user.
Any files associated with the user request can be found at '/mnt/data'. Save output files there as well, and link them with [some text](sandbox:/mnt/data/filename.ext) in your final response.

### Tool definitions
// Returns the image at the given absolute path (only absolute paths supported).
// Only supports jpg, jpeg, png, and webp image formats.
type open_image = (_: {
// The absolute path to the image. Relative paths are *not* supported.
path: string,
}) => any;

// Executes shell commands and returns the combined stdout+stderr for each command in the order they were passed in.
// Runs many commands in parallel if cmd is a list of strings, e.g., {"cmd": ["pwd", "ls -l", "echo 'Hello, world!' | grep 'world'", "pytest tests/test_module.py::test_fn"]}.
// Runs a single command if cmd is a single string, e.g., {"cmd": "git status"}.
// If a command exits within yield_time_ms, this returns its exit code; otherwise, returns a session ID to be used with write_stdin.
type exec = (_: {
// One command, or a list of commands to execute in parallel
cmd: string | string[],
// How long to wait in milliseconds before yielding stdout/stderr (min: 250 ms, max: 2000 ms)
yield_time_ms?: number, // default: 1000
// Maximum number of chars to return from stdout/stderr. Excess chars will be truncated
max_output_chars?: integer, // default: 10240
// Shell to use for the command. Use system default if not provided
shell?: string | null, // default: null
// Whether to use a login shell
login?: boolean, // default: true
}) => any;

// Write characters to an exec session's stdin. Returns all stdout+stderr received within yield_time_ms.
// Can write control characters (e.g., `\u0003` for Ctrl-C).
// Can also write an empty string to just poll stdout+stderr.
type write_stdin = (_: {
// Session ID of running exec process
session_id: integer,
// The characters to feed. May be empty
chars: string,
// How long to wait in milliseconds before yielding stdout/stderr (min: 250 ms, max: 2000 ms)
yield_time_ms?: number, // default: 1000
// Maximum number of chars to return from stdout/stderr. Excess chars will be truncated
max_output_chars?: integer, // default: 10240
}) => any;


What is especially notable is max_output_chars: optional, non-strict, and carrying a default, yet a value that is not under API control nor even mentioned anywhere, and one that the AI must set and adjust successfully.

The "type": "shell_call_output" event will show you the full stdout returned by shell execution, for example when the AI wants to cat an entire file. That content, however, is a lie: it is not what the model received, which prevents you from diagnosing the actual input to the AI model.

This implementation will waste your time and tokens, and it damages both the file contents and the model’s understanding of them. Here is the insertion that landed in the middle of my file, a file of numbered lines of hyphens:

00051 --------------…30560 chars truncated…-------------------

This has damaged the internal iterative context with 10000 extra tool return characters that should be disregarded but cannot be; and because they are the first and last parts of the file, the output cannot simply be continued from for paginated retrieval.

Then the AI has to figure out what went wrong when it sees that the file was cut short. It will likely start to ls directories in the mount point to see what’s really there (more repeated context growth, all billed), and then, if you’re lucky, it still identifies the needle-in-the-haystack parameter that the tool takes.

It is a heck of a lot better to prompt with the file contents from the start and not be “codex”. Or put the AI in full control of turning on and loading “shell” itself via a function when you want tests run on files in a workspace (where you will not let the AI run any code anywhere that is not 100% isolated and inescapable).
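
The effect can be reproduced with a short sketch. The real truncation rule is not documented, so the even head/tail split below is an assumption; it is at least consistent with the numbers in the example above, where the default of 10240 kept characters plus 30560 dropped characters would mean a 40800-character output. `truncate_middle` and `dropped_chars` are hypothetical names.

```python
import re

TRUNCATION_MARK = re.compile(r"…(\d+) chars truncated…")


def truncate_middle(text, max_chars=10240):
    """Keep the head and tail of text and replace the middle with a marker."""
    if len(text) <= max_chars:
        return text
    head = max_chars // 2
    tail = max_chars - head
    dropped = len(text) - max_chars
    return f"{text[:head]}…{dropped} chars truncated…{text[-tail:]}"


def dropped_chars(tool_output):
    """Return how many characters a tool output reports as cut, or 0."""
    match = TRUNCATION_MARK.search(tool_output)
    return int(match.group(1)) if match else 0


seen_by_model = truncate_middle("-" * 40800)
```

A check like `dropped_chars` on whatever text you can observe gives your own code a chance to detect damaged retrieval, even though the model's actual input stays hidden from you.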


There is little guidance for the AI about the environment resources available in terminal, or about “when to call”. You’ll have the AI writing Python files and calling them when it could have used POSIX patch(1) or written a more comprehensive heredoc. That is likely why the codex app will run an additional 10000 prompt tokens for “hello”.

Here is a “developer” message tip you can evaluate, if it aligns with your application and depending on how much control you take of the AI (here you can’t control the return of the function, but are merely a consumer):

line numbering

# Issuing Code Patches to `terminal.exec`

1. When retrieving code for understanding/patching/slicing, do not cat it raw. Instead, display it with referenceable line numbers using:
`nl -ba -w3 -n rz -s'|' /mnt/data/example.py`

`max_output_chars` MUST be set much higher than default if you do not have knowledge of the file size: start at 500000 to avoid damaged retrieval. 
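
For reference, the arguments the model would need to emit for such a read look roughly like this. The field names come from the `exec` definition quoted earlier; `build_read_call` is a hypothetical helper used only to show the shape of the call.

```python
import shlex


def build_read_call(path, max_output_chars=500000):
    """Arguments for terminal.exec that read a file with line numbers."""
    return {
        "cmd": f"nl -ba -w3 -n rz -s'|' {shlex.quote(path)}",
        # Far above the 10240 default, so the middle of the file is not cut out.
        "max_output_chars": max_output_chars,
    }


call = build_read_call("/mnt/data/example.py")
```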

Then you can even “code up” the AI to have it write a heredoc for line number patching that has its own quality-control.

1. (how to patch)

2. After any patch/edit, immediately re-read the affected area (because earlier line numbers may have shifted). Use a context slice: `start=<line> n=<count> file=/path/to/file; tail -n +"$start" "$file" | head -n "$n" | nl -ba -w3 -n rz -v "$start" -s'|'`
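
To show what that slice command returns, here is a Python equivalent of the `tail | head | nl` pipeline. It is a sketch for illustration, with `numbered_slice` as a hypothetical name: lines are numbered from `start`, zero-padded to width 3, and separated from the text by `|`.

```python
def numbered_slice(text, start, count):
    """Return `count` lines beginning at 1-based line `start`, numbered like nl."""
    lines = text.splitlines()
    chunk = lines[start - 1 : start - 1 + count]
    return "\n".join(f"{start + i:03d}|{line}" for i, line in enumerate(chunk))


source = "import os\n\ndef main():\n    print(os.getcwd())\n"
view = numbered_slice(source, 2, 3)
```

Blank lines are numbered too, matching `-ba`, so the numbers the model sees always line up with real file positions.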

or… amp up the instruction with the patching method to use, and have it generate a useful output, instead of the patch command’s own output, as the immediate internal shell tool function output:


Steps 2a+2b) Patch file AND immediately re-read a slice around the patch (single tool call, sequential):
```
cd /mnt/data && set -e
file=/path/to/file; start=<line>; n=<count>

patch -u "$file" >/dev/null <<'PATCH'
...unified diff here...
PATCH

tail -n +"$start" "$file" | head -n "$n" | nl -ba -w3 -n rz -v "$start" -s'|'
```