I came across an example in the “GPT-4.1 Prompting Guide” that recommends “Prompting-Induced Planning & Chain-of-Thought” techniques. The provided SWE-bench Verified example uses the Responses API and demonstrates how the model can reason step by step before invoking a tool.
Here’s a snippet from that example:
response = client.responses.create(
    instructions=SYS_PROMPT_SWEBENCH,
    model="gpt-4.1-2025-04-14",
    tools=[python_bash_patch_tool],
    input=f"Please answer the following question:\nBug: Typerror..."
)
response.to_dict()["output"]
And here’s the output:
[{'id': 'msg_67fe92df26ac819182ffafce9ff4e4fc07c7e06242e51f8b',
  'content': [{'annotations': [],
     'text': "Thank you for the report, but “Typerror” is too vague for me to start debugging right away.\n\n**To make progress, I need to:**\n1. Find the exact error message text (e.g. `'TypeError: ...'`).\n2. Find which file and which line/function/class the error occurred in.\n3. Figure out what triggered the error (test file, usage, reproduction steps).\n4. Find the root cause and details.\n\n**Next steps:**\n- Investigate error/log/test output files for a Python `TypeError` message.\n- Examine the relevant code sections for problematic type usage.\n- If possible, reproduce the bug locally.\n\n**Plan:**\n- First, I will search for test files and log output in the `/testbed` directory that may contain the full error message and stack trace.\n\nLet’s start by listing the contents of the `/testbed` directory to look for clues.",
     'type': 'output_text'}],
  'role': 'assistant',
  'status': 'completed',
  'type': 'message'},
 {'arguments': '{"input":"!ls -l /testbed"}',
  'call_id': 'call_frnxyJgKi5TsBem0nR9Zuzdw',
  'name': 'python',
  'type': 'function_call',
  'id': 'fc_67fe92e3da7081918fc18d5c96dddc1c07c7e06242e51f8b',
  'status': 'completed'}]
Note that the output contains an `output_text` message before the `function_call` item!
How can I achieve similar behavior using the Chat Completions API instead of the Responses API? Are there any best practices or prompt patterns to encourage the model to do some reasoning before calling tools?
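To make the first question concrete, this is the kind of prompt pattern I have in mind: appending an explicit planning instruction to the system prompt. This is a minimal sketch of my own; the `PLANNING_INSTRUCTION` wording is mine, loosely based on the guide's SWE-bench prompt, and `SYS_PROMPT_SWEBENCH` / `python_bash_patch_tool` are the same objects as above:

from openai import OpenAI

client = OpenAI()

# My own wording, loosely based on the guide's SWE-bench system prompt --
# the idea is to ask for a visible plan before any tool call.
PLANNING_INSTRUCTION = (
    "You MUST plan extensively before each function call, and reflect "
    "extensively on the outcomes of the previous function calls. "
    "Write your plan as a normal assistant message before calling any tool."
)

response = client.chat.completions.create(
    model="gpt-4.1-2025-04-14",
    messages=[
        {"role": "system", "content": SYS_PROMPT_SWEBENCH + "\n\n" + PLANNING_INSTRUCTION},
        {"role": "user", "content": "Please answer the following question:\nBug: Typerror..."},
    ],
    tools=[{"type": "function", "function": python_bash_patch_tool}],
)

The hope is that the model emits its plan as visible message content before requesting any tool, just like the Responses API output above.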
Another question: in the Chat Completions API, can a response include both `content` and a `function_call` at the same time? If not, and the first response only returns `content`, how does the Responses API know whether to send that to the user as a final answer or to make another completion call to get the actual tool call?
For comparison, here are the outputs of the two APIs given the same prompt:
import json

response = client.responses.create(
    instructions=SYS_PROMPT_SWEBENCH,
    model="gpt-4.1-2025-04-14",
    tools=[python_bash_patch_tool],
    input=f"Please answer the following question:\nBug: Typerror...",
    temperature=0.0,
)
print(json.dumps(response.to_dict()["output"], indent=2))
[
  {
    "id": "msg_685a9621cebc819d9729f778295afaf90c8bb82f11b4d9b1",
    "content": [
      {
        "annotations": [],
        "text": "Let's begin by clarifying and expanding on the problem statement:\n\n## Step 1: Deeply Understand the Problem\n\nThe only information provided is: \n**Bug: Typerror...**\n\nThis is not enough detail to know:\n- Where the error occurs\n- What the full error message is\n- What code is involved\n\nHowever, a \"TypeError\" in Python usually means that a function or operation was applied to an object of an inappropriate type (e.g., adding a string to an integer, calling a function with the wrong number or type of arguments, etc.).\n\n## Step 2: Codebase Investigation\n\nSince the error is a TypeError, and the only clue is \"Typerror...\", I need to:\n- Search for \"TypeError\" in the codebase and/or test output.\n- Check for any recent test failures or error logs that mention TypeError.\n- Review the test suite to see if any tests are failing with a TypeError.\n\n### Plan:\n1. List the files in the `/testbed` directory to get an overview.\n2. Check for a test runner or test files (e.g., `run_tests.py`, `tests/`).\n3. Run the test suite to see if any TypeError is reported.\n4. If a TypeError is found, read the full error message and traceback to identify the location and cause.\n5. Investigate the relevant code section.\n\nLet's start by listing the files in the `/testbed` directory.",
        "type": "output_text"
      }
    ],
    "role": "assistant",
    "status": "completed",
    "type": "message"
  },
  {
    "arguments": "{\"input\":\"!ls /testbed\"}",
    "call_id": "call_6dDcCLcAPUbFVCrDwfiT35qr",
    "name": "python",
    "type": "function_call",
    "id": "fc_685a962b5bf0819d8acf41c128d399400c8bb82f11b4d9b1",
    "status": "completed"
  }
]
response = client.chat.completions.create(
    model="gpt-4.1-2025-04-14",
    messages=[
        {"role": "system", "content": SYS_PROMPT_SWEBENCH},
        {"role": "user", "content": "Please answer the following question:\nBug: Typerror..."}
    ],
    tools=[{"type": "function", "function": python_bash_patch_tool}],
    temperature=0.0,
)
print(json.dumps(response.choices[0].message.to_dict(), indent=2))
{
  "content": "It looks like your message is incomplete. You mentioned a \"Typerror\" bug, but didn't provide the full error message or any context about where it occurs, what code is involved, or what you were trying to do.\n\nTo help you fix the bug, I need more information. Please provide:\n- The full error message (including the stack trace, if possible)\n- The code snippet or file where the error occurs\n- A description of what you were trying to do when the error happened\n\nOnce you provide these details, I can investigate the issue and guide you through a solution!",
  "refusal": null,
  "role": "assistant",
  "annotations": []
}
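To clarify what I mean in the second question, this is the dispatch loop I imagine sitting on top of a Chat-Completions-style model (purely my own sketch of the logic, not the actual Responses API implementation; `run_tool` is a hypothetical executor):

from openai import OpenAI

client = OpenAI()

messages = [
    {"role": "system", "content": SYS_PROMPT_SWEBENCH},
    {"role": "user", "content": "Please answer the following question:\nBug: Typerror..."},
]

while True:
    response = client.chat.completions.create(
        model="gpt-4.1-2025-04-14",
        messages=messages,
        tools=[{"type": "function", "function": python_bash_patch_tool}],
        temperature=0.0,
    )
    msg = response.choices[0].message
    if not msg.tool_calls:
        # No tool calls requested: is msg.content the final user-facing
        # answer, or planning text that should trigger another call?
        print(msg.content)
        break
    # Tool calls requested: execute each one and feed the result back.
    messages.append(msg)
    for call in msg.tool_calls:
        result = run_tool(call.function.name, call.function.arguments)  # run_tool is hypothetical
        messages.append({"role": "tool", "tool_call_id": call.id, "content": result})

If a response can carry both `content` and `tool_calls`, the reasoning text simply rides along with the tool call; if not, the `break` branch above cannot tell a final answer apart from a plan that precedes a tool call, which is exactly what I am asking about.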