hey all, honestly I was pretty underwhelmed with GPT-5 at first when I used it via the API. It felt slow, and the outputs weren't great. But then I found this prompting guide and realized this model is very adaptive and requires very specific instructions.
That guide, however, was a bit tough to parse, so I simplified it into my own version: here's the link. I cover ~18 approaches there, but here are the ones that made the biggest difference for me:
- Set lower reasoning effort for speed: there's a new value, reasoning_effort=minimal. With it you can cut latency and get fast executions. Then have the model include 2–3 bullet points summarizing its reasoning before giving the final answer (quick API sketch below).
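Here's roughly what that looks like with the OpenAI Python SDK's Responses API (a minimal sketch; the task and the summary instruction are placeholders):

from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-5",
    reasoning={"effort": "minimal"},  # the new low-latency setting
    instructions=(
        "Before the final answer, include 2-3 bullet points summarizing "
        "your reasoning."
    ),
    input="Summarize the tradeoffs between REST and gRPC.",
)
print(response.output_text)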
- For more complex tasks, define clear criteria: set goals, method, stop rules, uncertainty handling, depth limits, and an action-first loop (hierarchy matters here). For example, you can add all of this inside <context_gathering> tags in your prompt, then pass the block via the API as sketched after it:
<context_gathering>
Goal: Gather enough context fast. Parallelize searches and stop once action is possible.
Method:
- Start broad → branch to focused subqueries.
- Run varied queries in parallel; read top hits; deduplicate/cache; avoid repeats.
- If needed, run one targeted batch only.
Stop when:
- Exact change content is known, or
- ~70% of top hits agree.
Escalate once if results conflict or scope is fuzzy.
Depth: Trace only relevant symbols/contracts; skip unnecessary expansion.
Loop: Search → minimal plan → act. Search again only if validation fails or new unknowns appear.
</context_gathering>
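To wire this up via the API, the block just gets prepended to your developer instructions. A sketch (the task prompt is made up, and I've abridged the block):

from openai import OpenAI

client = OpenAI()

CONTEXT_GATHERING = """<context_gathering>
Goal: Gather enough context fast. Parallelize searches and stop once action is possible.
Stop when: exact change content is known, or ~70% of top hits agree.
Loop: Search -> minimal plan -> act.
</context_gathering>"""  # abridged; paste the full block from above

response = client.responses.create(
    model="gpt-5",
    instructions=CONTEXT_GATHERING,
    input="Find where the retry limit is configured and raise it to 5.",
)
print(response.output_text)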
- For complex tasks, increase reasoning effort: use reasoning_effort=high with persistence rules to keep the model solving until it's done. This increases the model's eagerness (proactivity) in tackling the task at hand. You can add the rules in tags like this (API sketch after the block):
<persistence>
- You are an agent
- Keep going until the user's query is completely resolved, before ending your turn and yielding back to the user.
- Only terminate your turn when you are sure that the problem is solved.
- Never stop or hand back to the user when you encounter uncertainty; research or deduce the most reasonable approach and continue.
- Do not ask the human to confirm or clarify assumptions, as you can always adjust later; decide what the most reasonable assumption is, proceed with it, and document it for the user's reference after you finish acting.
</persistence>
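Same wiring for the high-effort setup, again just a sketch (the task is a placeholder; paste the full persistence block in practice):

from openai import OpenAI

client = OpenAI()

PERSISTENCE = """<persistence>
- You are an agent: keep going until the user's query is completely resolved.
- Never stop at uncertainty; make the most reasonable assumption, proceed, and document it afterwards.
</persistence>"""  # abridged; paste the full block from above

response = client.responses.create(
    model="gpt-5",
    reasoning={"effort": "high"},  # slower, but more thorough and proactive
    instructions=PERSISTENCE,
    input="Track down the flaky test in tests/test_sync.py and fix it.",
)
print(response.output_text)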
- Add an escape hatch: explicitly tell the model how to act when uncertain instead of stalling. Example (I'd also mirror the tool-call cap client-side, sketched after the block):
<context_gathering>
- Search depth: very low
- Bias strongly towards providing a correct answer as quickly as possible, **even if it might not be fully correct.**
- Usually, this means an absolute maximum of 2 tool calls.
- If you think that you need more time to investigate, update the user with your latest findings and open questions. You can proceed if the user confirms.
</context_gathering>
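Because the model can still blow past that cap, I'd enforce the 2-call budget on the client side too. A sketch of that guard using the Responses API with function calling; the search_repo tool and the execute_tool dispatcher are made up for illustration:

from openai import OpenAI

client = OpenAI()
MAX_TOOL_CALLS = 2

ESCAPE_HATCH = """<context_gathering>
- Search depth: very low
- Absolute maximum of 2 tool calls; if more seem needed, report your latest findings and open questions instead.
</context_gathering>"""  # abridged version of the block above

TOOLS = [{  # hypothetical search tool, for illustration
    "type": "function",
    "name": "search_repo",
    "description": "Search the codebase for a string.",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}]

def execute_tool(call):  # stub dispatcher; wire up your real tools here
    return "3 matches, e.g. config/limits.py:12"

input_items = [{"role": "user", "content": "Where is the rate limit set?"}]
tool_calls_used = 0

while True:
    response = client.responses.create(
        model="gpt-5",
        reasoning={"effort": "low"},
        instructions=ESCAPE_HATCH,
        tools=TOOLS,
        input=input_items,
    )
    calls = [item for item in response.output if item.type == "function_call"]
    if not calls or tool_calls_used >= MAX_TOOL_CALLS:
        print(response.output_text)  # final answer, or latest findings
        break
    input_items += response.output  # keep the model's turn in context
    for call in calls:
        tool_calls_used += 1
        input_items.append({
            "type": "function_call_output",
            "call_id": call.call_id,
            "output": execute_tool(call),
        })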
- Control tool preambles: GPT-5 now outputs tool preambles, where it explains how and why it chose to call a certain tool. You can manage those preambles by giving clear instructions in your prompt on how tools should be called (API sketch after the block):
<tool_preambles>
- Always begin by rephrasing the user's goal in a friendly, clear, and concise manner, before calling any tools.
- Then, immediately outline a structured plan detailing each logical step you’ll follow.
- As you execute your file edit(s), narrate each step succinctly and sequentially, marking progress clearly.
- Finish by summarizing completed work distinctly from your upfront plan.
</tool_preambles>
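And here's how the preamble instructions plus a tool definition fit into a call (sketch; the read_file tool is hypothetical and the preamble block is abridged):

from openai import OpenAI

client = OpenAI()

TOOL_PREAMBLES = """<tool_preambles>
- Rephrase the user's goal, outline a structured plan, then narrate each step as you execute.
</tool_preambles>"""  # abridged; paste the full block from above

response = client.responses.create(
    model="gpt-5",
    instructions=TOOL_PREAMBLES,
    tools=[{
        "type": "function",
        "name": "read_file",  # hypothetical tool, for illustration
        "description": "Read a file from the repo by path.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    }],
    input="Update the README to mention the new --dry-run flag.",
)

# The preamble arrives as assistant message text in response.output,
# ahead of any function_call items.
for item in response.output:
    print(item.type)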
Anything else you've tried that worked for your use case?