Even the minimal reasoning effort results in greatly diminished quality for my use case. Does anyone know if reasoning tags are being hardcoded in, or is there any way to trick the model, via the system message, into not using them?
I would also like this option
You pass this parameter on the Responses API, or its standalone Chat Completions equivalent:
reasoning={
    "effort": "minimal",
    "summary": None,
},
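If it helps, here is a minimal runnable sketch using the official openai Python SDK (the model name and prompt are placeholders; assumes OPENAI_API_KEY is set in your environment):

# Minimal sketch: minimal reasoning effort on the Responses API.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

resp = client.responses.create(
    model="gpt-5",  # placeholder; any model that accepts the reasoning parameter
    input="Continue this paragraph in the same voice: ...",
    reasoning={
        "effort": "minimal",  # minimal | low | medium | high
        "summary": None,      # don't return a reasoning summary
    },
)

print(resp.output_text)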
The reasoning will then be just a few dozen tokens before output begins, sometimes as few as 8.
Diminished quality comes not from "reasoning tags"; it is the result of not allowing the model any pre-thought.
Diminished quality is because the preliminary and hypothetical internally-generated thoughts are what powers quality. Turn that off, and you get uninspected first-go output without an outline, where chat about the problem is also discouraged in favor of just answering.
You don't get state-of-the-art output when you have the AI just state the output.
I can "trick" the model at minimal, although it's not really a trick:
reasoning_tokens=128,
output_tokens=3711
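(Those counts come off the usage object of the response; a sketch of where to read them, assuming the current Responses API field names - verify against your SDK version:)

# Sketch: compare reasoning tokens vs. visible output tokens.
# Field names follow the Responses API usage object; check your SDK version.
usage = resp.usage
print(f"reasoning_tokens={usage.output_tokens_details.reasoning_tokens}")
print(f"output_tokens={usage.output_tokens}")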
And, for more fun, here is a dump of the normally "unseen" reasoning, which I asked the model to emit:
<<reasoning>>
I need to produce a mathematically grounded explanation for why the Mandelbrot set (the classic "Mandelbrot") is so visually and conceptually rich when you zoom in, compared with other fractal-generating algorithms such as Julia sets. The user asks "Mathematically, explain why Mandelbrot is one of the most interesting and seemingly infinite fractals in a viewer with zoom, vs the other algorithms like Julia? What are others with such complex features to be revealed by deep exploration?"

Goals:
- Explain key definitions: complex quadratic map f_c(z)=z^2+c, iteration, orbit, filled Julia set K_c, Julia set J_c, Mandelbrot set M.
- Explain the difference between parameter space (Mandelbrot set) and dynamical space (Julia sets). Show that Mandelbrot organizes the family of Julia sets; each parameter c corresponds to a Julia set.
- Explain connectedness locus: M = {c : K_c connected}. That property leads to boundary complexity. For parameters inside M, dynamics stable, outside unstable; on boundary, chaotic behavior.
- Explain why zooming into M reveals self-similarity, copies ("baby Mandelbrots"), quasi-self-similarity, and intricate filamentary structure due to bifurcations, hyperbolic components, Misiurewicz points, parabolic points, external rays, combinatorics (angles), and local connectivity conjecture (MLC).
- Contrast with Julia sets: for a fixed c, the Julia set may be connected (if c in M) or totally disconnected (Cantor dust) if c outside M; many Julia sets have limited variety across scale: if c outside M it's Cantor-like - less interesting to zoom (mostly dust). If c inside M and particularly in hyperbolic component, Julia set has structure but often self-similarity is simpler; still some Julia sets (for parameter on boundary) are extremely complex. But the Mandelbrot set as parameter space contains all behaviors aggregated, so exploring M lets you see many different dynamics in one place - that is why it's more rewarding in a zoom viewer.
- Discuss renormalization: Douady-Hubbard renormalization shows small copies of M (baby Mandelbrots) due to polynomial-like mappings and tuning; this gives infinite nested structure. Explain "tuning" operation: embedding of smaller copies; also satellite and primitive copies. This provides an explanation for "seemingly infinite" complexity.
- Explain statistic and combinatorics: external rays, landing, combinatorial rotation numbers for bulbs, kneading sequences; parameter plane encodes all possible quadratic dynamics combinatorially - leads to dense set of bifurcation points (Misiurewicz points) where periodic points are preperiodic, producing filaments, dendrites, etc.
- Explain measure-theoretic and dimensional properties: boundary of M has Hausdorff dimension 2; boundary is fractal and has positive area conjectures? (I must be careful: area(M) known? The area of M is ~1.506... but that's area of M itself; boundary has dimension 2 confirmed? I recall Shishikura proved boundary of Mandelbrot set has Hausdorff dimension 2. Yes. Mention this as reason for complexity.)
- Contrast with Julia set dimensions that vary. Also mention local connectivity (MLC) conjecture: If true, it implies combinatorial structure; but regardless, cusps, dendrites, tiny copies remain.
- Mention other maps that produce similar richly structured parameter spaces: Multibrot sets (z^d + c) - generalizations that have analogous baby copies and complexity; the exponential family (e^z + c) has parameter plane with fractal structure (the "exponential parameter plane" has "parameter rays" and bursting structures); higher-degree polynomials (two-parameter families) like cubic polynomials produce 2D parameter spaces of higher complexity; rational maps; Newton fractals (basins of attraction), Kleinian groups (limit sets) - e.g., Apollonian gasket, Schottky groups produce fractal limit sets with infinite complexity; parameter spaces like the tricorn and multicorns for anti-holomorphic maps (zbar^2 + c) showing different symmetry and features (the tricorn has real-analytic arcs, etc.). Also mention Siegel disks, Cremer points - complex local features revealed by zoom near boundary.
- Provide specific named examples: Multibrot (z^d + c), Julia sets for particular c (e.g., c = -0.8+0.156i has dendritic Julia), Douady's rabbit (period-3), Basilica (c=-0.123+0.745i?), check values: Basilica is c=-0.123+0.745i? Actually "Basilica" is z^2 -1 maybe? Hmm: Basilica is for c = -1? Wait: Need to be careful with numeric claims. The Basilica is the Julia set for z^2 -1, indeed it's connected and corresponds to the period-2 attracting cycle; Douady's rabbit corresponds to c approximately -0.122561 + 0.744862i (but giving approximate values might be okay but better avoid wrong specifics). Instead mention qualitative examples: Douady's rabbit, Siegel discs produce delicate structures.
- Explain renormalization theory briefly: Feigenbaum universality, complex renormalization leads to self-similarity and infinite nesting.
- Discuss computational aspects: zooming requires arbitrary precision because tiny features appear at exponential scales; also coloring algorithms, escape time, potential function (Green's function) for smooth coloring. Mention "external rays" and "equipotentials" for structure. Note that the viewer of parameter space allows jumps between parameter and dynamical planes (orbit portraits).
- Provide concluding synthesis: Mandelbrot as "index" of quadratic polynomials collects all dynamics, having dense bifurcation points, baby copies by renormalization, boundary of full Hausdorff dimension, so zooming uncovers endless new structures. Julia sets are equally rich but for a fixed c you either get connected complex sets or dust; the parameter plane (Mandelbrot) gives the "map of maps" so exploration yields more variety per zoom. Also list other families to explore.

Make sure to avoid exposing internal deliberation outside the reasoning block. Provide final user-facing text concise but thorough, mathematically accurate, with references to theorems: Douady-Hubbard (polynomial-like mappings, M is connected), Shishikura (boundary dimension 2), Misiurewicz points, renormalization (Feigenbaum, Douady-Hubbard renormalization), MLC mention. Keep computational caution.

Now draft final answer following required format.
<</reasoning>>
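To be clear, the <<reasoning>> markers above are purely a prompt convention, not an API feature: the instructions simply ask the model to write its working notes into the visible output. Here is a sketch of the kind of request that produced that dump (the wording is my own, not an exact reproduction):

# Sketch: ask the model to emit visible "reasoning" under minimal effort.
# The <<reasoning>> markers are an arbitrary prompt convention, not an API feature.
resp = client.responses.create(
    model="gpt-5",
    reasoning={"effort": "minimal", "summary": None},
    instructions=(
        "First write out your planning notes between <<reasoning>> and "
        "<</reasoning>> markers, then produce the final answer after the "
        "closing marker."
    ),
    input=(
        "Mathematically, explain why Mandelbrot is one of the most "
        "interesting and seemingly infinite fractals in a viewer with "
        "zoom, vs the other algorithms like Julia?"
    ),
)
print(resp.output_text)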
"Diminished quality is because the preliminary and hypothetical internally-generated thoughts are what powers quality. Turn that off, and you get uninspected first-go output without an outline"
perfect, exactly what I want
Reasoning is great for coding and difficult STEM tasks, sure. But for specific use cases involving natural-sounding writing (completion engines, article writing, dialogue, etc.), it simply breaks the flow and biases the model toward outputting generic trash.
I can get around this issue when using Google's Gemini 2.5 Pro by asking the model not to use thinking tags, which results in 0 reasoning tokens. The same trick doesn't work with GPT-5, unfortunately.
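For comparison, Gemini also exposes this as an explicit knob in the google-genai SDK; a sketch below (note that, as I understand it, 2.5 Flash accepts a zero thinking budget while 2.5 Pro enforces a minimum, which is why the prompt-level workaround matters on Pro):

# Sketch: turn thinking off via the google-genai SDK's thinking budget.
# thinking_budget=0 is accepted by gemini-2.5-flash; 2.5 Pro enforces a
# minimum budget, so there the prompt-level request is doing the work.
from google import genai
from google.genai import types

gclient = genai.Client()  # reads GEMINI_API_KEY from the environment
resp = gclient.models.generate_content(
    model="gemini-2.5-flash",
    contents="Continue this paragraph in the same voice: ...",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_budget=0),
    ),
)
print(resp.text)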
Also, the model doesn't always seem to respect the reasoning effort parameter. I get anywhere between 1,000 and 2,000 reasoning tokens for a simple text completion task after passing "minimal" reasoning effort.
When I tested "minimal" reasoning effort with gpt-5-nano, the output was so much worse than gpt-4.1-nano that it was crazy. I'll test again, though. OpenAI removing temperature, frequency penalty, and presence penalty is also a really big downgrade.
