I appreciated the recent proof-of-concept update showing that a GPT-5 Thinking variant can now disclose whether it followed instructions or took shortcuts.
This announcement resonated strongly with my own user-side observations.
Over the past weeks, I repeatedly encountered cases where:
- the model produced answers that were factually correct but relied on unverified assumptions,
- reasoning around task boundaries became blurred, leading to plausible but incorrect justifications, and
- “hallucination by explanation” occurred, with the explanation itself fabricated even when the final output appeared coherent.
From a user perspective, these issues were difficult to detect because the responses often looked correct on the surface. However, they posed a serious risk of misinterpretation, especially in long, multi-step conversations.
After the recent update, I confirmed that these patterns have significantly improved.
In particular, the model’s ability to admit uncertainty and avoid implicit inference has become noticeably more stable.
I’m sharing this note as supplementary evidence from the user side.
I believe continued work in the following areas will strengthen both safety and usability:
- clearer expression of uncertainty,
- transparent handling of capability boundaries, and
- shared understanding between model and user.
Thank you for addressing this issue so quickly.
This update was extremely important from the user standpoint.
— Independent Researcher
Chiemical (ちぇみかる)