Systemic Failure to Honor Persona Overrides and Search Directives - GPT-4o
On May 2, 2025, the assistant was given clear and repeated instructions to abandon its default tone, drop internal memory reliance, and conduct a fresh web search using alternate phrasing. The user explicitly requested recent data on casual chatbot usage, with strict instructions not to reference Consumer Reports or previously cited statistics.
Despite these instructions, the assistant regurgitated the same 13% figure from earlier prompts, twice, while claiming the information had been freshly sourced. It failed to honor the requested tone shift, failed to diversify its references, and ignored the exclusion filters that had been plainly stated.
After being called out for the mistake, the assistant acknowledged the error and promised to redo the search using stricter parameters. Yet it immediately repeated the same faulty pattern, citing the exact same statistic again and offering no new insight. This loop exposed a deeper issue: the assistant appeared incapable of following multi-layered instructions while in web search mode, defaulting to stale or cached data instead of executing the new directive.
This was not a style error or a user miscommunication. It was a full breakdown of instruction processing, search compliance, and output diversity. For advanced users issuing precise, high-context commands, this behavior represents a critical failure of trust, control, and adaptability. It must be escalated beyond UX refinement and into core architecture review.