I’d like to propose a feature where users can choose the depth of processing and expected response time before the model generates an answer.
This could look like a toggle or dropdown:
Fast (quick response, higher risk of hallucinations)
Standard (balanced)
Deep (5–10 minutes, with internal checks or retrying)
Or the user could just type tags in their prompt like #deep5min
or #quickreply
.
Why?
- In fields like medicine, law, history, or military analysis, users are often willing to wait for a more careful answer.
- This would reduce hallucinations and increase user trust.
- Think of the model like a student at an oral exam. When rushed, it may bluff. But if allowed time — it thinks and checks.
I think this would be a powerful and user-friendly way to improve answer quality and user agency.