Switching models halfway through a chat, out of convenience?

💡 Feature Request: LLM Flex Switch – Seamless Model Swapping in Chat

Summary:
Allow users to toggle between GPT models (e.g., GPT-4o, GPT-4, GPT-3.5) within an ongoing conversation, without losing chat history or starting a new thread.


Why This Matters:
Currently, switching models requires starting a new conversation and losing context. This adds friction and limits experimentation. A seamless in-chat model toggle would unlock:

  • Efficiency: Quickly rerun tasks in GPT-3.5 for speed/cost
  • Power: Deep dive with GPT-4o when needed for complex logic, visuals, or file interpretation
  • Comparisons: A/B test creative output, code, or tone between models
  • Control: Users regain flexibility depending on task priority (speed, quality, or cost)

Suggested Implementation:

  • A dropdown toggle at the top of the chat, labeled:

    Model: GPT-4o ⏷
  • Optionally auto-suggest model switches for performance-sensitive tasks.
  • Messages could include an indicator of which model generated them.
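For what it's worth, the underlying Chat Completions API already accepts the model as a per-request parameter alongside the full message history, so the switch looks mostly like a UI affordance. A rough Python sketch of the idea (the prompts are made up, and it assumes the official `openai` SDK with an API key in the environment):

```python
# Rough sketch, not OpenAI's plan: the Chat Completions API already takes
# `model` per request, so an in-chat toggle could just resend the running
# history to a different model. Assumes the official `openai` Python SDK
# and an OPENAI_API_KEY in the environment; prompts are illustrative.
from openai import OpenAI

client = OpenAI()

history = [{"role": "user", "content": "Draft a short product blurb."}]

# Fast, cheap pass first.
fast = client.chat.completions.create(model="gpt-3.5-turbo", messages=history)
history.append({"role": "assistant", "content": fast.choices[0].message.content})

# User flips the dropdown to GPT-4o: same thread, no lost context.
history.append({"role": "user", "content": "Now tighten the logic and the tone."})
deep = client.chat.completions.create(model="gpt-4o", messages=history)
print(deep.choices[0].message.content)
```

The same pattern would cover rerunning a tagged prompt in a different model, as suggested below.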

Bonus Ideas:

  • Tag specific prompts to rerun in a different model on demand.
  • Visual badge (e.g., 💡 GPT-4o or ⚡ GPT-3.5) beside each response.
  • Hybrid Mode: Let users set a chat to “default to 3.5 unless advanced reasoning is detected.”
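To make the Hybrid Mode idea concrete, here is a purely illustrative routing heuristic; the keyword list and model names are assumptions, and a real implementation would presumably use a small classifier rather than string matching:

```python
# Purely illustrative take on "Hybrid Mode": a hypothetical keyword heuristic
# that escalates to GPT-4o only when a prompt hints at heavier reasoning.
REASONING_HINTS = ("prove", "debug", "derive", "step by step", "edge case")

def pick_model(prompt: str, default: str = "gpt-3.5-turbo") -> str:
    """Return the stronger model when the prompt looks reasoning-heavy."""
    lowered = prompt.lower()
    if any(hint in lowered for hint in REASONING_HINTS):
        return "gpt-4o"
    return default

assert pick_model("What's a synonym for 'fast'?") == "gpt-3.5-turbo"
assert pick_model("Debug this race condition step by step") == "gpt-4o"
```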

User Feedback (me):
As a power user juggling sales, technical documentation, AI consulting, and design, I often need both the speed of GPT-3.5 and the advanced capabilities of GPT-4o. A live switch would massively boost productivity and reduce repetitive context-setting.