When a ChatGPT-rendered MCP App renders more than one widget, plays a <video>, and the user toggles the host display mode (pip → inline or vice versa), audio drops out and the seek bar either freezes or resets.
Note: We tested this with our app, The Princeton Review, as well as other similar apps, and observed the same behavior. For example, the Udemy app exhibited the same symptoms.
What we’ve confirmed:
- Switching from PIP → inline triggers a new tool call and a new initialization of the MCP App widget.
Steps
- Open the first widget and start a video.
- Pause the video while in inline mode.
- Open a second widget and start a video.
- Navigate back to the first widget's video, hit play, then switch the video to PIP.
- Observe that the video continues playing, but the seek bar no longer tracks playback.
- If you pause, the play/pause buttons stop working and the video becomes unresponsive.
We also noticed that you can refresh the chat and reproduce the above scenario without having to launch the videos again.
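To help narrow down the "seek bar no longer tracks playback" symptom, here is a minimal diagnostic sketch. It checks whether the element the UI is bound to still reports an advancing `currentTime` while claiming to be playing. The names `VideoLike`, `Sample`, `sample`, and `isTimelineStalled` are hypothetical helpers for illustration, not part of any SDK, and the 500 ms threshold is an arbitrary assumption.

```typescript
// Minimal subset of HTMLVideoElement used here, so the sketch also runs outside a browser.
interface VideoLike {
  currentTime: number;
  paused: boolean;
}

// One observation of the element the seek bar reads from (hypothetical shape).
interface Sample {
  wallMs: number;    // wall-clock time when the sample was taken
  mediaTime: number; // video.currentTime at that moment
  paused: boolean;
}

// Take one sample; `now` is injectable so the function can be tested.
function sample(video: VideoLike, now: () => number = Date.now): Sample {
  return { wallMs: now(), mediaTime: video.currentTime, paused: video.paused };
}

// True when the element reports "playing" in both samples but currentTime
// did not advance over at least `minWallMs` of wall-clock time.
function isTimelineStalled(prev: Sample, next: Sample, minWallMs = 500): boolean {
  const elapsed = next.wallMs - prev.wallMs;
  if (elapsed < minWallMs) return false;         // too little time to judge
  if (prev.paused || next.paused) return false;  // paused video is not a stall
  return next.mediaTime <= prev.mediaTime;
}
```

In a browser, one could sample the `<video>` element that the seek bar is bound to on an interval and log whenever `isTimelineStalled` returns true. If audio is still audible while this reports a stall, that would suggest the UI is holding a different element than the one actually playing, though we have not confirmed this.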
Things we know:
- When two widgets are on screen and the display mode switches from pip → inline:
  - There is a new tool call to the server.
  - The widget appears to be reinitialized.
- Our theory is that these two behaviors are related to the issue.
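If the reinitialization theory holds, one possible mitigation on the app side is to snapshot playback state per widget before the mode switch and reapply it when the widget initializes again. The sketch below is only an illustration under that assumption. `PlaybackSnapshot`, `KeyValueStore`, `MemoryStore`, `saveSnapshot`, and `restoreSnapshot` are hypothetical names, and the store stands in for whatever persistence is actually available (for example, something shaped like `sessionStorage`).

```typescript
// Minimal subset of HTMLVideoElement, so the sketch runs outside a browser.
interface VideoLike {
  currentTime: number;
  paused: boolean;
  muted: boolean;
  volume: number;
}

// Hypothetical snapshot shape; not part of any SDK.
interface PlaybackSnapshot {
  currentTime: number;
  paused: boolean;
  muted: boolean;
  volume: number;
}

// Storage abstraction with the same getItem/setItem shape as sessionStorage.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// In-memory store used for illustration and testing.
class MemoryStore implements KeyValueStore {
  private data = new Map<string, string>();
  getItem(key: string): string | null {
    return this.data.get(key) ?? null;
  }
  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }
}

// Key snapshots by widget so two widgets on screen do not overwrite each other.
const keyFor = (widgetId: string): string => `playback:${widgetId}`;

function saveSnapshot(store: KeyValueStore, widgetId: string, video: VideoLike): void {
  const snap: PlaybackSnapshot = {
    currentTime: video.currentTime,
    paused: video.paused,
    muted: video.muted,
    volume: video.volume,
  };
  store.setItem(keyFor(widgetId), JSON.stringify(snap));
}

// Reapplies position and audio settings; returns the snapshot (or null if none)
// so the caller can decide whether to resume playback.
function restoreSnapshot(
  store: KeyValueStore,
  widgetId: string,
  video: VideoLike
): PlaybackSnapshot | null {
  const raw = store.getItem(keyFor(widgetId));
  if (raw === null) return null;
  const snap = JSON.parse(raw) as PlaybackSnapshot;
  video.currentTime = snap.currentTime;
  video.muted = snap.muted;
  video.volume = snap.volume;
  return snap;
}
```

This would not address the underlying host behavior (the extra tool call and reinitialization); it only limits the visible impact by restoring position and audio settings after the widget is recreated.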