Lessons learnt from speedrunning ChatGPT Apps

I have been speedrunning for 10 days with a bunch of small ChatGPT Apps to experiment + see what's possible. So freaking fun and exciting :rocket:

They don’t really fit the picture of a full-on “app” in ChatGPT (more like POCs), but I wanted to share what I’ve done so you can see what’s possible, plus my lessons learnt from it all and what OpenAI could potentially improve in the Apps SDK.

Lessons Learnt

Since things are moving so quickly, maybe some of the following has been fixed already, feel free to lmk!

  • CSP does not support anything other than https. For example, if I wanted to permit wss://example.com/ws, the Apps SDK would rewrite it to https://wss://example.com/ws
  • Write tool calls are not supported on mobile apps
  • On mobile, the PIP window can disappear on a subsequent tool call that does not render a widget
  • Since all widgets + resources are loaded at connection time, you have to make sure you wait for the render data to finish loading up. It would be great in the future for the Apps SDK to support rendering widgets from the response of the tool call itself, saving us the round trip for the render data (more aligned with MCP-UI)
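To make the CSP and connection-time-loading lessons concrete, here is a rough sketch of the shapes involved. The `_meta` keys, the `text/html+skybridge` mime type, and the `openai/widgetCSP` fields are my reading of the Apps SDK docs, so treat them as assumptions and verify against the current reference:

```javascript
// Sketch of a widget resource + the tool that renders it (plain objects,
// not a full MCP server). Key names are assumptions from the Apps SDK
// docs — verify them before relying on this.

// The widget is a static HTML resource; ChatGPT loads and caches it at
// connection time, before any tool runs (hence the lesson above).
const widgetResource = {
  uri: "ui://widget/dashboard.html",
  mimeType: "text/html+skybridge",
  text: '<div id="root"></div>',
  _meta: {
    // CSP entries must be plain https origins; a wss:// origin here
    // currently gets mangled into https://wss://… (first lesson above)
    "openai/widgetCSP": {
      connect_domains: ["https://example.com"],
      resource_domains: ["https://example.com"],
    },
  },
};

// The tool descriptor points back at the cached widget via its _meta,
// which is why the widget must already be loaded when the tool fires.
const showDashboardTool = {
  name: "show_dashboard",
  _meta: { "openai/outputTemplate": "ui://widget/dashboard.html" },
};
```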

Apps

Tech Used + Advice

  • Cloudflare for MCP hosting
  • React Server Components for widget resources using RedwoodSDK - this was a huge DX win for me: no separate build step needed, and (once the Apps SDK supports it) I can render directly from the tool response instead of waiting for render data (the MCP-UI approach)
  • Optional - MCP-UI - Used MCP-UI to support other clients as well, but not required for ChatGPT Apps
  • Overall, don’t overthink it: any typical frontend setup + tools + frameworks works here for the Apps SDK

I haven’t felt so excited in a while. Apps in ChatGPT are really the future tbh. And the Apps SDK was really well thought out.

7 Likes

Thank you for sharing, it’s cool that you’ve built so many apps so fast!

Interesting, do you mind elaborating on what you meant by “at connection time” and “wait[ing] for the render data to finish loading up”? More specifically, if I’m understanding it right, which points in the component render phase after an MCP tool call do these correspond to? And why would saving the round trip for render data be a significant performance improvement?

On another unrelated note, how did you record those crisp demo videos in your Twitter/X posts? I like the part where you zoomed into specific areas of the screen. They showcased your ChatGPT apps really clearly and I’d like to learn from that.

So currently, how it works in the Apps SDK is the following:

  1. When a user connects to an App, before any tool is called, ChatGPT Apps SDK loads up all the static resources / widgets and caches them
  2. When a user makes a tool call that corresponds to a specific resource / widget, it will load up the cached resource
  3. When the resource is loaded, it needs to wait for the render data to arrive, which carries the tool output and any other metadata the tool sent

From my testing, the render data that provides the tool output takes a while to propagate from the Apps SDK to the resource. But performance aside (the hit isn’t huge), just being able to generate the resource at tool-output time would be a huge DX improvement
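Concretely, step 3 above looks something like this in widget code. I’m assuming the `window.openai.toolOutput` global and an `openai:set_globals` event based on my reading of the Apps SDK docs; the exact names may differ, so verify them:

```javascript
// Sketch of a widget waiting for render data before painting anything
// data-dependent. `host` stands in for `window`; the openai global and
// the event name are assumptions from the Apps SDK docs.
function waitForToolOutput(host) {
  return new Promise((resolve) => {
    // the tool output may already have been injected by the host
    if (host.openai && host.openai.toolOutput) {
      resolve(host.openai.toolOutput);
      return;
    }
    // otherwise wait for the host to push its globals to the widget
    host.addEventListener("openai:set_globals", () => {
      resolve(host.openai.toolOutput);
    });
  });
}

// usage inside the widget, before the first data-dependent render:
// const output = await waitForToolOutput(window);
```

This wait is exactly the round trip that rendering straight from the tool response would remove.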

Thank you! Been using Cap :stuck_out_tongue:

1 Like

Very cool. I’ll look into Redwood myself. Interesting.

I’m curious, does Redwood output the HTML + inline all the needed JS + CSS, so we don’t have to host different versions of those resources?

Since the document is heavily cached, you need to version any resources (which is good), but then you also need to keep a backlog of them so “older” versions continue working (especially during a rollout).

Definitely some things I need to fix to get a better DX for versioning + cache-busting, but right now, HTML + JS + CSS all works on my end with Redwood

Hadn’t thought too deeply about backlogging older versions, but that totally makes sense. Hmm, need to see how it could work
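One way the backlog could work, as a rough sketch (the registry and naming scheme here are made up for illustration, not an Apps SDK or MCP API): bake the build version into the resource URI and keep the last few builds registered, so widgets cached against an older URI keep resolving during a rollout.

```javascript
// Keep the newest N builds of a widget available under versioned URIs.
// `registry` is a stand-in for wherever your MCP server stores its
// resources; everything here is illustrative, not an SDK API.
const KEEP_VERSIONS = 3;

function registerBuild(registry, version, html) {
  registry.set(`ui://widget/dashboard-v${version}.html`, html);
  // a Map iterates in insertion order, so the first key is the oldest build
  while (registry.size > KEEP_VERSIONS) {
    registry.delete(registry.keys().next().value);
  }
}
```

Chats rendered against an old build keep working until it falls off the backlog; how many builds to keep is a product decision.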

1 Like

Yes, once we roll out our app, one thing we need to solve is how far back we support the “old” UIs.

Since widgets are small, isolated entry points, one possible solution is inlining everything in the original HTML that is cached by ChatGPT. That way, this becomes a non-problem.
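A minimal sketch of that inline approach (`buildWidgetHtml` is a made-up helper, not an SDK function): the build step emits one self-contained document, so the copy ChatGPT caches never references separately hosted JS/CSS that could be torn down later.

```javascript
// Bundle everything the widget needs into one HTML string so the
// cached document has no external JS/CSS to version or keep hosting.
// Assumes `js` and `css` are the already-bundled build outputs.
// (A real implementation would also escape any literal "</script>"
// appearing inside `js`.)
function buildWidgetHtml(js, css) {
  return [
    "<!doctype html>",
    `<style>${css}</style>`,
    '<div id="root"></div>',
    `<script type="module">${js}</script>`,
  ].join("\n");
}
```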

LMK what you find. Curious to learn more

3 Likes