How do pre-populated threads behave when used with runs

Hi everyone,

I’m trying to clearly understand how prepared threads work when used with Assistants runs.My use case is:

  • I create a thread in advance.

  • I store only my domain data there.

  • Later, I run an assistant against this thread multiple times, so I don’t have to resend all fields every time.

What I want to clarify:

  1. When a run is executed on an existing thread, is the assistant response always appended to that same thread as an assistant message?

  2. Are run metadata and run steps always persisted in the thread, even if I only care about the user messages?

  3. Is there any supported way to:

    • keep a thread as a “data memory” only, and

    • use it as read-only context for runs
      without the run writing additional messages/steps back into it?

  4. Or is the intended model that any thread used in a run becomes an execution log, and purity of thread content is not supported?

I’ve read the docs, but I want to confirm the intended design, not just what’s technically possible.

Thanks in advance for clarifying how this is supposed to be used in production systems.

In Assistants on the API:

  • you must place initial messages or a set of messages into a thread to run. That is the only place to land input to the AI model.
  • Then you run the thread against an assistant ID using its specified model.
  • The AI runs, and when done, appends the AI response in the same thread ID
  • you collect the response.

Thus, placing messages is expected, and you can have a sequence of messages (you initially weren’t allowed anything but “user”). The thread will be modified by the output, so it is not a surface to set-forget-use-retry.

1 Like