I am using the Assistants API with a fairly long prompt and several tool calls. I noticed that the streaming response contains that long prompt several times within the event data.
I see it, for example, in the thread.run.created, thread.run.queued, thread.run.in_progress, and thread.run.requires_action events.
When I look at the responses in the OpenAI Playground, the prompt shows up there as well, which makes me think this is not an issue with my backend proxy.
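For reference, this is roughly how I am consuming the stream (Python SDK; the thread and assistant IDs are placeholders, and the size check is just how I measured the repeated payload):

```python
from openai import OpenAI

client = OpenAI()

# Placeholder IDs for my actual thread and assistant.
stream = client.beta.threads.runs.create(
    thread_id="thread_abc123",
    assistant_id="asst_abc123",
    stream=True,
)

for event in stream:
    payload = event.data.model_dump_json()
    # Each thread.run.* event appears to carry the full run object,
    # including the instructions, so the prompt is repeated every time.
    print(f"{event.event}: {len(payload)} bytes")
```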
Question:
- is it really intended that the API sends tens of KB of prompt data as part of the streamed response?
- is there a way to reduce the amount of data sent? (If not, I would fall back to filtering in my proxy, roughly as sketched below.)
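For context, the proxy-side workaround I have in mind is to swallow the run lifecycle events and only forward the message deltas to the client. This is only a sketch, assuming the client just needs the text deltas and that `send_to_client` is whatever function pushes data downstream:

```python
# Hypothetical proxy-side filter: forward only message events,
# drop the thread.run.* events that repeat the instructions.
EVENTS_TO_FORWARD = {"thread.message.delta", "thread.message.completed"}

def forward_filtered(stream, send_to_client):
    for event in stream:
        if event.event in EVENTS_TO_FORWARD:
            send_to_client(event.data.model_dump_json())
        # thread.run.created / queued / in_progress / requires_action
        # are handled server-side only and never sent downstream.
```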