Thanks for your responses about how Assistant instructions and all prior messages are submitted together with each new message.
I suppose I can see why this would be the most straightforward way to implement a first version of Assistants. Simply (I know, nothing's simple!) move the responses and instructions to persistent storage on OAI's servers, then submit it all just as if it had been sent up by the client.
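The mechanics, as I understand them, can be sketched roughly like this. To be clear, this is a toy stand-in of my own, not OAI's actual code; the `Thread` class and method names are invented for illustration:

```python
# Hypothetical sketch (NOT OpenAI's implementation): a server-side "thread"
# that persists every message, then replays instructions plus the whole
# history as input on each new run -- which is why input-token usage grows
# with thread length.

class Thread:
    """Minimal stand-in for an Assistants-style persistent thread."""

    def __init__(self, instructions: str):
        self.instructions = instructions
        self.messages: list[dict] = []  # stored server-side, not by the client

    def add_message(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})

    def build_run_input(self) -> list[dict]:
        # Each run re-submits the instructions plus the full stored history,
        # exactly as if the client had sent it all up itself.
        return [{"role": "system", "content": self.instructions}] + self.messages


thread = Thread("You are a helpful assistant.")
thread.add_message("user", "Summarize chapter 1.")
thread.add_message("assistant", "Chapter 1 covers ...")
thread.add_message("user", "Now chapter 2.")

payload = thread.build_run_input()
print(len(payload))  # 4: system instructions + three stored messages
```

Note that nothing in this model reduces input tokens: the payload only ever grows.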
At first I just assumed there would be a large benefit, some kind of persistent summarization or relief from input-token usage, and I went deep on implementing it as an evolution of my approach, which is quite Assistant-like but keeps state on my server. The tools are a differentiator, but I and others have found file retrieval erratic: the myfiles_browser tool seems to fail regularly, or at best works intermittently, via the API. Some report it works fine in the Playground.
Also, it's unclear what costs are incurred by file retrieval/RAG, since the API doesn't report token consumption for it.
So, net net: while it's great to have this first cut, and I really dig the way the Assistant architecture was conceived and implemented, for me it's wait and see. Hopefully the team at OAI will take it to the next step and deliver the cost savings and token-utilization optimizations that appear to be inherent in the Assistant model.
Ron