For regular chat (not assistant/threads) are we sending the entire context every time still?

With the new threads feature you only have to send the new message and the context is stored on the server which makes the request smaller. I’m not seeing this with the normal chat api though, it still looks like we have to send the entire context window everytime we add a message including any functions/assistant responses. Is this true?

Yea, I think threads are only used in the context of Assistants API, and unfortunately, so far it doesn’t seem like we’re getting any price benefit with that. There’s already a conversation here.