For example, I send instructions (10k tokens) to the API and begin a thread. I add messages to the thread from time to time.
Will I pay for 10k tokens in instructions whenever I send a message in the thread, or will I pay for instructions only once and later only for messages in the thread?
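You pay for all tokens in and out of the API. That means the instructions count as input tokens every time they are processed with a new message, not just once.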
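To make the arithmetic concrete, here is a rough sketch of how input tokens could add up over a thread. It assumes that each new message triggers a run that re-sends the full instructions plus all earlier user messages. It ignores assistant replies, output tokens, and any truncation the API may apply, so treat the result as a rough estimate. The function name `billed_input_tokens` and the message sizes are made up for illustration.

```python
def billed_input_tokens(instruction_tokens, message_tokens):
    """Estimate cumulative input tokens over a thread.

    Assumption (not confirmed by the API docs): every run re-sends the
    instructions plus all user messages accumulated so far.
    """
    total = 0
    history = 0  # tokens of user messages already in the thread
    for tokens in message_tokens:
        history += tokens
        total += instruction_tokens + history
    return total


if __name__ == "__main__":
    # 10k-token instructions, three hypothetical messages of 100, 200, 300 tokens
    total = billed_input_tokens(10_000, [100, 200, 300])
    print(total)  # 31000, of which 30000 is the instructions counted three times
```

Under these assumptions the instructions dominate the bill for short messages, which is why shortening them or moving reference material into retrieval can matter more than trimming the messages themselves.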
Sad. I thought it would be a game changer.
It is for some people who couldn’t manage to build their own back-end (with or without RAG) to handle it. The prices will keep dropping, I’m sure. They have to pay for all the tokens processed, so they can’t really give it out for free. The context windows are expanding and the prices are dropping, though, which is good. Stay tuned and keep building while keeping the future in mind.
A lot of details around this don’t seem to be documented anywhere. We’re talking about it here.
As this topic has a selected solution, closing topic.