For example, say I send instructions (10k tokens) to the API and start a thread, then add messages to that thread from time to time.
Will I pay for the 10k instruction tokens every time I send a message in the thread, or will I pay for the instructions only once and afterwards only for the messages in the thread?
It is for people who couldn't manage to build their own back-end (with or without RAG) to handle this. They have to pay for all the tokens that get processed, so they can't really give it out for free. Context windows are expanding and prices are dropping, though, which is good, and I'm sure prices will keep dropping. Stay tuned and keep building while keeping the future in mind.
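To make the two billing scenarios from the question concrete, here is a rough back-of-the-envelope sketch. It is not the provider's actual billing logic: the function name `billed_input_tokens`, the message sizes, and the simplification that only instructions and new messages count (no re-sent thread history, no output tokens) are all assumptions for illustration.

```python
def billed_input_tokens(instruction_tokens, message_tokens, instructions_every_message):
    """Estimate total input tokens over a thread (hypothetical helper).

    instruction_tokens: size of the instructions, e.g. 10_000
    message_tokens: list with the token count of each message sent
    instructions_every_message: True if the instructions are counted
        again for every message, False if they are counted only once
    """
    sends = len(message_tokens)
    if instructions_every_message:
        instruction_cost = instruction_tokens * sends
    else:
        # Counted once, and only if at least one message was sent.
        instruction_cost = instruction_tokens if sends > 0 else 0
    return instruction_cost + sum(message_tokens)


# Made-up message sizes, for illustration only.
messages = [200, 150, 300]

every_time = billed_input_tokens(10_000, messages, instructions_every_message=True)
only_once = billed_input_tokens(10_000, messages, instructions_every_message=False)

print(every_time)  # 30650
print(only_once)   # 10650
```

With large instructions, the gap between the two scenarios grows linearly with the number of messages, which is why the question matters for long threads.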