Conversation context and quadratic billing

Hi @nir.01,

as @paul.armstrong mentioned this is the case.

There are some strategies you could deploy to help you on this, for example: OpenAI API: chat completion pruning methods this is a great way to reduce tokens. Or to limit the resubmitted messages to the last 5 ones in your request.

1 Like