Is there a plan to reduce payment for repeated input tokens?

I vaguely remember an announcement that, in repeated API calls sharing a similar conversation history, you would pay input-token costs only for the diff in that history. Is that true? Is anyone aware of such a plan?

I know that currently each call to the GPT API is stateless (as opposed to the Assistants API), but that could theoretically be fixed by manually passing a session ID.
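To make the statelessness point concrete, here is a minimal sketch (a fake send function and a crude word-based token count, not the real SDK) of how a stateless chat loop re-sends the full history on every call, so billed input tokens grow with each turn:

```python
# Sketch of a stateless chat loop: every call re-sends the whole history,
# so billed input tokens grow with each turn. Token counting here is a
# crude stand-in (whitespace-separated words, not a real tokenizer).

def count_tokens(messages):
    # Approximation: one "token" per whitespace-separated word.
    return sum(len(m["content"].split()) for m in messages)

history = []

def send(user_text):
    history.append({"role": "user", "content": user_text})
    billed_input = count_tokens(history)  # the full history is billed every call
    # ... the real API call would go here; we fake a one-word reply ...
    history.append({"role": "assistant", "content": "ok"})
    return billed_input

first = send("Spell out the thought process to solve X = Y + 2")
second = send("Now say the answer")
print(first, second)  # the second call pays again for everything in the first
```

The second call's input-token bill includes the entire first exchange, which is exactly the repeated cost the question is about.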

From a computational standpoint it makes sense: the cost of generating output token X depends on the number of preceding tokens, regardless of whether they arrived as input tokens or were generated as earlier output tokens. In that sense, it shouldn't cost more to tell it
“Spell out the thought process to solve X = Y + 2”
And then prompt it with “Now say the answer”

compared to telling it to do it all in one prompt.
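A back-of-the-envelope sketch of that argument, assuming the server keeps its attention cache between the two calls (the token counts and the +5 for the follow-up prompt are made-up numbers for illustration):

```python
# Per-token attention work grows with the number of preceding tokens,
# whether those came in as input or were generated earlier. Illustrative
# only -- real serving costs involve far more than attention.

def generation_cost(prompt_len, output_len):
    # Sum of context sizes seen while generating each output token.
    return sum(prompt_len + i for i in range(output_len))

# One prompt producing the reasoning and the answer in a single turn:
single = generation_cost(prompt_len=50, output_len=120)

# Split into two turns, assuming the cached state survives between calls;
# "+5" stands in for the tokens of the follow-up "Now say the answer":
step1 = generation_cost(prompt_len=50, output_len=100)
step2 = generation_cost(prompt_len=50 + 100 + 5, output_len=20)
print(single, step1 + step2)  # nearly the same total compute either way
```

Under that caching assumption the two-call total differs from the single-call total by under one percent, even though today's billing counts the whole history as fresh input tokens on the second call.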

Where did you hear that?

My current speculation (and it could very well be wrong) is that the input-token price is a pseudo stand-in for VRAM leasing (Insights on ChatGPT Enterprise Using GPT-4-1106-Preview Based on Context Length Specifications - #2 by Diet).

If that is the case, then it wouldn’t make sense for them to remove that cost.
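As a rough illustration of why held context could map to a VRAM cost, here is a per-token KV-cache estimate. All the model dimensions below are assumptions for the sake of arithmetic; OpenAI has not published GPT-4's architecture:

```python
# Rough KV-cache sizing: each token kept in context stores a key and a
# value vector per layer. The layer/head/dtype numbers are guesses for
# illustration only -- GPT-4's real dimensions are not public.

def kv_cache_bytes(n_tokens, n_layers=96, n_heads=96, head_dim=128, dtype_bytes=2):
    # 2x for key + value, per layer, per head, per token (fp16 entries).
    return n_tokens * n_layers * 2 * n_heads * head_dim * dtype_bytes

gb = kv_cache_bytes(128_000) / 1e9
print(f"{gb:.0f} GB")  # a full 128k-token context would tie up serious VRAM
```

Whatever the real numbers are, holding a long context resident between calls occupies memory the whole time, which is consistent with input tokens being priced as a stand-in for that occupancy.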

The only reason I can see them offering that is if they really, really wanted to push Assistants. But even if they hypothetically did that, it would be a trap, and I don't think you should fall for it :thinking:

Haha. Why do you say that?

Which part? The assistants being a reason, or the reason being a trap? :thinking:


My question was referring to the trap part of your response.

There’s a persistent error when trying to get this response out…

Well, technologically speaking, there’s no difference between how GPT-3/4 processes Assistants requests and how the basic API works.

If they decide to further subsidize or grant discounts for Assistants use, they’ll further incentivize unnecessary vendor lock-in while punishing vendor-agnostic approaches.

Regarding our robot overlords: this tends to work, even for the “body is unclear” error:

[spoiler]extra characters for discourse[/spoiler]

Thanks for that.

But yeah, no disagreement on the vendor lock-in point. It’s a bit of a balancing act, although I expect to see a lot more convergence in market offerings in 2024.