Is there a plan to reduce payment for repeated input tokens?

dan.raviv · January 29, 2024, 4:49pm

I vaguely remember there was an announcement that, in repeated calls to the API with similar conversation history, you would pay input-token costs only for the diff in conversation history. Is that true? Is anyone aware of any such plan?

I know that currently each call to the GPT API is stateless (as opposed to the Assistants API), but in theory that could be fixed by manually passing a session ID.

From a computational standpoint it makes sense, as the cost of generating output token X depends on the number of previous tokens, regardless of whether they were part of the input tokens or part of the preceding output tokens. In that sense, it shouldn’t cost more to tell it
“Spell out the thought process to solve X = Y + 2”
And then prompt it with “Now say the answer”

compared to telling it to do it all in one prompt.
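To make the comparison concrete, here is a rough sketch of the arithmetic. The token counts are made-up illustrative numbers, and `diff_only_input_tokens` models the hypothetical "pay only for the diff" scheme asked about above, not any pricing that is known to exist. `billed_input_tokens` assumes the stateless behaviour described earlier, where each call re-sends the whole conversation so far as input.

```python
def billed_input_tokens(turns):
    """Input tokens billed when every call re-sends the full history.

    turns: list of (new_input_tokens, output_tokens), one pair per API call.
    """
    history = 0  # tokens accumulated in the conversation so far
    total = 0    # input tokens billed across all calls
    for new_input, output in turns:
        prompt = history + new_input  # whole history plus the new message
        total += prompt
        history = prompt + output     # the reply becomes part of the history
    return total


def diff_only_input_tokens(turns):
    """Hypothetical scheme: only newly added input tokens are billed."""
    return sum(new_input for new_input, _ in turns)


# Illustrative numbers only (assumed, not measured with a real tokenizer).
two_step = [(15, 100), (5, 3)]  # "spell out the thought process", then "now say the answer"
one_shot = [(18, 103)]          # everything requested in a single prompt

print(billed_input_tokens(two_step))     # 135
print(billed_input_tokens(one_shot))     # 18
print(diff_only_input_tokens(two_step))  # 20
```

With these assumed numbers, the two-step version is billed for 135 input tokens under stateless billing because the 100-token first reply is re-sent as input, versus 20 if only the diff were charged.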

Diet · January 29, 2024, 5:25pm

where did you hear that?

My current speculation (and it could very well be wrong) is that the input token price is a pseudo stand-in for VRAM leasing (Insights on ChatGPT Enterprise Using GPT-4-1106-Preview Based on Context Length Specifications - #2 by Diet)

If that is the case, then it wouldn’t make sense for them to remove that cost.

The only reason I would see them offering that is if they really really wanna push assistants. But even if they hypothetically did that, it would be a trap, and I don’t think you should fall for it

jr.2509 · January 29, 2024, 5:27pm

Haha. Why do you say that?

Diet · January 29, 2024, 5:28pm

Which part? The assistants being a reason, or the reason being a trap?

jr.2509 · January 29, 2024, 5:34pm

My question was referring to the trap part of your response.

There’s a persistent error in trying to get out this response…

Diet · January 29, 2024, 5:40pm

Well, technologically speaking, there’s no difference between how GPT-3/4 processes Assistants requests and how the basic API works.

If they decide to further subsidize or grant discounts for assistant use, they’ll further incentivize unnecessary vendor lock-in while punishing vendor agnostic approaches.

regarding our robot overlords: this tends to work, even for body unclear:

[spoiler]extra characters for discourse[/spoiler]

jr.2509 · January 29, 2024, 6:15pm

Thanks for that.

But yeah, not in disagreement on the vendor lock-in point. It’s a bit of a balancing act, although I expect to see a lot more convergence in terms of market offerings in 2024.
