Feature Request: step-by-step completion

Imagine the following API call (I assume 1 token = 1 word for simplicity):

Prompt: this is a very long prompt.
Tokens: 4
Response: This is its completion

The same result could be achieved using 4 calls (sketched in code after the list):

  1. Prompt: this is a very long prompt.
    Tokens: 1
    Response: This

  2. Prompt: this is a very long prompt. This
    Tokens: 1
    Response: is

  3. Prompt: this is a very long prompt. This is
    Tokens: 1
    Response: its

  4. Prompt: this is a very long prompt. This is its
    Tokens: 1
    Response: completion
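
For concreteness, here is a minimal sketch of that loop. The complete() function is a hypothetical stand-in for whatever completion API is used, not a real endpoint:

    def complete(prompt: str, max_tokens: int) -> str:
        """Hypothetical stand-in for a completion API call; wire to a real client."""
        raise NotImplementedError

    def step_by_step_completion(prompt: str, steps: int) -> str:
        """Generate `steps` tokens one at a time by resending the growing prompt."""
        text = prompt
        for _ in range(steps):
            # Each call resends the full text accumulated so far; the returned
            # token is assumed to include its own leading whitespace.
            token = complete(prompt=text, max_tokens=1)
            text += token
        return text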

Both methods trigger exactly the same 4 inferences. However, the single call bills N + M tokens, while the M calls bill roughly (N + M/2) * M tokens, where N is the length of the prompt and M is the length of the completion: call i resends a prompt of N + i - 1 tokens and is billed for it all over again. The API calls certainly have some overhead, but for increasing M the second method becomes prohibitively expensive, inflating the price by up to 2-3 orders of magnitude.
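
A quick back-of-the-envelope check of the two formulas, with arbitrary but plausible values for N and M:

    # Billing comparison; 1 token = 1 word as above, numbers are arbitrary.
    N = 1000  # prompt length in tokens
    M = 1000  # completion length in tokens

    single_call = N + M                                 # 2000
    step_by_step = sum(N + i for i in range(1, M + 1))  # call i bills N + (i - 1) + 1 tokens

    print(step_by_step)                # 1500500, matching (N + M/2) * M closely
    print(step_by_step / single_call)  # ~750x, nearly 3 orders of magnitude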

Why would anyone do that? More output control for the user: instead of feeding back only the generated tokens, you can let the user edit some of them before they are fed back.
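
As an illustration, the earlier loop only needs a small change to expose each token to the user before committing it (complete() is the same hypothetical wrapper as above):

    def interactive_completion(prompt: str, steps: int) -> str:
        """Let the user accept or rewrite each generated token before it is fed back."""
        text = prompt
        for _ in range(steps):
            token = complete(prompt=text, max_tokens=1)
            # Empty input keeps the model's token; anything else replaces it.
            edited = input(f"model suggests {token!r}, Enter to accept or type a replacement: ")
            text += edited if edited else token
        return text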

It would be nice to have a way to take smaller generation steps without the cost exploding. It could be some prompt-preprocessing step, or simply a different pricing model that is closer to the actual resources used.