Introducing Predicted Outputs

nikunj · November 5, 2024, 5:18am

@_j - the intended use case for this for tasks related to rewriting code or documents with minor changes. e.g. “refactor this code to change the variable name from x to y” or “rewrite this blogpost while only changing the name of the product from a to b”. In these cases, you pass the original draft as the prediction and then see inference speed up any time the model output and the predicted tokens match.

You shouldn’t expect this to help you with tasks where you don’t have a good sense of a long response before the model produces the response (which is what your prompt above about a story related to cute kittens is attempting to do).

Topic		Replies	Views
When OpenAI predicted outputed input content is large, the effect is average? API gpt-4	1	117	December 16, 2024
Using predicted outputs for proofreading Feedback gpt-4o , predicted-outputs	1	215	January 22, 2025
Hypothetical Token-increase Strategy . Community gpt-4 , chatgpt	21	253	March 17, 2025
Feature Request: Token Adaptive Model API chatgpt , api	25	2069	August 8, 2023
Do 'MAX tokens' include the follow up prompts and completion in a single chat session API token	22	5273	August 25, 2023

Introducing Predicted Outputs

Related topics