ApiGateway - Stream Buffering Token Response Header

andrew.butson · April 21, 2023, 1:49pm

We are using OpenAI behind our enterprise API gateway. Some of our consumers are interested in using SSE (streaming responses). However, we face a challenge when trying to capture token usage for each request: our gateway has to buffer the response in order to introspect the response body, and that buffering disables streaming to the consumer.

To address this issue, I would like to suggest a solution to the OpenAI product team.
We propose adding the following fields as response headers:

model
prompt_tokens
completion_tokens
total_tokens

Adding these headers to the response would provide the necessary data without requiring us to introspect the body. Furthermore, this approach is simple and non-intrusive: because headers arrive at the start of the response, proxies and gateways can read the data without buffering the stream or interrupting service to our consumers.
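
To illustrate how a gateway could consume such headers, here is a minimal Python sketch. It is only an assumption of how this might look: the header names (`x-model`, `x-prompt-tokens`, `x-completion-tokens`, `x-total-tokens`) and the helpers `read_usage` and `proxy_stream` are hypothetical, since the proposal lists the fields but not the header naming, and no such headers are confirmed to exist today.

```python
from typing import Callable, Dict, Iterable, Iterator, Mapping, Optional, Union

UsageValue = Optional[Union[str, int]]

# Hypothetical header names, chosen only for this sketch.
USAGE_HEADERS = {
    "model": "x-model",
    "prompt_tokens": "x-prompt-tokens",
    "completion_tokens": "x-completion-tokens",
    "total_tokens": "x-total-tokens",
}


def read_usage(headers: Mapping[str, str]) -> Dict[str, UsageValue]:
    """Extract usage fields from response headers; missing or invalid values become None."""
    lowered = {name.lower(): value for name, value in headers.items()}
    usage: Dict[str, UsageValue] = {}
    for field, header in USAGE_HEADERS.items():
        value = lowered.get(header)
        if value is None:
            usage[field] = None
        elif field == "model":
            usage[field] = value
        else:
            # Token counts are expected to be non-negative integers.
            usage[field] = int(value) if value.strip().isdigit() else None
    return usage


def proxy_stream(
    headers: Mapping[str, str],
    chunks: Iterable[bytes],
    record: Callable[[Dict[str, UsageValue]], None],
) -> Iterator[bytes]:
    """Record usage from the headers, then forward body chunks unchanged and unbuffered."""
    record(read_usage(headers))
    for chunk in chunks:
        yield chunk
```

The point of the sketch is that `record` is called using the headers alone, before any body chunk is read, so the SSE chunks can be passed straight through to the consumer.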

andrew.butson · June 20, 2023, 12:53pm

Could someone please review this proposal? We have a pressing need for it within our organisation.
