Important feature requests for GPT-3 API (pre-signed requests & keys)

At Tefeno we have developed several applications that use GPT-3, and we keep running into two common struggles:

  1. Cannot create pre-signed requests: we use a serverless architecture that does not support streaming, so if a GPT-3 response is long, the client also waits a long time, because the entire response has to be generated before anything is sent back. We also cannot expose the prompt details or our API secrets to the client, and we do not want our services to act as gateways to OpenAI.

Solution: generate a pre-signed request by providing the prompt options and the OpenAI key; our services would return this URL, and the client or browser could send the request directly to OpenAI. OpenAI would quickly decrypt the request using its private key and handle it normally. This way the client can read the stream directly from OpenAI, receiving the generated text one token at a time, which reduces loading time and improves the user experience.
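A minimal sketch of how such signing could work, assuming an HMAC scheme with a shared per-account signing secret. Everything here is hypothetical (the `SIGNING_SECRET`, the `presign_request`/`verify_presigned` helpers, and the query layout are all my own illustration, not a real OpenAI API):

```python
import hashlib
import hmac
import json
import time
from urllib.parse import urlencode

# Hypothetical per-account secret that OpenAI would issue for signing URLs.
SIGNING_SECRET = b"example-signing-secret"

def presign_request(prompt_options: dict, expires_in: int = 300) -> str:
    """Build a hypothetical pre-signed URL the browser could call directly."""
    payload = {
        "expires": str(int(time.time()) + expires_in),
        "options": json.dumps(prompt_options, sort_keys=True),
    }
    # Sign a canonical (sorted) query string so the server can verify it untouched.
    canonical = urlencode(sorted(payload.items()))
    signature = hmac.new(SIGNING_SECRET, canonical.encode(), hashlib.sha256).hexdigest()
    return f"https://api.openai.com/v1/completions?{canonical}&sig={signature}"

def verify_presigned(url: str) -> bool:
    """Server-side check: recompute the HMAC and reject expired or tampered links."""
    query = url.split("?", 1)[1]
    pairs = query.split("&")
    # "sig" was appended last; everything before it is the signed canonical string.
    canonical = "&".join(p for p in pairs if not p.startswith("sig="))
    sig = next(p.split("=", 1)[1] for p in pairs if p.startswith("sig="))
    expected = hmac.new(SIGNING_SECRET, canonical.encode(), hashlib.sha256).hexdigest()
    expires = int(dict(p.split("=", 1) for p in pairs)["expires"])
    return hmac.compare_digest(sig, expected) and expires > time.time()
```

The key property is that the secret never reaches the browser: the client only holds a short-lived, tamper-evident URL.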

  2. Usage is not tracked per API key, and we cannot tell which key is which: since we have many apps using different keys, we have to record usage ourselves in a database, even though OpenAI could do this automatically. Furthermore, when we want to delete keys (development keys, for example), we cannot tell them apart and have to search the codebase to make sure we delete the right ones.
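For reference, this is roughly the bookkeeping we end up doing ourselves today, sketched here with SQLite; the table layout and the `record_usage`/`usage_report` helpers are only illustrative:

```python
import sqlite3

# Label each key and accumulate token usage per key ourselves,
# since the API does not report usage broken down by key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE key_usage (key_label TEXT PRIMARY KEY, total_tokens INTEGER)")

def record_usage(key_label: str, tokens_used: int) -> None:
    """Add one request's token count to the running total for that key."""
    conn.execute(
        "INSERT INTO key_usage VALUES (?, ?) "
        "ON CONFLICT(key_label) DO UPDATE SET total_tokens = total_tokens + ?",
        (key_label, tokens_used, tokens_used),
    )

def usage_report() -> dict:
    """Return {key_label: total_tokens} across all keys."""
    return dict(conn.execute("SELECT key_label, total_tokens FROM key_usage"))
```

Having labels attached to keys on OpenAI's side would make both the usage breakdown and the "which key is safe to delete?" question trivial.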

Hope this helps!


I couldn’t agree more! Many developers (including me) are dealing with the same issues: managing multiple API keys and streaming responses. It would be great to see this fixed :slight_smile:

These are really important features. The ability to create pre-signed requests would really help us with streaming without touching most of our existing architecture stack.

For integrating with OpenAI APIs in a serverless architecture this is a must; it would be awesome to see it happen.

To overcome this problem, I created this, which lets you use pre-signed URLs and redirect them to APIs like OpenAI’s.

With this I was able to stream OpenAI’s response directly in the client, without leaking API keys or anything like that.
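For anyone curious what consuming that stream looks like on the client: the completions endpoint streams server-sent events, i.e. `data: {json}` blocks terminated by a `data: [DONE]` marker. A small, illustrative parser (the `parse_sse_chunks` helper and the simulated stream are my own, not part of any library):

```python
import json

def parse_sse_chunks(raw_stream):
    """Yield completion text pieces from an SSE byte stream like OpenAI's.

    `raw_stream` is any iterable of bytes chunks (e.g. an HTTP response body).
    """
    buffer = b""
    for chunk in raw_stream:
        buffer += chunk
        # Events are separated by a blank line; process each complete one.
        while b"\n\n" in buffer:
            event, buffer = buffer.split(b"\n\n", 1)
            for line in event.splitlines():
                if line.startswith(b"data: "):
                    data = line[len(b"data: "):]
                    if data == b"[DONE]":
                        return  # end of stream
                    yield json.loads(data)["choices"][0]["text"]
```

Rendering each yielded piece as it arrives is what gives the "text appears word by word" experience instead of one long wait.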

Hope it can be useful for anyone building an LLM app with a serverless architecture.