Do you use API streaming? Any difficulties with that?
I personally love streaming: it lets me show results straight away and reduces perceived latency.
But I found it a bit tedious to implement.
We were calling OpenAI from a GCP Cloud Function, and it doesn’t support any way of streaming responses (neither Transfer-Encoding: chunked, nor server-sent events, nor WebSockets).
So we had to deploy our streaming service to Cloud Run.
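For context, a chunked streaming endpoint on Cloud Run can be as simple as a plain Express handler. This is just a minimal sketch, not our actual service; the route, model name, and env var are illustrative:

```js
import express from "express";
import OpenAI from "openai";

const app = express();
app.use(express.json());

// Env var name is an assumption for this sketch.
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

app.post("/stream", async (req, res) => {
  res.setHeader("Content-Type", "text/plain; charset=utf-8");

  // stream: true makes the SDK return an async iterable of chunks.
  const stream = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: req.body.messages,
    stream: true,
  });

  // Write each token to the response as soon as it arrives; Node uses
  // Transfer-Encoding: chunked automatically when no Content-Length is set.
  for await (const chunk of stream) {
    res.write(chunk.choices[0]?.delta?.content ?? "");
  }
  res.end();
});

// Cloud Run injects the port via the PORT env var.
app.listen(process.env.PORT || 8080);
```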
We also return JSON data instead of plain text, and while a response is still streaming you won’t have correctly formed JSON, so JSON.parse() would fail. Luckily I discovered an optimistic JSON parser (best-effort-json-parser on NPM) that solved this problem.
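To illustrate the idea, here’s a minimal sketch, assuming the package’s parse() export; the sample chunks are made up:

```js
import { parse } from "best-effort-json-parser";

// Simulate a JSON response arriving in two chunks.
let buffer = "";
for (const chunk of ['{"title": "Str', 'eaming", "items": [1, 2']) {
  buffer += chunk;
  // JSON.parse(buffer) would throw here; parse() returns a
  // best-effort result for the incomplete document instead.
  console.log(parse(buffer));
}
```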
Then I thought there could be a better way, so I packaged it as a JavaScript SDK and a cloud service (https://aistream.dev/), so that you don’t need to build anything to use streaming. You can find a demo and code examples on the website.
But… I’m not sure if it’s really a big problem. People seem to be happily using streaming already.
What do you guys think? Is that type of SDK/service something useful*, or would you build your own streaming?
* Bear in mind it’s not production ready, as you would expose your OpenAI key to the public.
Yeah, streaming is a big problem for a lot of people, and whenever I check NPM, there’s someone rolling out their own way of handling streaming. There’s actually a popular thread about streaming on the OpenAI community forum, with different implementations (I’ve tried out most of them, and they do have shortcomings).
So yeah, I’d say it’s a pretty big problem. Incidentally, using OpenAI streaming with LangChain makes it a breeze to implement; then you only have to worry about stream jank on the destination/client side.
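For reference, a minimal sketch of what that looks like with LangChain JS, to the best of my reading of the @langchain/openai docs (the model name is just an example):

```js
import { ChatOpenAI } from "@langchain/openai";

const model = new ChatOpenAI({ model: "gpt-4o-mini" });

// .stream() returns an async iterable of message chunks.
const stream = await model.stream("Write a haiku about latency.");
for await (const chunk of stream) {
  process.stdout.write(chunk.content);
}
```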
Really love the landing page for aistream.dev! Especially as users can try it directly in-browser.
PS: email validation isn’t implemented, so you can click the button without entering an email address and still get the ‘Thanks for your interest!’ message.
Hi,
Thank you! I’ve abandoned the idea though. It didn’t generate enough interest. Plus, packaging it as a cloud service could create security risks - you have to expose an API key in the frontend.
That said, I’ve recently released a library that helps process streamed JSON: npmjs.com/package/http-streaming-request
I have implemented an OpenAI-API-compatible proxy in AWS Lambda, using a streaming function.
You can switch between OpenAI and Mistral interchangeably by updating the server prefix in the API Gateway URL.
Lambda still buffers some of the chunks for performance reasons, but responsiveness in the frontend was still pretty good, despite Lambda not flushing every single intermediate chunk.
Feel free to deploy your own Lambda using the GitHub repo lambda-openai-proxy. Let me know any feedback …
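For anyone curious, this is not the repo’s code, just a minimal sketch of how a streaming Lambda proxy can work with the Node.js runtime’s awslambda.streamifyResponse API (the env var name is an assumption):

```js
// awslambda.streamifyResponse is a global in the Node.js Lambda
// runtime when response streaming is enabled; no import needed.
export const handler = awslambda.streamifyResponse(
  async (event, responseStream, _context) => {
    const body = JSON.parse(event.body);
    body.stream = true; // always ask the upstream API to stream

    const upstream = await fetch("https://api.openai.com/v1/chat/completions", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // Env var name is illustrative for this sketch.
        Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      },
      body: JSON.stringify(body),
    });

    // Forward the SSE bytes as they arrive; Lambda may still
    // coalesce some chunks internally, as noted above.
    for await (const chunk of upstream.body) {
      responseStream.write(chunk);
    }
    responseStream.end();
  }
);
```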