Streaming completion in Python


Does anyone have a working code snippet for how to make streaming work in Python? All the discussion I’ve seen is about doing this in JavaScript.

Basically, I want the counterpart of the following where stream=True:
r = openai.Completion.create(
    prompt=prompt,
    stop=["I:", "O:"],
)

I tried using sseclient and urllib3, but can’t get this to work.



I’m also looking for the answer to this question.
Could you share how they do it in JavaScript?

Yeah, just pass stream=True and handle the response with a generator.

Could you give an example, please, or point me to a tutorial? That would be helpful.


Here’s a very quick example that streams tokens and prints out each token as it comes in:

for resp in openai.Completion.create(model='code-davinci-002', prompt='def hello():', max_tokens=512, stream=True):
    print(resp["choices"][0]["text"], end="", flush=True)


So, this part works for the content. But what about the token cost? Is it sent via a server-sent event too? Is there any way to obtain it? Thanks

I know this is old (the earlier replies cover the first question, not returning token cost), but I came across this post while trying to figure it out myself with the ChatCompletion API. I got it working today; the example is at GitHub - trackzero/openai: Experiments with the OpenAI API. You'll need your own API keys.

That particular example authenticates using AWS Secrets Manager, but you can just delete the get_secret function and pull the key from an environment variable instead with openai.api_key = os.getenv("OPENAI_API_KEY").
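For reference, a minimal sketch of handling the ChatCompletion streaming deltas (this assumes the legacy 0.x openai-python client, where each streamed chunk carries a "delta" with an optional "content" field; collect_stream is a hypothetical helper name, not from the linked repo):

```python
def collect_stream(chunks):
    """Join the text deltas from ChatCompletion streaming chunks.

    Each chunk is expected to look like
    {"choices": [{"delta": {"content": "..."}}]}; chunks whose delta
    has no "content" key (e.g. the initial role-only delta) are skipped.
    """
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)

# Against the real API it would be used roughly like this (needs a valid key):
# stream = openai.ChatCompletion.create(
#     model="gpt-3.5-turbo",
#     messages=[{"role": "user", "content": "Say hello"}],
#     stream=True,
# )
# text = collect_stream(stream)
```

Accumulating the parts in a list and joining once at the end avoids repeated string concatenation while still letting you print each delta as it arrives.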

I’ll see about adding token cost on the exit function in the next day or two.


How can I count token usage while stream=True?


You have to estimate it with OpenAI’s tokenizer, tiktoken.

I have added an estimator to my demo repo, openai/ at main · trackzero/openai · GitHub.


Thanks, but this calculates the prompt tokens, not just the completion tokens.

Are there any plans to support getting token usage when using streaming? :confused: