What is this new Streaming parameter?

I’m just seeing this new parameter and have no idea what it is.

It basically allows you to receive the tokens back in small batches as they are produced, so you can give the appearance of live generation, like with ChatGPT.

2 Likes

What does “appearance of generation” mean?

Streaming is the sending of words, one at a time, as they are created by the AI language model, so you can show them as they are being generated.

(Technical: a subscription to server-sent events pushed by the API.)
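
For illustration, a rough sketch of what that looks like with the openai Node SDK (v4 style); the client construction, model, and prompt here are placeholders, not anything from the thread:

  const stream = await openai.chat.completions.create({
    model: "gpt-3.5-turbo",
    messages: [{ role: "user", content: "Say hello" }],
    stream: true, // the parameter in question
  });

  for await (const chunk of stream) {
    // Each chunk carries a small slice of the reply; show it as it arrives.
    process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
  }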

1 Like

Oh, no shit, huh? Does that cost more, or is it just a fun thing you can add?

It is the same cost per token.

Instead of waiting 30 seconds for the complete answer, you start receiving almost immediately.

It is not “just fun”: imagine if you had to stare at ChatGPT for half a minute wondering what it was going to say.

4 Likes

  const response = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [
      {
        role: "system",
        content: message,
      },
    ],
    temperature: 1.1,
    max_tokens: 600,
    top_p: 1,
    frequency_penalty: 0.3,
    presence_penalty: 0.5,
  });

So this works fine, but when I add stream: true…

  const response = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [
      {
        role: "system",
        content: message,
      },
    ],
    stream: true,
    temperature: 1.1,
    max_tokens: 600,
    top_p: 1,
    frequency_penalty: 0.3,
    presence_penalty: 0.5,
  });

  res.json({
    message: response.choices[0].message.content.trim(),
    usage: response.usage,
  });
});

All of a sudden I get this error:

    message: response.choices[0].message.content.trim(),
                             ^

TypeError: Cannot read properties of undefined (reading '0')
    at C:\Users\tvent\OneDrive\Desktop\gpt\server\index.js:204:30  
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)


Yes, you’re going to get chunk objects, a format documented in the API reference and dumped out for you just now:


{
  "id": "chatcmpl-82dsfjjaofldfa3mOIDJFOIJ",
  "object": "chat.completion.chunk",
  "created": 1695524999,
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "delta": {
        "content": " for"
      },
      "finish_reason": null
    }
  ]
}

You keep receiving them until a chunk arrives with finish_reason set to stop or length.
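
A rough sketch of consuming that (assuming the v4 Node SDK, where the create() call with stream: true returns an async iterable, called stream here):

  let fullText = "";
  for await (const chunk of stream) {
    const choice = chunk.choices[0];
    if (choice.finish_reason) break;          // "stop" or "length" ends the stream
    fullText += choice.delta?.content ?? "";  // build up the complete reply piece by piece
  }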

Right, I’m saying that when I adjust my code, I don’t know how to get it to work.

  const response = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [
      {
        role: "system",
        content: message,
      },
    ],
    stream: true,
    temperature: 1.1,
    max_tokens: 600,
    top_p: 1,
    frequency_penalty: 0.3,
    presence_penalty: 0.5,
  });

  for await (const chunk of response) {
    console.log(chunk.choices[0].delta.content); // This correctly streams it in the terminal 
  }

  res.json({
    message: response.choices[0].delta.content.trim(), // does not stream to the front end
    usage: response.usage,
  });
});

Basically, how do I get the res.json to stream? I’m sending the message variable to the front end; that part works and is all set up. So how do I make it work inside my res.json instead of in that for await loop?

const response = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [
      {
        role: "system",
        content: message,
      },
    ],
    temperature: 1.1,
    max_tokens: 600,
    top_p: 1,
    frequency_penalty: 0.3,
    presence_penalty: 0.5,
  });

So this works fine, but when I add stream: true…

  const response = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [
      {
        role: "system",
        content: message,
      },
    ],
    stream: true, // adding stream
    temperature: 1.1,
    max_tokens: 600,
    top_p: 1,
    frequency_penalty: 0.3,
    presence_penalty: 0.5,
  });

  for await (const chunk of response) {
    console.log(chunk.choices[0].delta.content); // this code from the doc runs
  }

  res.json({
    message: response.choices[0].delta.content.trim(), // this is causing the script to fail
    usage: response.usage, // What happened to the usage object?
  });
});

The console.log will run, but it doesn’t go to the res.json.
All of a sudden I get this error:

    message: response.choices[0].delta.content.trim(),
                             ^

TypeError: Cannot read properties of undefined (reading '0')
    at C:\Users\tvent\OneDrive\Desktop\gpt\server\index.js:208:30  
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)

I’m trying to make sure that the streamed content gets attached to the message property in my res.json.

Any help is appreciated.

Really bad backend programmer here :slight_smile:

Also, what happens to the usage object if you choose to do streaming?

1 Like

The “all of a sudden” is likely because you are not handling the finish_reason case.
Or because you get a function_call, which just appears at the object root.

End of stream:


{
  "id": "chatcmpl-jdfj933jaf03kar03raf",
  "object": "chat.completion.chunk",
  "created": 1111111111,
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "delta": {
        "content": "?"
      },
      "finish_reason": null
    }
  ]
}
{
  "id": "chatcmpl-82ojfoaFsD7",
  "object": "chat.completion.chunk",
  "created": 112222222,
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "delta": {},
      "finish_reason": "stop"
    }
  ]
}
1 Like

So I should add stop to the end of the function?

It works in the console.log; it is streaming and logging it.

  for await (const chunk of response) {
    console.log(chunk.choices[0].delta.content); // this code from the doc runs
  }

It’s not working in the res.json:

  res.json({
    message: response.choices[0].delta.content.trim(), // this is causing the script to fail
    usage: response.usage, // What happened to the usage object?
  });
1 Like

I provided the second-to-last and the last objects received in a stream.

See that part in your code where it says .delta.content?

See any “content” in the last chunk?

You need to short-circuit out by detecting the finish_reason first.
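
Roughly, the loop needs a guard along these lines (a sketch using your variable names, not a drop-in fix):

  for await (const chunk of response) {
    const { delta, finish_reason } = chunk.choices[0];
    if (finish_reason) break;          // the final chunk has an empty delta: no content to read
    console.log(delta.content ?? "");  // guard so a content-less delta does not throw
  }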

You’re truly just not being clear.

It IS WORKING in the log, in my terminal.

It’s not passing to the front end; it’s the same chunk.


app.post("/j", async (req, res) => { ... ....
...

  const response = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [
      {
        role: "system",
        content: message,
      },
    ],
    stream: true, // adding stream
    temperature: 1.1,
    max_tokens: 600,
    top_p: 1,
    frequency_penalty: 0.3,
    presence_penalty: 0.5,
  });

  for await (const chunk of response) {
    console.log(chunk.choices[0].delta.content); // this code from the doc runs
  }

  res.json({
    message: response.choices[0].delta.content.trim(), // this is causing the script to fail
    usage: response.usage, // What happened to the usage object?
  });
});

The difference is this for await loop, but how do I use that within my res.json?

I don’t know the purpose of res.json in this context. The tools/environment you’re using are not my forte.

Typically, what you’ll need to do is have two different mechanisms going.

If it is not a finish_reason or a massive error:

  • display the content of the chunk
  • append the content of the chunk to a variable that captures the whole message

Then do message stuff, like chat history, with the completed message.
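
One hedged sketch of that two-mechanism idea in Express; the route name, req.body.message, and plain chunked text as the transport are assumptions, not your exact setup, and the front end has to read the response body incrementally (for example with fetch and response.body.getReader()) instead of awaiting one JSON object:

  app.post("/j", async (req, res) => {
    const stream = await openai.chat.completions.create({
      model: "gpt-4",
      messages: [{ role: "system", content: req.body.message }], // assumption: message arrives in the request body
      stream: true,
    });

    res.setHeader("Content-Type", "text/plain; charset=utf-8");

    let fullMessage = "";
    for await (const chunk of stream) {
      const choice = chunk.choices[0];
      if (choice.finish_reason) break;          // end of stream
      const piece = choice.delta?.content ?? "";
      fullMessage += piece;                     // mechanism 1: capture the whole message
      res.write(piece);                         // mechanism 2: push it to the client right away
    }

    res.end(); // streaming responses carry no usage object, so count tokens yourself if needed
  });

Once the loop finishes, fullMessage holds the complete reply for chat history, logging, or token counting.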

Here’s a little python chatbot at least.

What happens to the usage object if you convert to streaming?

You have to count tokens on your own with streaming, which is the trade-off…

How do I count my own tokens without being told what my tokenage is? -_-

If you search the forum, you can find out more on tiktoken.

Also, if you are embedding the inputs and outputs, which you should do anyway to maintain good context within the bot, a nice side effect of Ada-002 is that it reports the token counts.
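
For the counting itself, a minimal sketch assuming the tiktoken package on npm (the WASM port of OpenAI’s tokenizer), ES modules, and an accumulated reply in a fullMessage variable like the one above:

  import { encoding_for_model } from "tiktoken";

  // Streaming responses include no usage object, so measure the text yourself.
  const enc = encoding_for_model("gpt-3.5-turbo");
  const completionTokens = enc.encode(fullMessage).length;
  enc.free(); // the WASM encoder must be freed explicitly

  console.log(`completion tokens: ${completionTokens}`);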

1 Like