At the moment my tool sends an AJAX request to a PHP file, which forwards it to https://api.openai.com/v1/chat/completions and returns a JSON message that is rendered via JavaScript.
The problem is that the output is rendered all at once rather than token by token.
If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only server-sent events as they become available, with the stream terminated by a data: [DONE] message. See the OpenAI Cookbook for example code.
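For reference, here is a minimal sketch of consuming that stream with the Python openai client (v1.x); the model name and prompt are placeholders, and the same idea applies server-side in PHP by setting "stream": true and flushing each SSE event through to the browser:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# stream=True makes the API return partial message deltas as
# server-sent events instead of one complete response
stream = client.chat.completions.create(
    model="gpt-3.5-turbo",  # placeholder model
    messages=[{"role": "user", "content": "Write one sentence about streams."}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta is not None:
        print(delta, end="", flush=True)  # render each token delta as it arrives
print()
```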
Did you paste in the entire response and then send it?
As a Discourse admin on another site: Akismet will flag posts by users with low Discourse trust levels that were not entered by hand. The reason for flagging such posts is that this is a key indicator of spam bots, as bots are much faster than humans at creating posts. If it has been more than 24 hours since posting, and if the admins and moderators here were doing what they need to, then your post should have been reviewed and a response sent back already.
I was once on another well-known site where all the admins and moderators had been inactive for months, so I turned my use of the site into a social experiment. I hope the same does not happen here.
If you want an estimate, you can count the number of SSE delta chunks you receive over the network. That only works reliably when the output is plain ASCII English; with Unicode or emoji, a single chunk may carry multiple tokens so that you receive a complete character, so chunk counting underestimates costs.
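A rough sketch of that estimate, using the same Python client as above (model and prompt are again placeholders):

```python
from openai import OpenAI

client = OpenAI()

stream = client.chat.completions.create(
    model="gpt-3.5-turbo",  # placeholder
    messages=[{"role": "user", "content": "Say something."}],
    stream=True,
)

chunk_count = 0
collected = []
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta is not None:
        chunk_count += 1          # roughly one token per chunk for ASCII English
        collected.append(delta)   # keep the deltas for exact counting later

print(f"~{chunk_count} completion tokens (underestimate for non-ASCII output)")
```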
The correct way is to reassemble the streamed language into a single string and use the correct token encoder, tiktoken, to count the tokens in the response.
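For example, continuing the sketch above, where `collected` holds the streamed deltas:

```python
import tiktoken

full_text = "".join(collected)  # reassemble the streamed response

# encoding_for_model returns the encoder matching that model's tokenizer
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
print(len(enc.encode(full_text)), "tokens in the completion")
```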
The better way: if OpenAI can send you a finish-reason chunk, they could just as easily send you usage, but they don't.