Chat completion token count

Is there any plan for token counts to work with chat completions when stream=true?
If that will never happen, how can I count them correctly in TypeScript instead?

Welcome to the developer forum!

Have you looked into using tiktoken?

You could use that to calculate the token count after the message has been received. There was also a comment that each SSE event carries a single token, but I did spot that not being the case when the GPT-3.5-16k model was in alpha: very quick replies sent multiple tokens per event.

Thank you for the welcome :smiley:

Yes, I’ve looked at it, and there should be a TypeScript version of it too.
Unfortunately, the SSE events often aren’t single tokens, especially where function names are concerned. Beyond that, it’s no small task to get the prompt token count exactly right.

I mean, I could try to implement it myself, but is there an official timeline for adding this missing feature to streaming? Or a reason why it shouldn’t work in stream mode?

Thank you for your help

The tokens used in streaming should be the same as when not streaming; there could be minor discrepancies of a token or two, but it should not be significantly different. Tiktoken is fairly trivial to implement. You can see a demo over at Tiktoken Web Interface cl100k_base, the server-side code for which is just:

from flask import Flask, request, jsonify, send_from_directory, abort
from flask_cors import CORS
from tiktoken import get_encoding

app = Flask(__name__, static_folder='static', static_url_path='')
CORS(app)  # Enable CORS
tokenizer = get_encoding("cl100k_base")

@app.route('/tokenize', methods=['POST'])
def tokenize():
    text = request.json['text']
    # Limit to ~10MB of text
    if len(text.encode('utf-8')) > 10000000:  # encode the text to bytes and check the length
        abort(413)  # HTTP status code 413: "Payload Too Large"

    tokenized_text = tokenizer.encode(text)
    tokenized_text = [{'token': token, 'text': tokenizer.decode([token])} for token in tokenized_text]
    return jsonify(tokenized_text=tokenized_text)

@app.route('/')
def home():
    return send_from_directory(app.static_folder, 'tokenizer.html')

@app.route('/tokenizer')
def tokenizer_page():
    return send_from_directory(app.static_folder, 'tokenizer.html')

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)  # Listen on all interfaces

OK, so to count the prompt tokens used by functions, do I need to JSON.stringify the function object and then pass the result to the tokenizer?