How to calculate the tokens when using function call

The api support function call from June-13th, but I didn’t find any document to describe the method to calculate function tokens.

It there anyone here to give me a hand?

6 Likes

If you want to count tokens used for prompts and responses you can use the OpenAI GitHub - openai/tiktoken: tiktoken is a fast BPE tokeniser for use with OpenAI's models. to count them, examples in the read . me and also in the OpenAI API docs.

1 Like

Thanks a lot.

I know the document that you mentioned. My question is how to count tokens with function call which is supported 2 days ago. It is not implemented in tiktoken.

if you send the string containing in your Functions definition to tiktoken it will count them and if you send the json reply from the model to tiktoken it will count them, I think that covers it?

Yes, I thought so. But the count is not the same with the usage which it returned from API. So I want to know the accurate method.

"Under the hood, functions are injected into the system message in a syntax the model has been trained on. This means functions count against the model’s context limit and are billed as input tokens. "

So we “jailbreak”, and here’s what the function looks like when handed to the bot:

namespace functions {
    type x = (_: {
        location: string,
        unit?: \"celsius\" | \"fahrenheit\",
    }) => any;
} // namespace functions

I named the weather example function “x”.

(I don’t want to completely give up how to exploit any bot on the planet though.)

3 Likes

It could be that there are some boundary markers used, not 100% sure, is the tiktoken count and the actual used count always some fixed number of tokens off?

1 Like

How did you do this “jailbreak”? I’d like to better understand what’s happening.

1 Like
  1. provide a system prompt where you can count its tokens,
  2. provide the function call that the bot doesn’t have reason to call so you can count its tokens after transformation and overhead,
  3. provide the convincing user role instruction that dumps out everything the robot has received as response.
  4. math the missing special tokens
2 Likes

Thanks a lot. I’ve try and compare with the return from API. Then I got below.


    public static int tokens(String modelName, Object functionCall, List<Function> functions) {
        Encoding encoding = getEncoding(modelName);
        int sum = 0;
        if (Preconditions.isNotBlank(functionCall)) {
            if (functionCall instanceof JSONObject) {
                sum += tokens(encoding, functionCall.toString());
            }
        }
        for (Function function : functions) {
            sum += tokens(encoding, function.getName());
            sum += tokens(encoding, function.getDescription());
            if (Preconditions.isNotBlank(function.getParameters())) {
                JSONObject jsonObject = (JSONObject) function.getParameters();
                if (jsonObject.containsKey("properties")) {
                    for (String propertiesKey : jsonObject.getJSONObject("properties").keySet()) {
                        sum += tokens(encoding, propertiesKey);
                        JSONObject v = jsonObject.getJSONObject("properties").getJSONObject(propertiesKey);
                        for (String field : v.keySet()) {
                            if ("type".equals(field)) {
                                sum += 2;
                                sum += tokens(encoding, v.getString("type"));
                            } else if ("description".equals(field)) {
                                sum += 2;
                                sum += tokens(encoding, v.getString("description"));
                            } else if ("enum".equals(field)) {
                                sum -= 3;
                                for (Object o : v.getJSONArray(field)) {
                                    sum += 3;
                                    sum += tokens(encoding, o.toString());
                                }
                            } else {
                                log.warn("not supported field {}", field);
                            }
                        }
                    }
                }
                sum += 11;
            }
        }
        sum += 12;
        return sum;
    }

It is a function in file TikTokenUtils and it cover type/description/enum properties. And it is just matched with the usage returned from API.

Let me know if you have any better way.

2 Likes

Python function if anyone needs it:

    def num_tokens_from_functions(functions, model="gpt-3.5-turbo-0613"):
        """Return the number of tokens used by a list of functions."""
        try:
            encoding = tiktoken.encoding_for_model(model)
        except KeyError:
            print("Warning: model not found. Using cl100k_base encoding.")
            encoding = tiktoken.get_encoding("cl100k_base")
        
        num_tokens = 0
        for function in functions:
            function_tokens = len(encoding.encode(function['name']))
            function_tokens += len(encoding.encode(function['description']))
            
            if 'parameters' in function:
                parameters = function['parameters']
                if 'properties' in parameters:
                    for propertiesKey in parameters['properties']:
                        function_tokens += len(encoding.encode(propertiesKey))
                        v = parameters['properties'][propertiesKey]
                        for field in v:
                            if field == 'type':
                                function_tokens += 2
                                function_tokens += len(encoding.encode(v['type']))
                            elif field == 'description':
                                function_tokens += 2
                                function_tokens += len(encoding.encode(v['description']))
                            elif field == 'enum':
                                function_tokens -= 3
                                for o in v['enum']:
                                    function_tokens += 3
                                    function_tokens += len(encoding.encode(o))
                            else:
                                print(f"Warning: not supported field {field}")
                    function_tokens += 11

            num_tokens += function_tokens

        num_tokens += 12 
        return num_tokens
1 Like

What do the magic numbers 11 and 12 in your code represent?

1 Like

Do you know what do the magic numbers 11 and 12 in your code represent

As forestwanglin already wrote, this code is from TikTokenUtils. I just ported to python. It is uncommented there, but we can make some assumptions:
12 is added outside of the main function loop, so it’s probably the tokens for the functions frame:
functions": [ { "name": "", "description": ""
11 is added outside of the properties loop, so it’s probably the tokens for parameters frame:
"parameters": { "type": "object", "properties": {}, "required": [] }

5 Likes

Unluckily, I don’t know so far. I’ve said that I tried time by time until I got the save usage number from OpenAI API. I guess 12 is the wrapper token size of function call part, and 11 is the wrapper token size of parameter part.

I also want to get the official document to calculate the token of the usage of request containing functions.

Please remember that you need change the method to calculate tokens when previous messages which contains the one with function_call.

if (Preconditions.isNotBlank(msg.getFunctionCall())) {
    sum += 1;
    sum += tokens(encoding, msg.getFunctionCall().getName());
    if (Preconditions.isNotBlank(msg.getFunctionCall().getArguments())) {
        sum += tokens(encoding, msg.getFunctionCall().getArguments());
    }
}

You can see the code from line 201 on TikTokenUtils.

2 Likes

:upside_down_face: Thanks for your explanation. I got the magic number by trying time by time which is the dumbest method.

2 Likes

Based on the given example, I was able to find a format that gives a much more accurate token length than the examples above, without magic numbers:

namespace functions {

// Get the current weather in a given location
type get_current_weather = (_: {
    location: string, // The city and state, e.g. San Francisco, CA
    unit?: "celsius" | "fahrenheit",
}) => any;

} // namespace functions

In testing, this gives token length to within 5 tokens of the actual number calculated by OpenAI.

My full implementation:

1 Like

It’s actually something like this, appended to the first system message (or inserted if the first message is not a system message):

# Tools

## functions

namespace functions {

// Get the current weather in a given location
type get_current_weather = (_: {
// The city and state, e.g. San Francisco, CA
location: string,
unit?: "celsius" | "fahrenheit",
}) => any;

} // namespace functions

When counting total message tokens using cl100k_base encoding and subtracting “1”, it provides the correct count. I’m not sure why subtracting 1 is necessary. It seems that some delimiter token may be dropped internally when functions are used.
I have a Java code for converting from JSON schema into that format in my tokenizer library on Github: Function tokenizer. It’s interesting to see that this conversion process has a few nuances, and not everything from JSON Schema is retained, particularly when nesting multiple objects.

4 Likes
  1. So the token count is the amount of tokens from

# Tools

until

// namespace functions

Including all comments?

Which would be 79 tokens?

  1. Does the function always get embedded (and thus charged for), or only when it determines that it will use the function? (If the answer is that it’s not included unless used, then my next question is:)

  2. Can we define dozens of functions and we’ll only be charged for the functions that will get used if any?

The functions available must be always known to the AI, and the ones that you might need are passed every API call as an input. There is no decision-making other than your own when you write the API and include functions for that turn of AI.

If the function is not invoked, then you’ll get a user-friendly response back, while when the function is called by AI, you get function data back you must process and return to a second AI model so it can answer with the new information or results.

That second AI that handles the return from a function likely doesn’t need to have all the functions available, unless it is made smart enough to continue asking for more functions to explore retrieval of more satisfactory information with which to answer. You might even give the function-answering AI a different system prompt.