Disproportionate amount of Input Tokens (471k for 89 API requests)

I’m running a very simple C# method (in Unity) calling the API to create tags based on two strings, a name and a path, which together would constitute about 10-15 input tokens per item max. After about 30 calls, the API has registered about 471k input tokens, which seems entirely disproportionate. That averages out to ~5.3k per request.

A system message is sent once during initialization, with instructions for how to handle inputs and outputs.

I would have expected the entire routine to total a maximum of 75,000 input tokens, spread over probably half an hour of processing time. Even the 7k of output tokens seems a bit excessive, considering it only processed 89 items in total and returns at most about 50 tokens per item, usually less — I would have expected about 4.5-5k. But at least that is within 20%; the ~471k of input tokens makes absolutely no sense.
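These expectations can be sanity-checked with quick arithmetic. A rough sketch (in Python for brevity) using the numbers from the thread — 89 requests, ~400 instruction tokens, ~15 input and ~50 output tokens per item — and assuming each request is billed independently, with no history resent:

```python
# Back-of-envelope check of the expected totals, assuming each request
# is billed independently (no conversation history resent).
# All per-item numbers are upper bounds taken from the thread.
REQUESTS = 89   # API requests registered for the day
SYSTEM = 400    # instruction tokens, if counted on every request
ITEM = 15       # input tokens per item
REPLY = 50      # output tokens per item

expected_input = REQUESTS * (SYSTEM + ITEM)
expected_output = REQUESTS * REPLY

print(expected_input)   # 36935 — nowhere near the observed ~471k
print(expected_output)  # 4450 — the same ballpark as the observed ~7k
```

Under these assumptions the output tokens are roughly in line with what was observed, while the input tokens are off by more than an order of magnitude — which is what makes the 471k figure so suspicious.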

private void InitializeGPT()
{
    ApiKeyManager.LoadOpenAIKey(out string openApiKey);
    OpenAIAPI api = new(openApiKey);
    chat = api.Chat.CreateConversation();
    chat.Model = Model.GPT4_Turbo;
    Debug.Log($"<color=#FEC90B><b><size=14>GPT Icon Tags initialized</size></b></color>");
}

private async void CreateTags()
{
    int count = 0;

    foreach ((string iconName, string iconPath) in spriteList)
    {
        chat.AppendUserInput($"{{name: {iconName}}}, {{path: {iconPath}}}");
        try
        {
            string response = await chat.GetResponseFromChatbotAsync();
            (string iconLabel, List<string> tags) = response.ParseIconTags();
            iconDatabase.AddTagsWithLabel(iconLabel, tags);
            count++;
            Debug.Log($"Tags processed | Item: <color=#FEC90B><b><size=14>{iconLabel}</size></b></color>");
        }
        catch (Exception ex)
        {
            Debug.LogError($"Error processing tags for {iconName}: {ex.Message}");
        }
    }

    Debug.Log($"<size=18><color=#40FE0B><b>All items processed: {count}</b></color></size>");
}

The instructions are about 400 tokens, sent as a single system message. Since I’ve run the method about 10 times in early testing, I would expect that system message to consume about 4,000 input tokens in total.

The request count for the day shows 89 requests, which does make sense. But the number of input tokens makes zero sense.

Any thoughts?

It’s probably a good idea to log all requests to the API before they’re sent. That makes debugging considerably easier.
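As a minimal sketch of that habit (in Python for brevity; `send_chat_request` is a hypothetical helper, and the payload shape is the standard chat-completions message format), the idea is simply to log the exact payload at the point it leaves the process:

```python
import json
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("openai-requests")

def send_chat_request(messages: list[dict]) -> dict:
    # Log the exact payload before it leaves the process, so any
    # unexpected growth of the message list is visible immediately.
    payload = {"model": "gpt-4-turbo", "messages": messages}
    log.info("Outgoing request (%d messages): %s",
             len(messages), json.dumps(payload)[:2000])
    # ... the actual HTTP POST to the API would go here ...
    return payload
```

If the message count in the log keeps climbing from request to request, the wrapper is resending history; if it stays constant, the problem is elsewhere.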

What does the request that is actually going out to the API look like?

With the instructions out of the way, a typical input will look like this:

{{name: bone}}, {{path: Health/Hospital/bone}}

The folder depth in the path is a maximum of four levels, usually less. The ‘name’ is sometimes two words instead of one.

I’m just saying: after working 1.5 years with this stuff in particular, I’ve made it a habit to log the request directly at the REST call. There’s so much that can go wrong anywhere in the code.

Anecdotal stupidity: I once sent 10,000 items out to be embedded, only to be surprised by how related they all seemed. Until I realized I had forgotten to include the actual content with the prompt.

It could of course be a fault on OpenAI’s side. In recent memory, there was the batch-processing double-billing incident. But for all of OpenAI’s faults, this stuff is rarely their fault.


I’m looking at the API wrapper now to see whether it’s appending the user input and sending the whole conversation history with each API request. Even then, I’d only expect that to account for about 50k input tokens.
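If the wrapper really does resend the full history, the input tokens grow quadratically with the number of requests rather than linearly. A rough sketch (in Python, assuming the thread's numbers — a ~400-token system message, ~15 input and ~50 output tokens per item — and one continuous conversation across all 89 requests):

```python
# Cumulative input tokens when a chat wrapper resends the full
# conversation history with every request.
# Assumed numbers from the thread: ~400-token system message,
# ~15 input tokens per item, ~50 output tokens per reply.
SYSTEM = 400
USER = 15
REPLY = 50

def cumulative_input_tokens(n_requests: int) -> int:
    total = 0
    for k in range(1, n_requests + 1):
        # Request k carries the system message, the k-th user message,
        # and all (k - 1) earlier user/assistant exchanges.
        total += SYSTEM + USER + (k - 1) * (USER + REPLY)
    return total

print(cumulative_input_tokens(89))  # → 291475
```

Under these assumptions, 89 requests in a single growing conversation would bill roughly 291k input tokens — the same order of magnitude as the observed 471k, and far more than a linear estimate suggests. Restarting the conversation (or trimming history) between items keeps the total linear instead.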


If there’s any possibility of a leak, it would also be a good idea to rotate your API keys, as it takes a while for that to take effect.


Yes, it’s possible the issue is on my end. I’m going to find a way to log the outgoing call. The wrapper itself isn’t logging requests, though it does register the error messages being returned.
