Artificially Low Token Usage in API Calls

zach10 · June 14, 2024, 7:25pm

I’ve created an assistant and tested it in the playground. And now I’m calling it using an httpRequest object, but the results come back mangled.

API results show token usage ~850 in.

When I use the same assistant in the playground, it works correctly and shows token usage ~2150 in.

I’m assuming the low token usage is why the results are bad. But I’m not sure why the usage is low. This is the call I’m sending to the API:

{
  "assistant_id": "asst_blahblah",
  "model": "gpt-4o",
  "max_prompt_tokens": 3000,
  "max_completion_tokens": 3000,
  "temperature": 1,
  "top_p": 1,
  "response_format": { "type": "json_object" },
  "thread": {
    "messages": [
      { "role": "user", "content": "stuff I’m asking for" }
    ]
  }
}

How do I get the API call to perform like the playground?

RonaldGRuckus · June 14, 2024, 7:53pm

There must be an issue with your code. Can you show more? It looks like you’re using Java (wild guess). Are you using a client library?

For debugging, it may make more sense to remove all optional parameters. Setting a maximum completion token count for a JSON response doesn’t seem like a good idea either, since a response cut off at the limit would not be valid JSON.

Also, there is no thread key, if this is what you’re sending verbatim.

zach10 · June 14, 2024, 8:10pm

Okay, I removed all of the optional parameters and it worked correctly (consuming 2150 in). So that makes me wonder if I’m formatting the request wrong. A limit of 3000 should be enough, as the total is only 2546. There are no examples in the API documentation for any of the optional bits. But what I posted above IS verbatim what I’m sending (other than the headers). Is there some other format that’s required?
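Since the bare request works and the full one does not, one way to narrow it down is to add the optional parameters back one at a time and compare the prompt token usage of each run. A minimal sketch in Python, assuming the parameter names and values from the request posted above; `one_at_a_time` is a hypothetical helper, not part of any library, and it only builds the request bodies without sending them:

```python
# Required part of the request, as posted earlier in the thread.
BASE = {
    "assistant_id": "asst_blahblah",
    "thread": {"messages": [{"role": "user", "content": "stuff I'm asking for"}]},
}

# The optional parameters that were removed for debugging.
OPTIONAL = {
    "model": "gpt-4o",
    "max_prompt_tokens": 3000,
    "max_completion_tokens": 3000,
    "temperature": 1,
    "top_p": 1,
    "response_format": {"type": "json_object"},
}


def one_at_a_time(base, optional):
    """Yield (name, payload) pairs, each adding exactly one optional key to base."""
    for name, value in optional.items():
        payload = dict(base)   # shallow copy so the base request is left untouched
        payload[name] = value
        yield name, payload


for name, payload in one_at_a_time(BASE, OPTIONAL):
    # Send each payload separately and note the prompt token usage it produces.
    print(name, sorted(payload))
```

Whichever variant drops the usage from about 2150 back to about 850 points at the parameter to look into.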

I’m writing this in a Domino database, so I’m using LotusScript and working from the curl examples in the documentation. For this particular call, I’m using the Create Thread and Run call. I pull the thread_id and run_id from the initial response, and then, when the status shows complete, I request the messages.
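For readers following along, the flow described here (create thread and run, poll the run, then list the messages) can be sketched in Python. This is only an illustration, not the LotusScript actually used. It assumes the v2 Assistants beta header and the three REST paths shown; `http_json` and `run_and_fetch` are hypothetical helper names, and the `request` argument exists so the HTTP layer can be swapped out.

```python
import json
import time
import urllib.request

API_BASE = "https://api.openai.com/v1"

# Statuses at which polling should stop; only "completed" counts as success.
STOP_STATUSES = {"completed", "failed", "cancelled", "expired",
                 "incomplete", "requires_action"}


def http_json(method, path, api_key, body=None):
    """Send one JSON request to the API and decode the JSON reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        API_BASE + path,
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "OpenAI-Beta": "assistants=v2",  # assumption: v2 of the beta
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def run_and_fetch(assistant_id, user_text, api_key,
                  request=http_json, poll_seconds=1.0):
    """Create a thread and run, wait for the run to stop, return (run, messages)."""
    run = request("POST", "/threads/runs", api_key, {
        "assistant_id": assistant_id,
        "thread": {"messages": [{"role": "user", "content": user_text}]},
    })
    thread_id, run_id = run["thread_id"], run["id"]

    # Poll the run until it reaches a status where nothing more will happen.
    while run["status"] not in STOP_STATUSES:
        time.sleep(poll_seconds)
        run = request("GET", f"/threads/{thread_id}/runs/{run_id}", api_key)

    if run["status"] != "completed":
        raise RuntimeError(f"run ended with status {run['status']}")

    messages = request("GET", f"/threads/{thread_id}/messages", api_key)
    return run, messages
```

The returned run object carries a `usage` field, so `run["usage"]["prompt_tokens"]` is the number to compare against the playground figure.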

Edit: Okay, it isn’t exactly working… It is using the correct number of tokens, but it is returning the response as text rather than JSON. (Both the assistant itself and the instructions specify JSON.) When I add the response_format parameter back to the API call, it ends up sending this:


{
  "assistant_id": "asst_abc123",
  "response_format": { "type": "json_object" },
  "thread": {
    "messages": [
      { "role": "user", "content": "stuff I ask for" }
    ]
  }
}

But the return contains this:

  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": {
        "value": "{\n ........
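The fragment above shows a `text` content part whose `value` is a string beginning with `{\n`, which suggests the JSON is delivered as a string inside `text.value` rather than as a nested object. If that is the case, the caller has to decode it a second time. A minimal sketch in Python, assuming the message has the shape shown and that the text parts together hold one complete JSON document; `parse_assistant_json` is a hypothetical helper name:

```python
import json


def parse_assistant_json(message):
    """Decode the JSON document carried as a string in a message's text parts."""
    # Collect the string payload of every "text" content part, in order.
    parts = [
        part["text"]["value"]
        for part in message["content"]
        if part["type"] == "text"
    ]
    # The payload is itself JSON, so it needs its own decoding step.
    return json.loads("".join(parts))


# Illustrative message; the field contents here are made up.
sample = {
    "role": "assistant",
    "content": [
        {"type": "text", "text": {"value": "{\n \"answer\": 42\n}"}}
    ],
}
result = parse_assistant_json(sample)
print(result["answer"])
```

`json.loads` raises `json.JSONDecodeError` if the string is not valid JSON, for example when the response was cut off by a completion token limit.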
