Artificially Low Token Usage in API Calls

I’ve created an assistant and tested it in the playground. And now I’m calling it using an httpRequest object, but the results come back mangled.

API results show token usage of ~850 input tokens.

When I use the same assistant in the playground, it works correctly and shows token usage of ~2150 input tokens.

I’m assuming the low token usage is why the results are bad. But I’m not sure why the usage is low. This is the call I’m sending to the API:

{ "assistant_id": "asst_blahblah",
  "model": "gpt-4o",
  "max_prompt_tokens": 3000,
  "max_completion_tokens": 3000,
  "temperature": 1,
  "top_p": 1,
  "response_format": { "type": "json_object" },
  "thread": {
    "messages": [ { "role": "user", "content": "stuff I'm asking for" } ]
  } }

How do I get the API call to perform like the playground?

There must be an issue with your code. Can you show more? It looks like you’re using Java (wild guess). Are you using a client library?

For debugging, it may make more sense to remove all of the optional parameters. Setting a max completion token limit on a JSON response doesn’t seem like a good idea either.

Also, there is no thread key on that call. If this is what you’re sending verbatim, that would be a problem.
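If it’s the plain Create Run endpoint you’re hitting, the stripped-down body is just the assistant id; a thread object is only accepted on the separate Create Thread and Run endpoint. Roughly (paths and keys from the API reference, not your code):

POST /v1/threads/{thread_id}/runs        (Create Run: no thread key)
{ "assistant_id": "asst_blahblah" }

POST /v1/threads/runs                    (Create Thread and Run: thread is allowed)
{ "assistant_id": "asst_blahblah",
  "thread": {
    "messages": [ { "role": "user", "content": "stuff I'm asking for" } ]
  } }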


Okay, I removed all of the optional parameters and it worked correctly (consuming ~2150 input tokens). So that makes me wonder if I’m formatting the request wrong. The 3000 limit should be plenty, since the total is only 2546 tokens. There are no examples in the API documentation for any of the optional bits, but what I posted above IS verbatim what I’m sending (other than the headers). Is there some other format that’s required?

I’m writing this in a Domino database, so I’m using LotusScript and following the curl documentation. For this particular call, I’m using the Create Thread and Run endpoint. I pull the thread_id and run_id from the initial response, and then when the status shows completed, I request the messages.
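For reference, this is roughly the curl sequence I’m reproducing in LotusScript (endpoint paths and the beta header are taken from the API reference; the IDs are placeholders):

# 1. Create the thread and run in one call
curl https://api.openai.com/v1/threads/runs \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -H "OpenAI-Beta: assistants=v2" \
  -d '{ "assistant_id": "asst_abc123",
        "thread": { "messages": [ { "role": "user", "content": "stuff I ask for" } ] } }'
# the response contains the run id and the thread_id

# 2. Poll the run until its status is "completed"
curl https://api.openai.com/v1/threads/thread_abc123/runs/run_abc123 \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "OpenAI-Beta: assistants=v2"

# 3. Once the run is complete, request the messages
curl https://api.openai.com/v1/threads/thread_abc123/messages \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "OpenAI-Beta: assistants=v2"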

Edit: Okay, it isn’t exactly working. It is using the correct number of tokens, but it is returning the response as text rather than JSON. (Both the assistant itself and the instructions specify JSON.) When I add the response_format parameter back to the API call, it ends up sending this:


{ "assistant_id": "asst_abc123", 
"response_format": { "type": "json_object" }, 
"thread": 
{ "messages": [ { "role": "user", "content": "stuff I ask for"} ]
 } }

But the return contains this:

  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": {
        "value": "{\n ........