I’m using the Assistants API for classifying Play Store reviews. I’ve given some instructions to the Assistant so that it classifies the reviews accordingly. When I run it in iterations — say there are 500 rows of reviews and I run them in 10 batches of 50 per batch — will the system prompt (the instructions) get counted as tokens (i.e., billed credits) for every batch?
Yes, your system prompt is included in the token count every time you run inference.
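A back-of-the-envelope sketch of what that repetition costs. All token counts below are hypothetical round numbers — measure your actual prompt with a tokenizer such as tiktoken:

```python
# Rough cost arithmetic: the instructions are billed as input tokens on every request.
# All counts are hypothetical placeholders, not measured values.
instruction_tokens = 400   # size of your classification instructions (hypothetical)
tokens_per_review = 60     # average review length (hypothetical)
reviews_per_batch = 50
batches = 10

per_batch_input = instruction_tokens + reviews_per_batch * tokens_per_review
total_input = batches * per_batch_input
instruction_overhead = batches * instruction_tokens

print(total_input)           # total input tokens across all 10 batches
print(instruction_overhead)  # portion that is just the repeated instructions
```

With these placeholder numbers, the instructions account for 4,000 of the 34,000 input tokens — noticeable, but small next to the reviews themselves.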
The Assistants API is meant more for continuing a chat session than for batch input/output operations, since a thread keeps the memory of a conversation. If you keep adding messages to the same thread, costs inflate even more than just repeating the instructions, because the entire growing conversation is resent with each run.
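The thread-growth effect can be sketched with the same kind of arithmetic — each run resends the instructions plus all prior messages, so cumulative input grows roughly quadratically. Token counts are again hypothetical:

```python
# Hedged sketch: each run on an Assistants thread resends the instructions plus
# the entire prior conversation. All token counts are hypothetical round numbers.
instruction_tokens = 400
message_tokens = 60   # each review sent as a user message (hypothetical)
reply_tokens = 20     # each classification reply (hypothetical)

thread_total = 0
history = 0
for turn in range(10):  # 10 reviews sent one by one on a single thread
    thread_total += instruction_tokens + history + message_tokens
    history += message_tokens + reply_tokens  # the thread keeps growing

# Compare with stateless requests that carry no history:
stateless_total = 10 * (instruction_tokens + message_tokens)
print(thread_total, stateless_total)
```

Even at 10 turns the thread costs nearly twice the stateless approach in input tokens, and the gap widens with every added message.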
If you are using it solely because it has a document extractor, be aware that you are at least doubling your costs. The entire conversation and system prompt are sent once for a model run that attempts to search a vector store built from document chunks; the conversation is then internally resubmitted with the retrieved chunks appended for another model run. This continues autonomously until the AI decides no further internal operations are needed and finally writes a response to you.
The Chat Completions API format is more efficient and performant, and it can also be genuinely batched with OpenAI’s Batch API — a multi-job endpoint with a 24-hour return window — for 50% savings.
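For the Batch API route, you prepare a JSONL file with one request per line and upload it. A minimal sketch of building those lines for review classification — the JSONL shape (`custom_id` / `method` / `url` / `body`) follows OpenAI’s Batch API docs, while the model name, instructions, and reviews here are placeholders:

```python
import json

# Sketch: build Batch API input lines for chat-completions classification.
# Instructions, model name, and reviews are illustrative placeholders.
instructions = "Classify the Play Store review as positive, negative, or neutral."
reviews = ["Love this app!", "Crashes on startup."]

lines = []
for i, review in enumerate(reviews):
    request = {
        "custom_id": f"review-{i}",           # your key for matching results back
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o-mini",           # placeholder model name
            "messages": [
                {"role": "system", "content": instructions},
                {"role": "user", "content": review},
            ],
        },
    }
    lines.append(json.dumps(request))

jsonl = "\n".join(lines)  # write this to a .jsonl file and upload it for batching
print(len(lines), "requests prepared")
```

Note the system prompt still appears in every request body, so it is still billed per review — but the 50% batch discount applies to all of it.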