Can I save tokens if I preprocess my data?

Yes, exactly. That's my issue. I have about 3,000 phrases in my dataset, which I could size down a bit for testing purposes, but it's still problematic for my main use case. So I am thinking about how I can mitigate the implications of my underlying design, which is very token-hungry/inefficient. Since I am currently bound to that setup, mitigation is the only thing I can do for now.

I have users who need to dig through my dataset to define labels, and every time they find part of a (new) label, they have the opportunity to label a few instances through a keyword search. However, since these labels are not well defined yet, they can't find all instances through obvious keywords, which is where AI comes into play.
Furthermore, I am aiming for a collaborative setup, where the model takes on a role similar to a real assistant who digs through the data and presents what it considers appropriate. Like a very primitive version of the dynamic shown here: https://www.youtube.com/watch?v=BdHj210v9Yo
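To make the keyword step concrete, here is a minimal sketch of the kind of pre-filter I mean (function and argument names are just placeholders, not my actual tool): only the phrases that don't match an obvious keyword would need to go to the model at all.

```python
def split_by_keywords(phrases: list[str], keywords: list[str]) -> tuple[list[str], list[str]]:
    """Split phrases for one candidate label: 'hits' contain an obvious
    keyword and can be labeled directly by the user; only 'remaining'
    would need a model call."""
    hits, remaining = [], []
    for phrase in phrases:
        if any(kw.lower() in phrase.lower() for kw in keywords):
            hits.append(phrase)
        else:
            remaining.append(phrase)
    return hits, remaining
```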

I have a labeled gold-standard dataset for testing purposes, but in my real-world scenario the labels would evolve on the fly, which is why I have decided on a binary classification for each new label.
To examine whether performance is better or worse for some labels (e.g. because some might have more structural markers), I have to test them all.
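For context, a minimal sketch of what one of these binary checks could look like, assuming the OpenAI Python SDK; the model name and prompt wording are placeholders, not what I actually use:

```python
from openai import OpenAI

client = OpenAI()

def belongs_to_label(phrase: str, label: str, definition: str) -> bool:
    """Ask the model a yes/no question for one phrase and one label."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any cheap chat model would do
        messages=[
            {"role": "system",
             "content": f"Label: {label}\nDefinition: {definition}\n"
                        "Answer only with 'yes' or 'no'."},
            {"role": "user", "content": phrase},
        ],
        max_tokens=1,
        temperature=0,
    )
    return response.choices[0].message.content.strip().lower().startswith("y")
```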

I have ~40 labels × ~3,000 classifications of ~1,000 tokens each ≈ 120M tokens, which is >$18. :frowning:
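That figure is just back-of-the-envelope math; the per-million-token price below is an assumption on my side, not an official quote:

```python
labels = 40
classifications_per_label = 3_000
tokens_per_classification = 1_000
price_per_million_input_tokens = 0.15  # USD, assumed cheap-model input pricing

total_tokens = labels * classifications_per_label * tokens_per_classification
cost = total_tokens / 1_000_000 * price_per_million_input_tokens
print(total_tokens, cost)  # 120000000 tokens, ~18.0 USD
```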

Of course I can scale that down a bit, but I am still exploring other ways to improve efficiency. I probably can’t make that many API calls in a reasonable time anyway…
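One direction I could explore is packing several phrases into a single request, so the label definition is sent once per batch instead of once per phrase; a rough sketch (the prompt wording is purely illustrative):

```python
def build_batch_prompt(label: str, definition: str, phrases: list[str]) -> str:
    """Pack several phrases into one request so the label definition
    is sent once per batch rather than once per phrase."""
    numbered = "\n".join(f"{i + 1}. {p}" for i, p in enumerate(phrases))
    return (
        f"Label: {label}\nDefinition: {definition}\n\n"
        "For each numbered phrase, answer 'yes' or 'no' on its own line:\n"
        f"{numbered}"
    )
```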
