I am trying to find efficient ways to minimize the number of tokens in the context I send, so that I can fit more information into the prompt and get better results, with all the benefits that implies. I came up with an idea that may already have been addressed in some thread, but honestly, I couldn’t find it in the forum: abbreviate words to their simplest form. As I understand it, both GPT-3.5 and GPT-4 understand abbreviated text and can return unabbreviated text even if the input was abbreviated.

So I first set out to test how much a text can be abbreviated. Using a GPT-3 encoder tool in Node (there is also a very useful graphical tokenizer here: https://platform.openai.com/tokenizer), I saw token savings of roughly 20% to 35%. It obviously depends on the specific text (I should clarify those numbers are for the texts I provided), but that seemed significant, since the savings translate directly into usage, especially for context files.
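For anyone who wants to reproduce the measurement, here is a minimal sketch of how I compared token counts. It assumes the gpt-3-encoder npm package (the Node tool I mentioned); the example strings and the abbreviation style are just illustrations, and your savings will depend entirely on your text.

```ts
// Rough sketch of measuring token savings from abbreviation.
// Assumes the gpt-3-encoder npm package (types may require @types/gpt-3-encoder).
import { encode } from "gpt-3-encoder";

const original =
  "Performance management is the process of ensuring that activities and " +
  "outputs meet the organization's goals in an effective and efficient manner.";

// Hand-abbreviated version of the same sentence (illustrative only).
const abbreviated =
  "Perf mgmt = process ensuring activities & outputs meet org goals effectively & efficiently.";

const originalTokens = encode(original).length;
const abbreviatedTokens = encode(abbreviated).length;
const savings = 1 - abbreviatedTokens / originalTokens;

console.log(`original:    ${originalTokens} tokens`);
console.log(`abbreviated: ${abbreviatedTokens} tokens`);
console.log(`savings:     ${(savings * 100).toFixed(1)}%`);
```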
For the tests, I then used files derived from a main file by outlining and structuring it hierarchically. For example, starting from the main file Human resources in administration.pdf, I used GPT-4 to break it down into a hierarchy of its most relevant topics, then summarized each section with GPT-4, and then abbreviated each summary, again with GPT-4. Once all of this was done, I gave each resulting file a title referencing the topic it covers; a sketch of this pipeline follows the file list below. For example:
1- performance management, Robbins and Coulter, Idalberto Chiavenato, performance evaluation, evaluation method.txt
2- human resource management.txt
3- staff orientation, types of training, training methods.txt
4- human resource planning process, Robbins and Coulter, job specification, recruitment.txt
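Here is a hypothetical sketch of that summarize-then-abbreviate step for one outlined section. It assumes the openai npm package (v4 client) with an OPENAI_API_KEY in the environment; the prompts and function names are illustrative, not the exact ones I used.

```ts
// Hypothetical sketch of the pipeline: outlined section -> summary -> abbreviated context file.
// Assumes the openai npm package (v4 client); prompts are illustrative.
import OpenAI from "openai";

const openai = new OpenAI();

// Small helper: one system instruction, one user message, return the text reply.
async function ask(system: string, user: string): Promise<string> {
  const completion = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [
      { role: "system", content: system },
      { role: "user", content: user },
    ],
  });
  return completion.choices[0].message.content ?? "";
}

async function buildContextFile(sectionText: string): Promise<string> {
  // 1. Summarize the outlined section down to its most relevant points.
  const summary = await ask(
    "Summarize the following text, keeping only the most relevant points.",
    sectionText
  );
  // 2. Abbreviate the summary as much as possible while keeping it understandable.
  return ask(
    "Abbreviate the following text as much as possible without losing its meaning.",
    summary
  );
}
```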
Subsequently, I instructed the model as follows: “You are an expert professor at answering questions. You are provided with files from which you must derive your answers; your answers should be without abbreviations, references, or notes.” This way, I told the model what to do with those files.
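At answer time, the abbreviated files go in as context together with that instruction. A minimal sketch of what I mean is below, again assuming the openai npm package (v4 client); the file paths and the way the files are concatenated are just for illustration.

```ts
// Hypothetical sketch of answer time: abbreviated context files + the system instruction.
// Assumes the openai npm package (v4 client); file names are examples from the post.
import OpenAI from "openai";
import { readFileSync } from "fs";

const openai = new OpenAI();

const contextFiles = [
  "2- human resource management.txt",
  "3- staff orientation, types of training, training methods.txt",
];

// Concatenate the abbreviated files, keeping the descriptive titles as headers.
const context = contextFiles
  .map((name) => `### ${name}\n${readFileSync(name, "utf8")}`)
  .join("\n\n");

async function answer(question: string): Promise<string> {
  const completion = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [
      {
        role: "system",
        content:
          "You are an expert professor at answering questions. You are provided with files " +
          "from which you must derive your answers; answer without abbreviations, references, or notes.",
      },
      { role: "user", content: `${context}\n\nQuestion: ${question}` },
    ],
  });
  return completion.choices[0].message.content ?? "";
}
```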
This scheme has given me very good results, although I am still testing it to get the most accurate answers possible. But I wanted to share it, because this approach of leveraging the model’s ability to understand abbreviations and using that to reduce token usage seems interesting to me.
I’m new to this world and I really appreciate your comments, as there is surely a better or more abstract way to do this. Thank you very much in advance!