I have achieved very good results using GPT3.5-Turbo-0125. First, I make a schematic separation of the most important topics (I ask GPT4.0 to do this). By hierarchizing the most important topics, I then proceed to summarize them (summarized by GPT4.0). After summarizing them, I assign each to a document. What we observed is that it is very important to properly establish the name of the file. For example, if the topic of the file is traffic rules, I do not name the file “traffic rules”; instead, I specify elements from the file itself, such as “stop sign,” “traffic light,” etc. We observed that by incorporating key elements of the file into the name, the model will use only the context of that particular file. Additionally, if the file is summarized, leaving only the “important” content that we truly need, then the responses are much more precise and solid, achieving very good results. Something very important is to instruct the model on what to do with these files, something like this:
“You are an expert teacher in answering questions. You will be provided with files from which you should draw your answers. The answers must be without abbreviations, references, and notes. All the answers you give should be explanatory and include examples.”
Main file: human resources.pdf. This document is schematized, hierarchized, and fragmented into:
1-Performance management, Robbins and Coulter, Idalberto Chiavenato, performance evaluation, evaluation method.txt
2-Human resources management.txt
3-Employee orientation, types of training, training methods.txt
4-Human resources planning process, Robbins and Coulter, job specification, recruitment.txt
5-Human resources planning process, Robbins and Coulter, job specification.txt
6-Roles of human resources.txt
7-Taylor, Fayol, Mayo, Drucker, history of HR.txt
8-Location, contact information, and schedule.txt
9-Enhancing and projecting the competitive organization of the future, fundamental challenges facing executives.txt
I hope this is useful to you!