How can I make GPT-4 learn from a PDF how to use a coding system?

Hi everyone! I would like to know if there is any way I can “teach” the model how a certain coding system works. The coding system assigns “codes” to food items so that they follow a controlled terminology.

The coding system has peculiarities and is not straightforward. For example, there are “facets” such as the Process (which describes whether the food item has been processed) or the Packaging of the food. It is all described in a PDF of about 60 pages.

Then, I also have the list of available terms for the base terms and each of the facets.

Is there any strategy I could follow so the chat could produce coherent valid codes for this coding system? Thank you all in advance!

What you probably need to do is forgo the “talk to my pdf” idea and instead distill the documents down to a hierarchy that the AI can browse, drilling down into particular facets. Then provide that access via indexed function tools.

This mirrors how I would work through it myself: look at the table of contents, put a checkmark next to the likely relevant sections and page numbers, read those pages to come up with a solution to the task, and respond with the most likely sequence of words.
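
To make the "browse and drill down" idea concrete, here is a minimal sketch in Python of two lookup functions that could be exposed to the model as function tools. Everything in it is an assumption for illustration: the `Term` class, the tool names `list_children` and `search_terms`, and the sample terms and codes are made up and do not come from the real coding system. Wiring them into the chat API as tools is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    code: str   # hypothetical term code, not a real one
    name: str
    children: list["Term"] = field(default_factory=list)


def build_index(root: Term) -> dict[str, Term]:
    """Flatten the tree into a code -> Term lookup."""
    index: dict[str, Term] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        index[node.code] = node
        stack.extend(node.children)
    return index


def list_children(index: dict[str, Term], code: str) -> list[dict[str, str]]:
    """Tool 1: return the direct children of a term so the model can drill down."""
    node = index.get(code)
    if node is None:
        return []
    return [{"code": c.code, "name": c.name} for c in node.children]


def search_terms(index: dict[str, Term], query: str) -> list[dict[str, str]]:
    """Tool 2: case-insensitive substring search over term names."""
    q = query.lower()
    hits = [
        {"code": t.code, "name": t.name}
        for t in index.values()
        if q in t.name.lower()
    ]
    return sorted(hits, key=lambda hit: hit["code"])


# Made-up sample hierarchy for a single facet.
PROCESS = Term("P", "Process", [
    Term("P1", "Preserved", [Term("P1a", "Smoked"), Term("P1b", "Frozen")]),
    Term("P2", "Cooked"),
])
INDEX = build_index(PROCESS)
```

The model would start at a facet root, call `list_children` to see the next level, and only ever pick codes that the tools returned, which keeps it from inventing terms.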


Thank you very much for your answer!

Maybe I can be a bit more specific about my needs. The terms that can be used to codify the food items are not present in the PDF itself. They are available in a table (xlsx, csv, JSON or whatever format is needed) and follow a hierarchical structure. There are multiple lists, one for the base term and one for each facet.

The full code is composed of a “base term” (the food item itself), a “#” sign, and then 0 or more facets (there are 30-ish different facets). Some of the facets are repeatable (e.g. for the “Process” facet, a food can be both smoked and frozen), while others are not (e.g. a food item has only one “Source”). There are also concepts such as “explicit” vs “implicit” facets and reportable vs non-reportable ones. These kinds of things are what the PDF document describes. It would be amazing if I could get the model to internalise these concepts.
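
To illustrate the kind of rule I mean, here is a rough Python sketch of a check for the repeatable vs non-repeatable constraint and for term membership, which could be run on whatever the model proposes. The term lists, the rule table and the `validate` function are all placeholders I made up; the real lists would come from the tables, and the explicit/implicit and reportable rules are not covered.

```python
from collections import Counter

# Placeholder rule table: facet name -> may it appear more than once?
FACET_REPEATABLE = {"Process": True, "Source": False}

# Placeholder term lists; the real ones would be loaded from the csv/JSON tables.
BASE_TERMS = {"B001"}
FACET_TERMS = {
    "Process": {"SMOKED", "FROZEN"},
    "Source": {"COW", "GOAT"},
}


def validate(base: str, facets: list[tuple[str, str]]) -> list[str]:
    """Return a list of problems; an empty list means no rule was violated."""
    problems = []
    if base not in BASE_TERMS:
        problems.append(f"unknown base term: {base}")

    # Every facet must exist and its term must be in that facet's list.
    for name, term in facets:
        if name not in FACET_TERMS:
            problems.append(f"unknown facet: {name}")
        elif term not in FACET_TERMS[name]:
            problems.append(f"unknown term for {name}: {term}")

    # Non-repeatable facets may appear at most once.
    counts = Counter(name for name, _ in facets)
    for name, n in counts.items():
        if n > 1 and name in FACET_REPEATABLE and not FACET_REPEATABLE[name]:
            problems.append(f"facet not repeatable: {name}")
    return problems
```

For example, `validate("B001", [("Process", "SMOKED"), ("Process", "FROZEN")])` returns an empty list, while giving two `Source` facets reports `facet not repeatable: Source`. The problems could be fed back to the model so it can correct its own code.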

Maybe I am too ambitious and it is not possible to go in this direction, but I would love to give it a go!