LlamaIndex is a good resource:
https://gpt-index.readthedocs.io/en/latest/guides/primer/usage_pattern.html
Ultimately, you’ll need to use a vector DB like Pinecone to store the embeddings. It’s trivially simple to store and query…
# store docs in an index
# (`documents` is assumed to be a list of llama_index Document objects loaded earlier)
from llama_index import GPTSimpleVectorIndex
index = GPTSimpleVectorIndex([])
for doc in documents:
    index.insert(doc)
# query the index
response = index.query("What did the author do growing up?")
print(response)
That’s the simplest case. Store your taxonomy mappings as a simple store of documents. Then, for each batch of a few hundred lines that need processing, the first query goes to the index to pull back only the relevant taxonomies, using a lower similarity threshold (you can tune the threshold for your queries). Finally, you feed those matches into the GPT-4 API. No need to send the full taxonomy to the model each time, only the matching taxonomies.
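Here's a rough sketch of that two-step flow in plain Python, just to show the shape of it. Everything in it is made up for illustration: embed() is a toy bag-of-words stand-in for real embeddings, and select_taxonomies(), build_prompt(), the 0.2 threshold, and the sample data are hypothetical names and values, not LlamaIndex or OpenAI APIs. In a real setup the similarity search would be a query against your index and the prompt would go to the model.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a real embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_taxonomies(lines, taxonomy, threshold=0.2):
    """Keep taxonomy entries that are similar enough to at least one input line."""
    line_vecs = [embed(line) for line in lines]
    selected = []
    for entry in taxonomy:
        entry_vec = embed(entry)
        if any(cosine(vec, entry_vec) >= threshold for vec in line_vecs):
            selected.append(entry)
    return selected


def build_prompt(lines, matched):
    """Assemble a prompt that carries only the matched taxonomy entries."""
    taxonomy_block = "\n".join(f"- {entry}" for entry in matched)
    lines_block = "\n".join(f"- {line}" for line in lines)
    return (
        "Classify each line using only these taxonomy entries:\n"
        f"{taxonomy_block}\n\nLines:\n{lines_block}"
    )


if __name__ == "__main__":
    # Hypothetical sample data, not from any real taxonomy.
    taxonomy = [
        "Electronics > Phones > Smartphones",
        "Home > Kitchen > Cookware",
        "Apparel > Shoes > Running shoes",
    ]
    lines = [
        "refurbished smartphones with 128GB storage",
        "trail running shoes size 10",
    ]
    matched = select_taxonomies(lines, taxonomy, threshold=0.2)
    print(build_prompt(lines, matched))
```

A lower threshold pulls in more candidate taxonomies (safer, but a bigger prompt), and a higher one trims the prompt at the risk of dropping the right entry, so it's worth testing a few values on your own data.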