Text extraction based on complex rules

I’m brainstorming an idea for a research project and wanted to hear other opinions on it.

I have documents from which I want to extract some elements according to a certain set of rules. These rules are very complex, from a legal code. Basically, I want the model to extract all violations of the law from a collection of potentially quite long texts, based on the legal code.

My current idea is to somehow feed in the legal code as a file to the AI assistant, then put the task in the description and then prompt it with the document containing text with violations of the law. Do you think this is the best/most cost effective approach? What other ideas do people have? Thanks!!

1 Like

Your idea seems like a straightforward approach to the implementation. You can create the knowledge base using the complex legal code with the added GPT builder functionality recently released. You can then define the analysis task and feed the data as you’ve laid out and it should function well.

Your approach to the development is pretty streamlined. However, in terms of cost effectiveness, you would want to have a clear understanding of the size of the potential input text, what ways you could reduce that or potentially call functions from less costly models and understand the new token pricing. A good bet would be to feed the development requirements to GPT with the recent update documentation and have it layout potential cost savings options during the development.