Help with prompt engineering - Contract analysis

Hi, I wonder if anyone can give me some help crafting a prompt. I have a system message that supplies a contract text and a JSON object detailing a list of standard contract clauses. I then supply my user message:

“”"
Analyse the supplied contract and determine whether or not each contract clause is included in the document.

Each clause may be missing from the contract, present exactly as defined, or present but modified the some degree.

Output the results as a table with 3 columns.

The first column should list the clause id.

The 2nd column should list the clause category.

The 3rd column should provide a string value indicating the extent to which the clause is determined to be in the supplied contract.

The following should be a guide for evaluating this value:

  • An exact text match for a clause should be “Exact”"

  • A close match for a clause, with only minor differences that do not alter the meaning or scope of the clause, should be “Included”

  • A close match for a clause, but with changes that have a material affect on the scope of the clause should be “Modified”

  • If the clause is deemed to be missing from the contract, return “Missing”

Some clauses are very similar, but they should still each be evaluated independently.

This means that it is possible for the same section of text from the contract to have matched one or more clauses to some degree.

As none of the clauses is identical, only one of any similar clauses can ever be an “Exact” match.

Any remaining similar clauses should be evaluated as “Included” or “Modified” as appropriate.

It may help to compare clauses first to be aware of any similarities between them.

Provide as a footnote, explanatory notes for clauses that are “Included” or “Modified” outlining how they differ from the reference clause.
“”"
It’s almost there:

Clause ID Category Extent
JC2010/014 Sanctions Limitation Exact
LMA3100 Sanctions Limitation Missing
LMA3100A Sanctions Limitation Missing
LMA3200 Sanctions Suspension Missing
JC2020-011 Communicable Disease Exact
JC2020-012 Communicable Disease Missing
LMA5403 Marine Cyber Exact
CL370 Bio-Chemical Weapons Modified

The issue I have is that top 3 clauses are almost identical, with extremely trivial differences (e.g. a missing apostrophe). So whilst the 1st value is correct, the next 2 should return “Included”.

If I point this out and ask the model to re-evaluate, it invariable gets it right the 2nd time. I feel I’m not quite crafting the prompt correctly to get the correct result the 1st time. I tried all sorts of tweaks but don’t seem to be getting any closer. Any help would be appreciated.

(shouldn’t be relevant but I’m interacting with the model using a Python script).

Thanks in advance for any assistance.

1 Like

Your prompt does already seem robust. Try to add section with a list of “trivial differences (e.g. a missing apostrophe)” to let AI lay on that background information during composition of response.

Many LLMs including OpenAI LLMs are trained to treat such information differently and may either flag it as not suitable and not respond or not give you a correct response so take that into consideration. I can’t provide more details as I don’t try prompts with such as I don’t need to and don’t want to get a permanent ban for trying.


Also see

Thanks. Filters shouldn’t be an issue - I’m working with a private LLM instance with no filters applied and can see from the response that filters weren’t triggered.

I’ll take a look at that post.

Interestingly I’ve noticed that if I change the order of the reference clauses I get a different response:

Clause ID Category Extent
JC2010/014 Sanctions Limitation Exact
LMA3100 Sanctions Limitation Missing
LMA3100A Sanctions Limitation Missing
LMA3200 Sanctions Suspension Missing
JC2020-011 Communicable Disease Exact
JC2020-012 Communicable Disease Missing
LMA5403 Marine Cyber Exact
CL370 Bio-Chemical Weapons Modified
Clause ID Category Extent
LMA3100 Sanctions Limitation Exact
JC2010/014 Sanctions Limitation Exact
LMA3100A Sanctions Limitation Missing
LMA3200 Sanctions Suspension Missing
JC2020-011 Communicable Disease Exact
JC2020-012 Communicable Disease Missing
LMA5403 Marine Cyber Exact
CL370 Bio-Chemical Weapons Modified

Notice the top 2 results. Not sure what this means but it gives me something to look into.

1 Like

I’m going to continue to work on this tomorrow morning for a more explicit illustration but if this helps, I believe your user message is strongly prescriptive. Because you have so many constraints and specific minutia, I am already impressed you get the results in only the second shot. If you can compromise a few of the lower priority items, remove them because it gives the model less to do less to remember and less to get incorrect. Also, it might be helpful to use a more personal tone- explain your problem and goals more directly and explicitly. “I’m having issues with this and I want it to look like this. Our goal is to …” it might be worth a shot. Right now what I’m seeing as the main issue is the narrow window you give the model to succeed. I hope this helps and I can find this thread again tomorrow lol happy prompting