Fine-tuned model for German court cases

Hi! I am thinking about creating a fine-tuned GPT model for German court cases.

The idea: The user should be able to input a fictional case and the model should return the expected ruling of German courts.

My approach: I have access to 600 000 German court rulings, from which I could build a dataset to fine-tune a model with. I would put the case-text as the prompt and then the ruling as the completion.
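A minimal sketch of that dataset layout, assuming the prompt/completion JSONL format used by legacy GPT-3 fine-tuning. The separator, the leading space, the ` END` stop marker, and the sample case are illustrative assumptions, not part of the original post.

```python
import json

# Assumed conventions (hypothetical choices for this sketch):
PROMPT_SEPARATOR = "\n\n###\n\n"  # marks where the case text ends
STOP_MARKER = " END"              # marks where the ruling ends

def to_jsonl(cases):
    """Turn (case_text, ruling) pairs into prompt/completion JSONL lines."""
    lines = []
    for case_text, ruling in cases:
        record = {
            "prompt": case_text.strip() + PROMPT_SEPARATOR,
            "completion": " " + ruling.strip() + STOP_MARKER,
        }
        # ensure_ascii=False keeps German umlauts readable in the file
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines)

# Made-up example pair, only to show the shape of one record
sample = [("Der Mieter zahlt seit drei Monaten keine Miete.",
           "Die Klage ist begründet.")]
jsonl = to_jsonl(sample)
print(jsonl)
```

Each line of the resulting file is one training example, so 600 000 rulings would give 600 000 lines before any filtering.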

My question: Do you think this would work out? Is the effort worth it?

Also: I noticed that if I currently ask ChatGPT how a fictional case might be ruled in Germany, it tells me that it has no idea, that it does not provide legal advice, and that I should consult a lawyer. So I guess legal questions currently get filtered. Does this pose a problem for the fine-tuned model I am trying to build?

Would love to get some feedback!

Greetings from a German law student, Jan


I have been looking at trying to do the same in my country.

I thought German court records weren’t public. Are the lawyers’ and judges’ names visible in the docs as well?

I am interested in this topic as well. Any news?

You cannot fine-tune for this use case.

However, I do have an answer.

Hi, interesting use case. Maybe worth a chat.

I’d be in. I am trying out options right now. It would be nice to join forces.

Hm, so it’s not possible? That sucks. I’ll have a look at the video. Thanks.

I would say it is possible, but not the way it was started here. Something like:

Summarize each case into its primary idea, circumstances, and applicable laws, then get the court/judge details, put everything into one condensed prompt, match it against the ruling, and fine-tune on that to see whether the model finds patterns and can produce valuable predictions.

Split all the data into case categories. Split each case category into 4 chunks, where you train on one of those chunks and validate against the other 3. Over 500k cases would be enough data to play with.
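The split described above could be sketched roughly like this. The record layout (`id`, `category`) and the category names are hypothetical; only the "4 chunks per category, train on one, validate on the other 3" scheme comes from the post.

```python
import random
from collections import defaultdict

def four_way_splits(cases, seed=0):
    """Group cases by category, shuffle, and cut each category into 4 chunks."""
    by_category = defaultdict(list)
    for case in cases:
        by_category[case["category"]].append(case)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    splits = {}
    for category, items in by_category.items():
        rng.shuffle(items)
        # items[i::4] deals the shuffled cases round-robin into 4 chunks
        splits[category] = [items[i::4] for i in range(4)]
    return splits

def train_validate_rounds(chunks):
    """Yield (train, validation): train on one chunk, validate on the other three."""
    for i, train in enumerate(chunks):
        validation = [c for j, chunk in enumerate(chunks) if j != i for c in chunk]
        yield train, validation

# Made-up records, only to exercise the functions
cases = [{"id": n, "category": "tenancy" if n % 2 else "employment"}
         for n in range(16)]
splits = four_way_splits(cases)
rounds = list(train_validate_rounds(splits["tenancy"]))
print(len(rounds), [len(t) for t, _ in rounds], [len(v) for _, v in rounds])
```

Rotating which chunk is used for training gives four train/validate rounds per category, which makes it easier to tell a real pattern from a lucky split.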

1. Hard to say until you test. I would pick the most common category of cases and run some tests to see if something viable comes out of a highly specific application.
2. Definitely worth it, because you’ll learn so damn much that it might even change your vocation. And if it works out technically, the application will definitely be very lucrative. So try.


I ran a similar experiment, but with Swedish. The best option so far was to translate the documents, split them into smaller pieces of text (preferably with the same length and structure), embed these texts, and then find similarities between the query and the rest of the documents using embedding models. What I’m talking about is knowledge-base question answering.
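A rough sketch of that chunk, embed, and compare pipeline. The `toy_embed` function is a stand-in (plain word counts) for whatever real embedding model you would use, and the sample texts are invented; only the overall flow follows the description above.

```python
import math
from collections import Counter

def chunk_words(text, size=50):
    """Split a document into pieces of roughly equal length (in words)."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def toy_embed(text):
    # Assumption: a bag-of-words vector stands in for a real embedding model
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def top_matches(query, chunks, k=3):
    """Return the k chunks most similar to the query, best first."""
    q = toy_embed(query)
    scored = [(cosine(q, toy_embed(c)), c) for c in chunks]
    return sorted(scored, key=lambda s: s[0], reverse=True)[:k]

chunks = [
    "the landlord may terminate the lease for unpaid rent",
    "the employee is entitled to paid annual leave",
]
best = top_matches("unpaid rent lease", chunks, k=1)
print(best[0][1])
```

The retrieved chunks would then be placed in the prompt next to the user's question, so the model answers from the documents rather than from memory.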

This is a very hard use case that many people have been trying to solve for years, at least for U.S. cases: established legal tech giants like Thomson Reuters, plus start-ups and scale-ups galore.

GPT-3’s best completion endpoint accepts a finite number of tokens (~4000). Those tokens must cover the facts and issues of your hypothetical case, plus sufficient information about relevant precedents, plus your instructions, plus GPT-3’s completion/prediction. That’s a very tough ask, given the complexity of case law and the low tolerance for errors in legal applications. GPT-3 does not provide a method of fine-tuning by which you can teach the model the facts, issues and reasoning of a body of case law.

Not to discourage you from trying, but predicting the outcome of a legal dispute involves many variables: it requires reading multiple documents in tandem, weighing the relative importance of different kinds of information within those documents (facts, issues, reasoning, obiter dicta), and, hardest of all, determining not just the outcome but the judge’s reasons that most persuasively influenced the outcome.

One Canadian company, Blue J Legal, built a product covering JUST ONE legal issue, specifically, predicting whether a judge would rule that someone is an employee or an independent contractor for tax purposes. Blue J had to manually classify thousands of cases just to develop a rules-based system, where the user fills out a questionnaire asking things like whether the person sets their own hours, has to wear a uniform, etc. Based on the answers in the questionnaire, the product would give a prediction. That’s a lot of work for a single legal issue in a single area of law, let alone the whole body of case law of a country.

To tackle that, I think you have to break down your use case into several steps. Perhaps break up the cases into paragraphs, then instruct GPT-3 to assign each sentence in the paragraph to a category: F for fact, I for issue, R for reasoning, O for other or unknown. Then you could match your hypothetical case to the closest group of cases, separately for facts and issues. Then find the particular cases where both the facts and issues are a good match for your hypothetical case. Then use the sentences assigned R from those particular cases and instruct GPT-3 to read the facts of your hypothetical case and the reasoning from the cases you’ve pinpointed, and make a prediction.

Please let me know if the above is helpful. I was a lawyer for a long time before starting my company and I’m working on a similar use case but for regulations rather than case law.

Best, Leslie
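The bookkeeping around those steps might look something like the sketch below. It assumes the model returns one sentence per line prefixed with its label (for example `F: ...`); that output format, the function names, and the sample sentences are all assumptions for illustration, and the actual model call is left out.

```python
from collections import defaultdict

LABELS = {"F": "facts", "I": "issues", "R": "reasoning", "O": "other"}

def parse_labeled_sentences(model_output):
    """Parse lines like 'F: The tenant stopped paying rent.' into buckets."""
    buckets = defaultdict(list)
    for line in model_output.splitlines():
        label, sep, sentence = line.partition(":")
        label = label.strip().upper()
        if sep and label in LABELS and sentence.strip():
            buckets[LABELS[label]].append(sentence.strip())
        elif line.strip():
            # Anything without a recognizable label is kept as "other"
            buckets["other"].append(line.strip())
    return dict(buckets)

def build_prediction_prompt(hypothetical_facts, matched_reasoning):
    """Combine the user's facts with R sentences from the matched cases."""
    reasoning = "\n".join(f"- {r}" for r in matched_reasoning)
    return (
        "Facts of the hypothetical case:\n" + hypothetical_facts + "\n\n"
        "Reasoning from the closest matching cases:\n" + reasoning + "\n\n"
        "Predict the likely outcome and name the reasoning that supports it."
    )

# Invented model output, only to show the parsing
output = (
    "F: The worker set her own hours.\n"
    "I: Is she an employee or a contractor?\n"
    "R: Control over working hours points to contractor status.\n"
    "not labeled"
)
parsed = parse_labeled_sentences(output)
prompt = build_prediction_prompt("The worker chose her own schedule.",
                                 parsed["reasoning"])
print(prompt)
```

Keeping facts, issues, and reasoning in separate buckets is what allows the matching step to be run separately for facts and for issues, and it keeps the final prompt small enough for a limited token budget.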


Thanks to JA for raising the legal question and to lmcallum for bringing us back down to earth.
Would it be possible (maybe as a starter app, before tackling cases in the future) to search the laws inside public repositories (stable data), to help citizens understand their law (as the texts are somewhat complicated in structure, content, and form)?
Questions would be asked in everyday language.
Example: “Which edible oil is authorized?”
The system would answer with the legal text reference and its content, for example [ISO xyzt].
Some countries have open access to their law (for example France with Legifrance). I expect the US, Germany, and most EU countries provide the same.
Final aim: law that is accessible to citizens, and why not discussed in a common forum, with each discussion indexed by a public legal reference (the ID of the law or rule).

Hi Jan

I think it is worth a try.
About 20 years ago, there were already attempts to develop programs that mapped colloquial terms to the relevant legal terms and the corresponding legal provisions. They were often surprisingly accurate.

Today, this is certainly incomparably better, but there are likely to be new, different problems.
(retired legal advisor of a large public authority)

Love that idea, sir. I’m experimenting with it as well.
When you ask ChatGPT how to train itself on German law, it gives you a more complex answer. Have you tried that?

Thank you for the great answer. You gave me a way better understanding.

According to what’s been said in Finetuning for Domain Knowledge and Questions, fine-tuning a model does not teach it knowledge; instead, it learns text patterns, which enables it to answer in a more human-like style.
That said, I assume what your dataset might achieve is teaching it to write a formatted court case. I guess you might see it producing perfect and complete reports, but with errors in the rules.


I’m doing something similar. Different continent but same idea.
My team has realized that it needs to be done in baby steps.

We have adopted a combination of Elasticsearch and semantic search via embeddings as an initial search engine. So when someone searches “My landlord raised rent by 25%, is this legal?”, it pulls all the articles that are related. This is our current product, which we are tuning with machine learning on the search query. We are also considering using semantic similarity to refine the initial search query, kind of like how ChatGPT will prime a question, so the query becomes “Landlord raising rent laws”.
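One common way to merge a keyword ranking and an embedding ranking is reciprocal rank fusion. The post does not say how the two result lists are combined, so this is just one option, sketched with made-up article ids and the often-used constant k=60.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists of document ids; a higher fused score ranks first."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it returns
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical results for "My landlord raised rent by 25%, is this legal?"
keyword_hits = ["art_24", "art_11", "art_7"]    # e.g. from the keyword index
semantic_hits = ["art_11", "art_3", "art_24"]   # e.g. from embedding search
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)
```

Articles that appear near the top of both lists end up first, without having to put keyword scores and cosine similarities on a common scale.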

We believe the next step will be to have a language model collect the contradictions, exceptions, and similar cases to argue against itself. Each different letter is a slightly different model (maybe?):

A: “Article 24 states that a landlord cannot raise rent more than 10% per year”
B: “Article 11 states that they CAN if they have done x amount of work”
A: “Nothing to add”
C: “There are 255 similar cases that ultimately depended on the amount of work”
C: “Therefore, it ultimately depends on whether renovations were done on the property. Here are some statistics:”
S: Provides any noticeable patterns in cases
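A toy sketch of how such a round-robin exchange could be orchestrated. The agents here are hard-coded stubs standing in for separate models, and returning `None` plays the role of "Nothing to add"; all names are hypothetical.

```python
def run_debate(agents, question, max_rounds=3):
    """Let each agent speak in turn until a full round adds nothing new."""
    transcript = []
    for _ in range(max_rounds):
        added = False
        for name, agent in agents:
            statement = agent(question, transcript)
            if statement:  # None means "nothing to add"
                transcript.append((name, statement))
                added = True
        if not added:
            break
    return transcript

def speak_once(name, text):
    """Build a stub agent that makes its statement once, then stays silent."""
    def agent(question, transcript):
        already_spoke = any(speaker == name for speaker, _ in transcript)
        return None if already_spoke else text
    return agent

agents = [
    ("A", speak_once("A", "Article 24 limits rent increases to 10% per year.")),
    ("B", speak_once("B", "Article 11 allows more after qualifying work.")),
    ("C", speak_once("C", "Similar cases depended on the amount of work.")),
]
transcript = run_debate(agents, "My landlord raised rent by 25%, is this legal?")
for speaker, statement in transcript:
    print(f"{speaker}: {statement}")
```

In a real system each stub would be replaced by a call to a model that sees the transcript so far, and the loop ends once no model has an exception or counterexample left to raise.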

Hope this helps, and I would love to hear anyone else’s input.

I’m interested as well. Where did you get the court rulings from?

Hi! I’m wondering where we are at with the idea of finding case law that is close to your particular case. All my ideas start from a need; in my case, I’m getting a divorce and the issues are quite complex… We happen to have quite a good API for searching case law and acts and all that, but I’m stuck on how to approach embedding the AI. Each case has a number of primary subjects that it deals with, and in some cases it creates binding law on them.

I was thinking: could the AI listen to our story, perhaps even guide us by asking questions, then extract the main points of our case, and then do a similarity search against binding law based on the total number of similar points discussed in each case?
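The last step, comparing the extracted points with the subjects of each case, could be sketched with a simple set overlap. Jaccard similarity is used here as one possible measure, and the case records, field names, and subject tags are invented for illustration.

```python
def jaccard(a, b):
    """Share of points two sets have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def rank_binding_cases(user_points, cases):
    """Score only cases that create binding law, best match first."""
    ranked = [(jaccard(user_points, c["subjects"]), c["id"])
              for c in cases if c["binding"]]
    return sorted(ranked, reverse=True)

# Hypothetical points extracted from the user's story
user_points = {"custody", "relocation", "spousal support"}
cases = [
    {"id": "case_1", "subjects": {"custody", "relocation"}, "binding": True},
    {"id": "case_2", "subjects": {"spousal support", "pension"}, "binding": True},
    {"id": "case_3", "subjects": {"custody", "relocation", "spousal support"},
     "binding": False},
]
ranked = rank_binding_cases(user_points, cases)
print(ranked)
```

Exact tag overlap is brittle when the same point is worded differently, so in practice the subject comparison would probably be combined with an embedding-based similarity.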

Sorry, I wrote this fast… Is this where we are at, or am I way behind?


Do you have any news on this subject? I would really be interested in that. Do you think that one could get better results in semantic search by using, for example, a tokenizer or embedding model trained on German legal texts?