Create a chatbot to assist lawyers based on uploaded law files (Saudi law)

Hello,
I have a use case where I have all of Saudi law as PDF and TXT files.
I need to upload the complete set of law files, and over time I will also need to upload cases.

What I want to ask is: what is the best approach to achieve this?

Use Case

CASE 1

The lawyer will ask about a specific law, and the chatbot must answer it very accurately, since we are talking about law.

CASE 2

The lawyer will give the chatbot a case (txt, pdf), and the chatbot must answer with the laws the lawyer is looking for, and they must be correct.

CASE 3

The lawyer will send the case file, and the chatbot should give him a judgment based on the law and other similar cases, and also show him the sources it used.

All of these cases are very sensitive, since the data is legal data.

So what I need is help and guidance to achieve this.

I have already tried a vector store and embeddings, but the answers are not accurate.

I believe I am missing something and am looking for your assistance.

Thank you.


Hi there, this can be tricky since we are talking about laws in general. I am not an ML or NLP engineer; I am a software engineer who has already worked on something similar, but for French real estate law, which is something less critical, I guess. We did something simple: we built an assistant API, gave it the data as PDFs, and connected to it from a Node.js server.


I would suggest using embedding-based routing as your first step to perform a high-level classification, then implementing RAG within the specific branch of law identified. Alternatively, you could adopt an agentic approach (if users are not time-sensitive and the chatbot can take 30-90 seconds to respond), where the LLM first determines:

  1. Which legal branch the question pertains to,
  2. Which action category applies (e.g., answering the user’s query directly or citing relevant laws for further research).

Once the legal branch and question type are identified, your LLM can proceed to generate the answer, equipped with relevant answer examples or templates (that you'd add into your prompt) and a narrower knowledge base focused on the relevant area of law.
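To make the routing step concrete, here is a rough sketch of how embedding-based routing could look, assuming an OpenAI-style embeddings endpoint; the branch taxonomy, model name, and helper functions are illustrative placeholders, not something from this thread:

```python
# Rough sketch: classify a query into a branch of law by cosine similarity
# to short branch descriptions, then restrict retrieval (RAG) to that branch.
# The branch names and embedding model below are placeholder choices.
import numpy as np
from openai import OpenAI

client = OpenAI()

BRANCHES = {
    "labor_law": "Employment contracts, wages, termination, workplace disputes.",
    "commercial_law": "Companies, commercial contracts, bankruptcy, trade disputes.",
    "criminal_law": "Crimes, penalties, criminal procedure.",
}

def embed(texts):
    resp = client.embeddings.create(model="text-embedding-3-large", input=texts)
    return np.array([d.embedding for d in resp.data])

branch_names = list(BRANCHES)
branch_vecs = embed(list(BRANCHES.values()))

def route(query: str) -> str:
    q = embed([query])[0]
    # cosine similarity of the query against each branch description
    sims = branch_vecs @ q / (np.linalg.norm(branch_vecs, axis=1) * np.linalg.norm(q))
    return branch_names[int(np.argmax(sims))]

# branch = route("What notice period applies when terminating an employee?")
# -> then run RAG only over documents tagged with that branch
```

The same idea can drive the action-category decision too, if you give each action its own short description and route against those as well.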


Ideally, for something like this, I think you'd want a hybrid RAG + knowledge graph (KG) setup, in conjunction with a model trained on the area of law it will be used for.

When we want to find similarities in text using vector embeddings, especially for longer documents, how we create those embeddings matters a lot. Let’s compare two common approaches:

Approach 1: Single Embedding for the Entire Text

How it works: You take the entire piece of text (e.g., a whole document, a long paragraph) and generate one single vector embedding to represent it. If your model outputs 1024-dimensional vectors, this will be a single array of 1024 numbers.

Example Text: [This is a sample sentence to show how a sliding window works in its most basic form.]

Resulting Embedding: [vector_of_1024_dimensions_for_the_WHOLE_sentence]

Pros:

Simple to implement.

Represents the overall “gist” or dominant themes of the entire text.

Cons & Considerations:

Loss of Granularity: This approach “averages out” the meaning of the entire text. If a query (what you’re searching for) is only relevant to a small part of a long document, the single embedding for the whole document might not show high similarity. The specific information gets ‘diluted’ by the rest of the content.

Consequently, even if a part of the text is a strong match, the overall embedding might not be close enough to the query vector to exceed a similarity threshold, potentially causing relevant information to be missed.
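As a quick illustration of Approach 1 and the dilution problem, here is a minimal sketch; embed() is the same hypothetical OpenAI-backed helper as in the routing sketch above, repeated so this snippet stands alone, and the documents and query are placeholders:

```python
# Approach 1 sketch: one embedding per whole document, compared directly to
# the query.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts):
    resp = client.embeddings.create(model="text-embedding-3-large", input=texts)
    return np.array([d.embedding for d in resp.data])

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

documents = ["<full text of law document 1>", "<full text of law document 2>"]
doc_vecs = embed(documents)                       # one vector per document
query_vec = embed(["What is the notice period for termination?"])[0]

scores = [cosine(v, query_vec) for v in doc_vecs]
# If the relevant clause is only a small part of a long document, its score
# can stay below the similarity threshold: the "dilution" described above.
```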

Approach 2: Multiple Embeddings using a Sliding Window with Overlap

How it works: Instead of one embedding for everything, you break the text into smaller “chunks.” A “sliding window” moves across the text, creating these chunks. Crucially, these chunks overlap by a certain percentage. Each of these (potentially overlapping) chunks then gets its own vector embedding.

Example Text: [This is a sample sentence to show how a sliding window works in its most basic form.]

Window Size (example): 5 words

Step/Stride (example): 2 words (this creates overlap)

Resulting Chunks & their Embeddings:

[This is a sample sentence to] → [vector_1_for_chunk_1 (1024 dimensions)]

[sample sentence to show how] → [vector_2_for_chunk_2 (1024 dimensions)]

[to show how a sliding window] → [vector_3_for_chunk_3 (1024 dimensions)]

[a sliding window works in its] → [vector_4_for_chunk_4 (1024 dimensions)]

[works in its most basic form.] → [vector_5_for_chunk_5 (1024 dimensions)]

You now have an array of embeddings, where each embedding represents a smaller, more focused piece of the original text.
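A minimal word-level chunker along these lines might look like the sketch below; real systems usually chunk by tokens or sentences, and the exact chunk boundaries will differ slightly from the hand-written example above:

```python
# Sliding-window chunking sketch: word-level window with overlap.
# Window and stride follow the example values above.
def sliding_window_chunks(text: str, window: int = 5, stride: int = 2):
    words = text.split()
    chunks = []
    for start in range(0, len(words), stride):
        chunks.append(" ".join(words[start:start + window]))
        if start + window >= len(words):  # last window reaches the end of the text
            break
    return chunks

text = "This is a sample sentence to show how a sliding window works in its most basic form."
chunks = sliding_window_chunks(text)
# chunk_vecs = embed(chunks)  # one vector per chunk, stored together with a
#                             # pointer back to the source document/article
```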

Pros:

Preserves Local Context: Each chunk embedding strongly represents the meaning of that specific part of the text. This makes it much more likely to find matches for queries targeting specific details or phrases.

Robustness to Phrasing: Overlap ensures that key concepts aren’t accidentally split between chunks and lost. If an important idea spans the boundary of one chunk, the overlap ensures it’s captured whole in a neighbouring chunk.

This approach generally results in a greater number of potential similarity matches because the query is compared against many specific segments rather than one general representation.

Cons/Considerations:

More Embeddings: You generate and store more data.

Post-processing for Relevance: Because a query might match multiple chunks from the same original document, a strategy is required to interpret these results. This often involves calculating the significance of these matches, such as counting the number of chunks from a document that meet or exceed a given similarity threshold. Other methods include taking the highest similarity score among the matched chunks or averaging their scores to determine the overall relevance of the original document.
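For that post-processing step, a simple aggregation of chunk-level scores into a document-level ranking could look like the sketch below; the data shape, field names, and threshold are assumptions for illustration:

```python
# Aggregate chunk-level similarity scores into a document-level relevance
# ranking: count strong hits, keep the best chunk score, and the mean.
from statistics import mean

def rank_documents(chunk_hits: dict[str, list[float]], threshold: float = 0.75):
    ranked = []
    for doc_id, scores in chunk_hits.items():
        strong = [s for s in scores if s >= threshold]
        ranked.append({
            "doc_id": doc_id,
            "hits_above_threshold": len(strong),  # count-based signal
            "max_score": max(scores),             # best single chunk
            "mean_score": mean(scores),           # average over matched chunks
        })
    # e.g. prefer documents with more strong hits, then by best chunk score
    ranked.sort(key=lambda r: (r["hits_above_threshold"], r["max_score"]), reverse=True)
    return ranked
```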

Why Overlap is Important:

Imagine your window doesn’t overlap, and you chunk by sentences:

Sentence 1: “The cat sat.”

Sentence 2: “On the mat.”

A query for “cat on the mat” might not strongly match either embedding alone. With overlap, a chunk like “The cat sat. On the mat.” (or parts of it combined by a sliding window) would capture the full concept.

In Summary:

Single Embedding: Good for short texts or when you only care about the very broad, overall topic of a long text. Prone to missing nuanced or localized information, especially if similarity thresholds are strict.

Sliding Window with Overlap: Better for longer texts where specific details matter. It provides more granular matching, increasing the likelihood of finding relevant information. However, it necessitates a clear strategy for handling and aggregating multiple chunk matches from the same source document.