Are LLMs familiar with the concept of main clauses and sub-clauses?

Hello,
as described in earlier posts, I am comparing the grammatical abilities of LLMs against a rule-based parser.

Do LLMs know the concept of main clauses and sub-clauses? They perform very poorly, even GPT-4, when I ask questions about these concepts for a given sentence to analyze.

Or is it with these concepts as with every other concept: they are all understood only to a certain degree?

I had assumed that LLMs would know these concepts better, because LLMs are all about language.

1 Like

I could give a quick text answer, but that would not really answer your question or the ones that would follow.

Please watch this video in full, and a few times over. I did. It is from a world expert on the subject, concise and easy to follow with the proper background, which is why you may have to watch it a few times.

As I don’t know your level of expertise, I would also suggest that if you have other questions you watch other videos by Andrej Karpathy.

Note: It would really help if you noted your level of expertise. (ref)

1 Like

My level of expertise is hard to define. I am a physicist with a PhD in neurobiology. My interests are in logic and language in general. For the past two years I have been working as a computational linguist at the TU Munich.

I have read about the principles of LLMs in papers and have watched YouTube videos about them. But I do not have a clear idea of what they are capable of.

I had thought that language principles would be better understood by LLMs than other, arbitrary concepts.

2 Likes

@Eric
“I ask because I use Prolog daily and regularly create parsers for data, programming languages and Turing complete languages but not human text.”

What do you mean by “Turing complete languages”? Programming languages are usually Turing complete.

1 Like

Thanks for the info, it really does help.

That is one of the big questions. Many have found practical uses for them, but many of the possibilities are still unknown. It is also suspected that the more training data you give them and the more training they do, the more they are capable of doing. I can’t recall the specific paper that notes this; if I do, I will update here.


Now that I know your expertise I will try to give more specific answers.

I would not state that as a fact, but I would not fully disagree with it in a casual conversation. I will say that LLMs started with the problem of trying to translate English to German, as noted in the classic paper.

If you are familiar with the idea of neural networks and weights, and the different ways to connect the parts, that goes a long way.

So in the description of an LLM there is no notion of a clause, be it main clause or sub-clause, but somewhere in the connections and weights the ability to translate English to German emerges, with a quality better than current natural language processing based on software coding, such as done with Prolog. I only note Prolog because I use it every day; even as I write this I have an editor open for parsing a programming language using difference lists. There are many more ways to do natural language processing, as we know, and I am not trying to exclude them, but I don’t have detailed knowledge of them.

One user on this site whom you might reach out to is Profile - paul.tarau - OpenAI Developer Forum. I can’t say if he can or will help, but he has more knowledge than me on both logic programming and LLMs.

Note: The word emerges was used deliberately. There are many articles on it; this is just one example that gives you an idea of what is meant.


EDIT

I can’t recall the specific paper that notes this; if I do, I will update here.

When I wrote that, I was thinking of a very specific graph with axes. I did not see that graph in this document, but it is in the ballpark for now.

1 Like

Many programmers who parse text know of regular expressions, and fewer know how to implement a parser for well-formed text such as CSV files. One of the most widely known ways to classify such grammars is the Chomsky hierarchy.

So, to be more specific: creating parsers up to recursively enumerable (Type-0) grammars. Unrestricted grammar - Wikipedia

As I also noted, I program parsers with difference lists, which is another concept that very few know, much less use. After programming with them for a while it becomes easier to think of them as connections with pointers. If one then looks at how neural networks work, one notices that the weights on the edges rise and fall during training and at times may settle at almost exactly 0 or 1, which makes those connections deterministic, much like the parsers; one starts to see lots of similarities. While there are similarities, I will not say that one should draw hard conclusions from them, but they do make it easier to think about the ideas in both worlds and to exchange ideas between them.
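
For anyone who has not met difference lists before, here is a minimal sketch in Prolog (the grammar and vocabulary are invented for this post, not taken from my actual parsers). Each predicate takes the whole remaining input S0 and hands back the unconsumed rest S; the words consumed are the “difference” between the two lists:

% Difference-list grammar: each goal consumes a prefix of S0
% and leaves the unconsumed remainder in S.
sentence(S0, S)    :- noun_phrase(S0, S1), verb_phrase(S1, S).
noun_phrase(S0, S) :- det(S0, S1), noun(S1, S).
verb_phrase(S0, S) :- verb(S0, S1), prep_phrase(S1, S).
verb_phrase(S0, S) :- verb(S0, S).
prep_phrase(S0, S) :- prep(S0, S1), noun_phrase(S1, S).

% Toy lexicon, one fact per word.
det([the|S], S).
noun([dog|S], S).
noun([fence|S], S).
verb([jumped|S], S).
prep([over|S], S).

% ?- sentence([the, dog, jumped, over, the, fence], []).
% true.

This is what Prolog’s DCG notation compiles down to, and it is where the “connections with pointers” picture comes from: S0 points at the current position in the input and S at the position just after the parsed phrase.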

HTH

Yes. Here’s an example Q & A:

Break down this sentence into its grammatical structure, and tell me what are the parts (like noun, subject, prepositional phrases, etc):

“The dog jumped over the fence, and ran to the barn at the north side of the field, to chase the chickens who lived there because he is best friends with all of them.”

ANSWER:

Let’s break down the sentence into its grammatical components:

  1. The dog - Subject (noun phrase)

    • “The” is a definite article specifying a particular dog.
    • “dog” is a noun serving as the main subject of the sentence.
  2. jumped - Main Verb (past tense of “jump”)

    • “jumped” is the action that the subject (the dog) performs.
  3. over the fence - Prepositional Phrase

    • “over” is a preposition indicating the relationship between “jumped” and “fence.”
    • “the fence” is the object of the preposition “over,” which is a noun phrase, with “the” as the definite article and “fence” as the noun.
  4. and - Coordinating Conjunction

    • “and” connects two independent clauses (“The dog jumped over the fence” and “ran to the barn at the north side of the field”).
  5. ran - Conjoined Verb (past tense of “run”)

    • “ran” is the second action that the subject (the dog) performs.
  6. to the barn - Prepositional Phrase

    • “to” is the preposition indicating direction towards which the dog ran.
    • “the barn” is the object of the preposition “to,” which is a noun phrase.
  7. at the north side of the field - Prepositional Phrase

    • “at” is the preposition indicating the specific location of the barn.
    • “the north side” is the object of the preposition “at,” which is a noun phrase.
    • “of the field” is another prepositional phrase nested within the first, with “of” as the preposition and “the field” as the object.
  8. to chase the chickens - Infinitive Phrase (acting as the purpose of the dog running to the barn)

    • “to chase” is the infinitive form of the verb “chase.”
    • “the chickens” is the direct object of the verb “to chase,” which is a noun phrase.
  9. who lived there - Relative Clause (modifying “the chickens”)

    • “who” is a relative pronoun referring back to “the chickens.”
    • “lived” is the verb in the relative clause.
    • “there” is an adverb indicating the place where the chickens lived.
  10. because - Subordinating Conjunction (introduces a reason or cause)

    • “because” begins the subordinate clause explaining why the dog is chasing the chickens.
  11. he is best friends with all of them - Subordinate Clause

    • “he” is the subject pronoun referring back to “the dog.”
    • “is” is the linking verb.
    • “best friends” is a compound noun serving as the predicate nominative, which renames the subject “he.”
    • “with all of them” is a prepositional phrase with “with” as the preposition and “all of them” as the object, referring back to “the chickens.”

So, the full sentence structure is:

[Subject “The dog”] [Verb “jumped”] [Prepositional Phrase “over the fence”], [Coordinating Conjunction “and”] [Conjoined Verb “ran”] [Prepositional Phrase “to the barn”] [Prepositional Phrase “at the north side of the field”] [Infinitive Phrase “to chase the chickens”] [Relative Clause “who lived there”] [Subordinating Conjunction “because”] [Subordinate Clause “he is best friends with all of them”].

Thanks wclayf,
it is good to see how such an analysis goes in English. I do it for the German language. Perhaps LLMs perform better in English than in German because of the larger amount of training material.

I confront LLMs with ambiguous sentences; that is, I choose sentences that are grammatically unambiguous but whose correct analysis is not statistically obvious.

@Eric
I have not yet looked at all the sources you have posted.

On the parsing:
I think parsing formal structures as you do is similar to what I mean by a “rule-based” parser. But I have not fully understood “difference lists” yet.

The problem in natural language is that the individual elements in the string are ambiguous: some words belong to several word classes. My “trick” is to analyse the whole search space of all word-class combinations. For example, if five words in a sentence each have two possible word classes, I analyse 2^5 so-called interpretations of the sentence. The challenge is to sort out the false interpretations so that the right one remains.
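
Sketched in Eric’s Prolog terms (a toy example with an invented lexicon and invented predicate names, not my actual implementation), the enumeration step could look like this:

% Toy lexicon: each word with its candidate word classes.
word_classes(the, [det]).
word_classes(old, [adj, noun]).
word_classes(man, [noun, verb]).

% One interpretation = one choice of word class per word;
% backtracking enumerates all combinations.
interpretation(Words, Tags) :-
    maplist(pick_class, Words, Tags).

pick_class(Word, Tag) :-
    word_classes(Word, Candidates),
    member(Tag, Candidates).

% ?- findall(T, interpretation([the, old, man], T), Interps).
% Interps = [[det,adj,noun], [det,adj,verb],
%            [det,noun,noun], [det,noun,verb]].
% Two ambiguous words with two classes each give 2^2 = 4
% interpretations; a grammar check then filters out the false ones.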

This “trick” works better than one might think, and there is a paper of mine that describes the basic ideas.

So with this parser I can solve grammar problems that are grammatically unambiguous but statistically unlikely, as I wrote to wclayf.

Can someone give me a quick answer: which temperature values are used in Bing Chat?

The modes are named in German “kreativ” (creative), “ausgewogen” (balanced) and “genau” (precise). I guess that “creative” is something like 0.7, “balanced” around 0.3, and “precise” perhaps 0.0 or 0.1?

LLMs are trained on patterns in language.
A clause is a very specific pattern in language, specified legally as a clause, but understood in language as an “exception”
(I think)… (I’m not thinking that deeply into it).

So when it comes to an LLM’s training, it understands exceptions to a rule, or conditions.

So wording is king here.

If you ask the LLM to explain what a clause is, and then give it context, it will give you your answer.

Unless it’s not trained to be able to.

Anything more than that, I feel like I’m riding the line.

I think the grammatical understanding of text is implicitly encoded in the neural network. So it very likely produces text that is grammatically correct. But it is another question whether the neural network makes this understanding explicit.

wclayf shows such an analysis for the English language. I have tried asking such grammatical questions of older models (Davinci003, Curie001). The newer models are significantly better at such questions.

Perhaps grammar understanding is like understanding the concept of natural numbers: such abilities arise with larger amounts of training material.

Or grammar understanding is different, because the grammar is implicitly encoded in the network: when the machine guesses the next word, it knows which one fits grammatically.

After a bit of chatting with ChatGPT 3.5 on the topic of subordinate clauses and dependency grammar, I tried out a request for Mermaid diagram code that represents the structure of the sentence “The book that she lent me, which had a beautiful cover, was a bestseller”:

graph TD;
A(Main Clause: was a bestseller)
B(Subordinate Clause 1: that she lent me)
C(Subordinate Clause 2: which had a beautiful cover)
D(Noun Phrase: The book)
A --> B
A --> C
B --> D
C --> D

There’s no real diagram system that I could find for sentences with dependency and anaphoric reference, but the AI seems able to generate and understand one after talking it up.

1 Like

Hello _j,
I think the main clause is not correctly detected in your example. The main clause is:
“The book was a bestseller.”
“The book” is not a stand-alone noun phrase; it is the subject of the main clause.