Japanese Language Proficiency Test (JLPT) and o1

Today, I have been testing o1 and o1 pro mode’s performance on the Japanese Language Proficiency Test (JLPT) at the N1 level (roughly C1 on the CEFR), focusing on Question 6 (Sentence Composition).

There are 5 questions, each with four blanks, and one has to choose the expression that goes in the third blank. It is one of the trickier parts of the test because one has to build the sentence mentally and make sure it works both logically and grammatically. They are like small linguistic puzzles.

Here are the results I got for 5 trials:

  1. o1 failed / o1 pro mode correct
  2. o1 correct / o1 pro mode correct
  3. o1 failed / o1 pro mode correct
  4. o1 correct / o1 pro mode correct
  5. o1 correct / o1 pro mode correct

o1 got 3 out of 5, while o1 pro mode got all answers correct. I think this is another indicator of pro mode’s superiority over “normal” o1.

An interesting thing I noted is that the thinking time in pro mode varied widely, from 4 seconds to more than 3 minutes.

As an additional note, the 4o model fails lamentably at this question almost every time (except in rare cases).

Just posting here in case it sparks any interest.

How did o1 pro or the other models do on the rest of the test? It seems like you only tested one type of question.

Have you found any AI tool to be super useful for language learning, or is it still mainly dominated by human-to-human interaction?

I tested only one type of question because the test is very long and I don’t have the time to run the whole exam. The other questions are more about language knowledge than logic, so I expect all models to do well on those.

Question 6 requires both language knowledge and logic, which is why I chose it for this test.