Japanese Language Proficiency Test (JLPT) and o1

cristopherbonn4 · December 14, 2024, 7:47pm

Today, I have been testing how o1 and o1 pro mode perform on the Japanese Language Proficiency Test (JLPT) at N1 level (C1 level CEFR), particularly on Question 6 (Sentence Composition).

There are 5 questions with four blanks each and one has to choose the correct expression that goes in the third blank. It is one of the trickier questions in the test because one has to build the sentences mentally and has to make sure they make sense both logically and grammatically. They are like small linguistic puzzles.
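To show why these feel like puzzles, here is a rough sketch of the structure of one item. The fragment labels and the "correct" ordering are hypothetical placeholders, not taken from a real exam; a real item uses Japanese expressions, and deciding which ordering is grammatical is the hard part that the code does not attempt.

```python
from itertools import permutations

# Hypothetical placeholder fragments; a real item uses Japanese expressions.
fragments = ["A", "B", "C", "D"]


def third_blank(ordering):
    """Return the fragment that lands in the third of the four blanks."""
    return ordering[2]


# Every way of placing the four fragments into the four blanks.
candidates = list(permutations(fragments))
print(len(candidates))  # 24 possible orderings

# Hypothetical: suppose the only ordering that reads correctly is B, D, A, C.
correct_ordering = ("B", "D", "A", "C")
assert correct_ordering in candidates
print(third_blank(correct_ordering))  # A
```

So the solver has to find the one sensible ordering out of 24 and then report only the piece in the third slot.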

Here are the results I got for 5 trials:

o1 failed / o1 pro mode correct
o1 correct / o1 pro mode correct
o1 failed / o1 pro mode correct
o1 correct / o1 pro mode correct
o1 correct / o1 pro mode correct

o1 got 3 out of 5 while o1 pro mode got all answers correct. I think this is another indicator of pro mode’s superiority over “normal” o1.
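For anyone who wants to extend the comparison with more trials, a minimal tally of the results above could look like this. The data structure and the `score` helper are just my own illustration; the outcomes are the five listed above.

```python
# Outcomes transcribed from the five trials listed above (True = correct).
trials = [
    {"o1": False, "o1 pro mode": True},
    {"o1": True, "o1 pro mode": True},
    {"o1": False, "o1 pro mode": True},
    {"o1": True, "o1 pro mode": True},
    {"o1": True, "o1 pro mode": True},
]


def score(model):
    """Count how many trials the given model answered correctly."""
    return sum(t[model] for t in trials)


for model in ("o1", "o1 pro mode"):
    correct = score(model)
    print(f"{model}: {correct}/{len(trials)} ({correct / len(trials):.0%})")
# o1: 3/5 (60%)
# o1 pro mode: 5/5 (100%)
```

With only five trials the gap is suggestive rather than conclusive, so appending more rows to `trials` would give a firmer picture.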

An interesting thing I noted is that thought time (in pro mode) showed a large variance, from 4 seconds to more than 3 minutes.

As an additional note, the 4o model fails lamentably at this question almost every time (except in rare cases).

Just posting here in case it sparks any interest.

i.am.andrew.ong · March 2, 2025, 2:41am

How did o1 pro or the other models do on the rest of the test? It seems like you only tested one test question.

Have you found any AI tool to be super useful for language learning, or is it still mainly dominated by human-to-human interaction?

cristopherbonn4 · March 2, 2025, 1:17pm

I tested only one test question because the test is very long and I don’t have the time to run the whole exam. The other questions are more about language knowledge than logic, so I expect all models to do well on those.

Question 6 requires both language knowledge and logic, which is why I chose it for this test.
