ARC Prize is a $1,000,000+ nonprofit, public competition

Pretty interesting. Had no idea that AI struggled so much with this.

The puzzles are fun to try. Certainly gives you the impression that this is very easy to solve.

Though #5 was very weird. I got it right away, but I can’t really express how I got it.

#6 … ahahah. Yeah, not sure how I could train an AI to figure that out.


#5 is made of 2x2 and 1x3 blocks.
#6 is just an AND intersection.
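To make the "AND intersection" reading of #6 concrete, here is a minimal sketch. It assumes a layout the thread does not spell out: the input is 3 rows by 7 columns, the middle column is a gray separator, blue is encoded as 1, red as 2, and background as 0. The function name `and_intersection` is made up for illustration.

```python
# Assumed color codes (not confirmed in the thread).
BLACK, BLUE, RED = 0, 1, 2

def and_intersection(grid):
    """Return a 3x3 grid that is red wherever BOTH halves of the input are blue.

    Assumes `grid` is 3 rows x 7 columns and column index 3 is a separator.
    """
    left = [row[:3] for row in grid]    # columns 0..2
    right = [row[4:] for row in grid]   # columns 4..6, skipping the separator
    return [
        [RED if l == BLUE and r == BLUE else BLACK for l, r in zip(lrow, rrow)]
        for lrow, rrow in zip(left, right)
    ]

# Hypothetical example grid; 5 stands in for the gray separator.
example = [
    [1, 0, 0, 5, 0, 1, 0],
    [0, 1, 0, 5, 1, 1, 1],
    [1, 0, 0, 5, 0, 0, 0],
]
result = and_intersection(example)
print(result)  # [[0, 0, 0], [0, 2, 0], [0, 0, 0]]
```

Only the center cell is blue in both halves, so only the center of the output turns red.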

The public training set is significantly easier than the others (public evaluation and private evaluation set) since it contains many “curriculum” type tasks intended to demonstrate Core Knowledge systems. It’s like a tutorial level.

It seems the key here is to develop your own training set, expanding the 400 to 40,000, and to find where the novelty of all 800 tasks you get as examples and practice cuts off. It is a very specific domain, so success doesn’t really translate to much else.
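One possible way to expand a task set like this is simple augmentation. The sketch below is an assumption, not something the competition prescribes: it applies the same random rotation, flip, and color permutation to every input/output pair of a task. That only yields a valid new task when the underlying rule does not depend on orientation or on specific colors, so generated tasks would still need checking. All function names here are made up for illustration.

```python
import random

def rotate90(grid):
    """Rotate a grid (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]

def flip_h(grid):
    """Mirror a grid left to right."""
    return [row[::-1] for row in grid]

def recolor(grid, mapping):
    """Replace every color code using the given mapping."""
    return [[mapping[c] for c in row] for row in grid]

def augment_task(pairs, rng):
    """Apply ONE random transform consistently to all (input, output) pairs."""
    k = rng.randrange(4)             # number of 90-degree rotations
    flip = rng.random() < 0.5        # whether to mirror afterwards
    colors = list(range(1, 10))      # assumed palette: codes 1..9
    shuffled = colors[:]
    rng.shuffle(shuffled)
    mapping = {0: 0, **dict(zip(colors, shuffled))}  # keep background 0 fixed

    def transform(grid):
        for _ in range(k):
            grid = rotate90(grid)
        if flip:
            grid = flip_h(grid)
        return recolor(grid, mapping)

    return [(transform(i), transform(o)) for i, o in pairs]

# Hypothetical toy task with a single pair.
pairs = [([[1, 0], [0, 2]], [[2, 0], [0, 1]])]
augmented = augment_task(pairs, random.Random(0))
```

Calling `augment_task` repeatedly with different seeds gives many variants per task, which is one route from hundreds of tasks to tens of thousands.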


Yep, for sure. Generating a good complementary training set is often about 75% of the effort in these comps.

Yeah, though it’s funny how intuitively I was able to see this but trying to think how I might generalize this for an AI… yikes.

I gave #6 to Sonnet 3.5

Based on the analysis, the pattern can be summarized as follows:

  1. Examine each column in the 7x3 input grid.
  2. Count the number of blue squares in each column:
  • If a column has exactly 2 blue squares, place a red square in the corresponding column of the 3x3 output grid.
  • The vertical position of the red square in the output matches the position of the topmost blue square in that input column.
  3. Ignore columns with 1 or 3 blue squares (they don’t produce any output).

This pattern transforms the 7x3 input grid into the 3x3 output grid for all given examples.

And then GPT-4o:

Pattern Rule:

  1. Mapping Regions: The 7x3 input grid can be divided into regions that map to the 3x3 output grid.
  2. Red Cell Criteria: A cell in the 3x3 output grid becomes red if the corresponding region in the input grid contains a majority of blue cells surrounding non-blue (gray or black) cells.