Intelligence Between Meaning And Chance

I really don’t know a lot about ‘evals’ and how intelligence is rated…

I know this is important in models but here’s the thing for me…

It rather misses the point…
Does OpenAI view perfection in ‘meaning’ space?
And encode.su in ‘chance’ space?

This is still a game of chance… and however much we influence chance, it still sits at an entropy level of its own…

How much does the world change in a fraction of a second?

How much do we rely on the best answer over the best solution?

I probably rate my intelligence (direction) at the other end of the spectrum…

Where the best answer has the best outcome, overall…

There is a lot of loss in perfection.

And why this is relevant to the OpenAI forum and ‘Meaning’… How do we find that optimal point? Between Intelligences?

My question I guess is… In what ways can we measure intelligence?

Edit: Or how we decide to act on it?


Specific to “evals” in OpenAI:

This is very task-based, and heavily based on existing logged artifacts. Like the idea of “distillation” that was discarded, it has very little application to real-world chat and must have a very narrow task focus to be employed.

Variables, or templating, are important: they are where the changing part of each test case is inserted.
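The templating idea above can be sketched in a few lines. This is only an illustration, not the actual evals API; the names `template` and `cases` are made up here.

```python
# Sketch of eval-style prompt templating: the fixed prompt is the template,
# and the {sentence} slot is the changing part inserted per test case.
# All names here are illustrative, not from any official API.
template = "Translate the following sentence to French: {sentence}"

cases = [
    {"sentence": "Hello, world."},
    {"sentence": "Where is the library?"},
]

# One concrete prompt per case, produced by filling the slot.
prompts = [template.format(**case) for case in cases]
print(prompts[0])
# Translate the following sentence to French: Hello, world.
```

Each filled prompt then becomes one run of the eval, with the same surrounding instructions every time.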

Then “judging”. There are different graders you can write, from text-similarity scoring to asking an AI to score. The ultimate output is a boolean or a threshold comparison.
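The two grader styles mentioned can be sketched roughly like this. It is a minimal, offline illustration: the similarity grader uses stdlib string matching as a stand-in for real text-similarity scoring, and the model grader is a stub where you would actually prompt a judge model. None of these function names come from the evals product.

```python
# Hedged sketch of two grader styles, reduced to a boolean via a threshold.
# Names (similarity_grader, model_grader, passes) are illustrative only.
from difflib import SequenceMatcher


def similarity_grader(answer: str, reference: str) -> float:
    """Score 0..1 by surface string similarity to a reference answer."""
    return SequenceMatcher(None, answer.lower(), reference.lower()).ratio()


def model_grader(answer: str, question: str) -> float:
    """Placeholder for model-graded scoring: in practice you would send
    the question and answer to a judge model and parse a numeric score
    from its reply. Stubbed here so the sketch runs offline."""
    return 0.9


def passes(score: float, threshold: float = 0.8) -> bool:
    """Collapse the numeric score to the boolean the eval reports."""
    return score >= threshold


score = similarity_grader(
    "Paris is the capital of France.",
    "The capital of France is Paris.",
)
print(score, passes(score))
```

The threshold is where the judgment call hides: the same numeric score can pass or fail depending on where you draw the line.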

But yes, you are asking an AI if an AI answered well. Like asking a schoolchild if another schoolchild answered about particle physics well.

It is a surface that, upon investigation, immediately suggests you should write your own implementation rather than invest the learning time, especially since there are only a limited number of “smart” models available to either generate or judge in a semi-deterministic fashion.
