LLMs need to be judge-maxxing
2025-04-01
Claude said: "You've coined a perfect term! Judge-maxxing captures the essence of this approach brilliantly."
so don't even try to call my title bad.
we need four things to make machine intelligence: training algorithms, architecture, compute, and a signal.
transformers trained with gradient descent work for now.
compute will be taken care of by Nvidia and whoever scales fusion / fission.
the thing that is unclear is the signal.
most of it comes from data in a supervised manner (we teach the model to mimic it).
first problem: our data is running out.
second problem: our data is shit, and as long as it is generated by humans, our universal function approximator will never be better than us => no superintelligence :(
this is why I like RL: we can give rewards or penalties and create a signal that way.
the problem: we need to model an environment, since the models are not ready to roam the universe freely yet (nor should they, I think).
if we are talking about solving tasks (which is what LLMs are used for), the task has to be verifiable.
this means we should be able to define mathematically when a solution is good or bad.
unfortunately, most of our cognitive tasks are not simple enough to be modeled that way.
we don't have a mathematical formula f(text) that gives us, for example, cognitive-quality and creativity scores.
I believe that once we run out of data, the machines will have to stop learning like machines (simply following "prechewed" signals) and start learning more like humans (exploring and judging from their own first principles).
AIs criticizing their own outputs is called LLM-as-a-judge.
while it doesn't introduce any new knowledge, it can close the gap between what the model could do and what it actually does - which I think is HUGE right now.
as long as AIs feel dumber than we'd imagine a human who has studied all of our knowledge to be, that gap exists.
why does it work?
mostly because judging subtasks is easier than judging the full output, so we can increase the capability for complex thoughts by merging those judgments bottom-up.
we can also introduce something like a fuzzy metric / loss function using natural language by prompting the judge in a certain way (huge for alignment btw).
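to make this concrete, here is a minimal sketch of such a fuzzy metric in python. the openai client, the "gpt-4o-mini" judge model, the prompt wording and the 1-10 scale are all my own illustrative choices, not a fixed recipe.

```python
# minimal sketch: a fuzzy, natural-language metric via a judge prompt.
# assumptions: the openai python client, "gpt-4o-mini" as judge model and the
# 1-10 integer rubric are illustrative choices, not a specific known recipe.
import re
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

JUDGE_PROMPT = """you are a strict judge. rate the answer to the task below from 1 to 10
for correctness, clarity and creativity. reply with a single integer only.

task: {task}
answer: {answer}"""

def judge(task: str, answer: str) -> int:
    """turn the judge's natural-language verdict into a scalar 'fuzzy' score."""
    reply = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": JUDGE_PROMPT.format(task=task, answer=answer)}],
    ).choices[0].message.content
    match = re.search(r"\d+", reply)
    return int(match.group()) if match else 0

def best_of_n(task: str, candidates: list[str]) -> str:
    """close part of the capability gap by keeping only what the judge likes."""
    return max(candidates, key=lambda answer: judge(task, answer))
```

rewriting the rubric line is all it takes to change what the signal optimizes for, which is exactly why this matters for alignment.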
a very similar underlying principle is what makes reasoning work for deepseek's R1.
the only difference is that instead of verifying the answer mathematically, we bootstrap based on what I see as universal laws of reasoning (metacognition).
since, in theory, the AI already has all the "raw knowledge" to become a cognitive beast, I believe this should work.
so, how will this bring us to super-intelligence?
what we need to achieve is a recursive improvement loop.
and this is where it gets interesting.
by using a meta-judge (so a judge judging the judge) we can not only improve on the given task, but also teach the model how to learn.
a good signal not only tells you what is correct while you learn, it also guides you toward what you need to learn next.
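a rough sketch of what "judging the judge" could look like, reusing the client and model from the snippet above; again, the prompt wording and the 1-10 scale are my own assumptions.

```python
# rough sketch of a meta-judge: a second judge that critiques the first judge's
# verdict instead of the answer itself. reuses `client` from the sketch above;
# the prompt wording and the 1-10 scale are illustrative assumptions.
import re

META_JUDGE_PROMPT = """below is a task, an answer, and a judge's verdict on that answer.
judge the judge: is the verdict well reasoned, consistent, and focused on what
actually matters for this task? reply with a single integer from 1 to 10.

task: {task}
answer: {answer}
verdict: {verdict}"""

def meta_judge(task: str, answer: str, verdict: str) -> int:
    """score the quality of the judging itself, so the judging can improve too."""
    reply = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": META_JUDGE_PROMPT.format(
            task=task, answer=answer, verdict=verdict)}],
    ).choices[0].message.content
    match = re.search(r"\d+", reply)
    return int(match.group()) if match else 0
```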
I am fascinated by uncertainty-based approaches like curiosity learning, where the model is rewarded for seeking out, by itself, things it has not seen before.
this, combined with the judge's prediction of what needs to be learned based on importance and current performance, will be the way to go.
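as a toy illustration of how such a signal could be shaped: combine the judge's score with a count-based novelty bonus. the 1/sqrt(count) bonus and the beta weight are my own illustrative assumptions, not a specific published method.

```python
# toy sketch: add a curiosity-style novelty bonus to the judge's score.
# the count-based 1/sqrt(count) bonus and the beta weight are illustrative.
from collections import Counter

seen = Counter()  # how often each (coarse) kind of task has been visited

def novelty_bonus(task_signature: str) -> float:
    """reward facing things the model has rarely seen: decays as 1/sqrt(count)."""
    seen[task_signature] += 1
    return 1.0 / (seen[task_signature] ** 0.5)

def shaped_reward(judge_score: float, task_signature: str, beta: float = 0.5) -> float:
    """'how good was the answer' plus beta times 'how new was the territory'."""
    return judge_score + beta * novelty_bonus(task_signature)
```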
I believe we will max out all the "human made" things like reasoning and cognition in general - like I said: this is not a question of more data from the outside but of rearranging what is already on the inside.
there will be a specific prompt, which will be the first one to get the ball rolling.
at some point the only way to learn more is to get information from the outside.
at this point, the model should be good enough to be able to run experiments autonomously.
☘