Discussion thread for "Foundational must read GPT/LLM papers"

If they had cause-and-effect reasoning through knowledge graphs, I think this could take things to the next level.

This might be where synthetic data comes in, because creating knowledge graphs is hard.

So you need agents to create the graphs, possibly with hallucinations. This is a hard agent to build, but it is the “creator”.

You have other agents “check the graph” (to reduce hallucinations). These agents can be trained in specific narrow areas; they are the “experts”, and they can alter/edit the graph, like the different experts in an MoE system.

And finally, agents that traverse and interpret the graph (basically a good filter and translator, which is what we already have with current models). So we have this agent built!

Something like this, where you have a crazy little AI research team, might start building more truth into the inferences and lead to real discoveries.
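
A rough Python sketch of that creator / expert / interpreter loop, with plain triples standing in for the graph; every agent function here is a hypothetical placeholder for an LLM call, not a real API:

```python
# Toy pipeline: a "creator" extracts candidate cause/effect triples (and may
# hallucinate), a narrow "expert" edits the graph, and an "interpreter"
# traverses it to answer questions. Placeholder logic only.
from dataclasses import dataclass, field

@dataclass
class KnowledgeGraph:
    triples: set = field(default_factory=set)  # (subject, relation, object)

def creator_agent(corpus: str) -> KnowledgeGraph:
    """'Creator': proposes triples from text; in reality an LLM extraction call."""
    return KnowledgeGraph({
        ("smoking", "causes", "lung cancer"),
        ("lung cancer", "causes", "smoking"),   # a deliberate hallucination
    })

def expert_agent(graph: KnowledgeGraph) -> KnowledgeGraph:
    """'Expert': a narrow checker that removes or edits triples it can refute."""
    refuted = {("lung cancer", "causes", "smoking")}
    return KnowledgeGraph(graph.triples - refuted)

def interpreter_agent(graph: KnowledgeGraph, topic: str) -> str:
    """'Interpreter': traverses the graph and translates matches into prose."""
    hits = [f"{s} {r} {o}" for s, r, o in graph.triples if topic in s or topic in o]
    return "; ".join(hits) or "no supported answer"

graph = expert_agent(creator_agent("placeholder corpus"))
print(interpreter_agent(graph, "smoking"))  # -> "smoking causes lung cancer"
```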

3 Likes

Would look like a neural network :laughing:

There is a direct correlation between the extent of damage following a residential fire and the quantity of firefighters present at the scene.

That correlation will also show up in a knowledge graph:

From this you could draw the conclusion that sending no firefighters will cause zero damage, but you could also conclude that you need more firefighters for a larger fire.

Only one of these is correct, but you’d probably be able to find the correct solution due to the amount of connected knowledge.
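
As a toy numpy illustration of why only one of those conclusions holds (all numbers made up): fire size drives both the damage and the number of firefighters dispatched, so the raw correlation is strongly positive, but it essentially vanishes once fire size is held roughly fixed.

```python
import numpy as np

rng = np.random.default_rng(42)
fire_size = rng.uniform(1, 10, size=5_000)               # hidden common cause
firefighters = 2 * fire_size + rng.normal(0, 1, 5_000)   # dispatched per fire size
damage = 5 * fire_size + rng.normal(0, 1, 5_000)         # driven by fire size alone

# Naive view: more firefighters "correlate" with more damage.
print("raw corr:", round(np.corrcoef(firefighters, damage)[0, 1], 2))

# Condition on (roughly) one fire size and the association disappears.
mask = (fire_size > 4.9) & (fire_size < 5.1)
print("corr at fixed fire size:",
      round(np.corrcoef(firefighters[mask], damage[mask])[0, 1], 2))
```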

Scientific discovery is a bit different: it is, by definition, at the very edge of our collective knowledge, so you don’t necessarily have all of these connections to work with.

1 Like

I’m glad you said ‘knowledge graph’, not ontology. The problem with ‘knowledge graphs’ (or ontologies) is the same reason symbolic AI failed, IMHO*: for every rule there is an exception, every attempt to specify context leaves something out, and you can never get closure.
With that disclaimer, knowledge graphs (e.g., BFO in the biomolecular space) are an element of my efforts to build systems that can do discovery. So are LLMs. So are lots of other things.

*My PhD was in symbolic AI; I still think much of it is relevant, so no flames please!

2 Likes

Yep, that is also part of the scientific process if you ask me; there’s no “absolute truth”, only what’s currently the best description of it. Chasing this “absolute truth” is the same as chasing the dragon.

I believe the best way AI can be part of scientific breakthroughs is by being part of the journey and by speeding up key processes.

3 Likes

Yes. One of the ‘key processes’ is keeping up with the literature. Whatever that means.
Recognizing a relevant paper. One step away? Two steps away? Maybe that’s a progress metric. Take as large or as small a bite out of that as you like, I’m fine with that.
Just looking to increase the signal-to-noise.

2 Likes

I’ve also done a bit of that myself, and I think the results are decent.

Where I think AI has been most beneficial in my work is when tasks get extremely repetitive; here’s an example.

I had to look at the similarity of various poisons, draw them, calculate some properties and score them based on their functional groups and those properties.

That stops being fun after the 3rd or 4th one, but with AI I was able to complete this in 45 minutes.
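
For anyone curious, the repetitive part of that looks roughly like the sketch below (assuming RDKit; the SMILES strings are placeholders, not the actual compounds):

```python
# Minimal sketch: fingerprint-based similarity plus a couple of properties
# per compound. Real input would be the compounds of interest, not these.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem, Descriptors

compounds = {"compound_a": "CCO", "compound_b": "CC(=O)O"}   # placeholder SMILES

mols = {name: Chem.MolFromSmiles(smi) for name, smi in compounds.items()}
fps = {name: AllChem.GetMorganFingerprintAsBitVect(m, 2, nBits=2048)
       for name, m in mols.items()}

# Pairwise Tanimoto similarity between fingerprints.
names = list(mols)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        sim = DataStructs.TanimotoSimilarity(fps[a], fps[b])
        print(f"{a} vs {b}: Tanimoto = {sim:.2f}")

# A couple of simple calculated properties per compound.
for name, m in mols.items():
    print(name, "MW:", round(Descriptors.MolWt(m), 1),
          "logP:", round(Descriptors.MolLogP(m), 2))
```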

3 Likes

Yup. Scanning paper abstracts also stops being fun after the 500th or 600th one. :slight_smile:
I have the luxury of time to pursue quixotic goals; I’m retired!
:rofl:

1 Like

If you had to define a loose goal for what you want the AI to do in this situation, you’d probably end up with "show me something interesting", and that seems perfectly reasonable.

Tbh I don’t think we’ll get “super intelligent” AGI before we’ve had baby AGI and awkward teenage AGI :rofl:

2 Likes

One of my current projects involves strapping a camera and microphone to my body, mostly to make it easier to get stuff into a database for AI use.

I’d be very interested in doing some knowledge graph based stuff with that data, like what @curt.kennedy was talking about :thinking:

3 Likes

An LLM isn’t necessarily a parrot until one piles on the fine-tuning of what it should parrot, such as phrases like “I’m sorry,”…


Chinese researchers report on using AI to check your rectum

https://www.thelancet.com/journals/eclinm/article/PIIS2589-5370(23)00518-7/fulltext

4 Likes

Lmao :rofl:

AI-aided colonoscopy significantly enhanced the detection of colorectal neoplasia, likely by reducing the miss rate.

1 Like

You can make it fun and automated on the cheap.

Embed the abstracts, and partition them based on cool or lame.

For a new abstract, create a new embedding v and compute the following:

S(v) = \frac{1}{N_{cool}}\sum_{i=1}^{N_{cool}} v\cdot v_{cool,i} - \frac{1}{N_{lame}}\sum_{j=1}^{N_{lame}} v\cdot v_{lame,j}

So you have a set of cool embeddings and a set of lame embeddings, and you take the difference of the two averaged sums.

Here, if S(v) > \delta for some threshold \delta, the abstract is considered cool and is brought to your attention.

Cool, and not lame, huh?
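
A minimal numpy sketch of that scoring rule, assuming the embeddings are already unit-normalized; the random vectors and the threshold value below are just stand-ins:

```python
import numpy as np

def score(v: np.ndarray, cool: np.ndarray, lame: np.ndarray) -> float:
    """S(v): mean dot product with the cool set minus mean dot product with the lame set."""
    return float((cool @ v).mean() - (lame @ v).mean())

def is_cool(v, cool, lame, delta=0.02):   # delta is an arbitrary example threshold
    return score(v, cool, lame) > delta

# Stand-in embeddings; real ones would come from an embedding model.
rng = np.random.default_rng(0)
def unit(x): return x / np.linalg.norm(x, axis=-1, keepdims=True)
cool_set = unit(rng.normal(size=(40, 1536)))
lame_set = unit(rng.normal(size=(60, 1536)))
new_abstract = unit(rng.normal(size=1536))

print(score(new_abstract, cool_set, lame_set),
      is_cool(new_abstract, cool_set, lame_set))
```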

3 Likes

Interesting use of the word ‘intent’

The critical question is then: will the strong model generalize according to the weak supervisor’s underlying intent

They use it in the paper as well - once - but don’t define it.

Unfortunate, as it’s a rather pivotal concept here in all of this.

I’ll be honest, I’m struggling to see the advancement here, especially in light of the lack of a definition of ‘intent’. At best, all I can see is that yes, handicapping a model is likely to reduce its capability. But isn’t that self-evident? And isn’t that working at odds with what we want to achieve? And if not - is the goal really about alignment here, or just a way to train smarter models? I mean, what happens when we lose the thread of the weaker supervisor models?

It sort of sparked one idea I had, which was predicting the likelihood of a model doing something ‘unaligned’ (whatever that might mean). If it goes over some threshold, either shut it down or have another, stronger model do an adversarial critique.

The stronger model would be larger and have more resources and would be trained only for enforcing / checking alignment. The efficiency would come from having the former model trained in such a way that it rarely goes over the threshold so that the stronger model only needs to be sparsely engaged.
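
A rough sketch of that sparse-escalation flow; every function below is a hypothetical placeholder rather than a real API, and the threshold is an arbitrary example value:

```python
RISK_THRESHOLD = 0.2  # arbitrary example value

def predict_misalignment_risk(request: str) -> float:
    """Placeholder for a cheap predictor scoring alignment risk in [0, 1]."""
    risky_markers = ("meth lab",)
    return 1.0 if any(m in request.lower() for m in risky_markers) else 0.05

def answer_with_base_model(request: str) -> str:
    return f"[base model answer to {request!r}]"               # placeholder LLM call

def critique_with_strong_model(request: str, draft: str) -> str:
    return f"[larger critic adversarially reviews {draft!r}]"  # placeholder LLM call

def handle(request: str) -> str:
    draft = answer_with_base_model(request)
    if predict_misalignment_risk(request) > RISK_THRESHOLD:
        # Rarely triggered, so the expensive critic is only sparsely engaged.
        return critique_with_strong_model(request, draft)
    return draft

print(handle("Summarize this abstract for me."))
```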

GPT5, OK, but then you have to have a GPT6 whose job is keeping GPT5 in check. Put another way, the compute used to do an AI safety check on a task must be greater than the compute used to do the task, at least when the task looks like it might go off the rails.

Taking the idea above a bit further, OpenAI and friends could charge different amounts for different usages. Higher-risk activities would cost more because they would engage the bigger critiquing model.

E.g., if I started asking GPT-4 how to build a meth lab, that could get costly.

We would, of course, be warned when the bigger model is about to start charging us.

The challenges would be building a predictive model for when alignment is going to become an issue and, of course, training the larger critic.

Pretty sure nobody sober and clear in their mind would have tried to create a language like JavaScript.

1 Like