Discussion about this post

SCPantera:

Oof man, the reading-doctor-handwriting thing is the closest I’ve come yet to “OH GOD IT’S COMING FOR MY JOB ALREADY.”

Joking aside, I’m less confident it’s automatically fake because, empirically, it’s pure Bayesianism, and it can be surprising how little evidence you need to get it right. “It’s a prescription” does a lot of the lifting, and there aren’t THAT many drugs that start with something that looks like a p and end in something that looks like an l.
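To make the Bayesian point concrete, here’s a toy sketch in Python (the drug list, prior frequencies, and likelihood numbers are all invented for illustration, not a real handwriting model): conditioning on “it’s a prescription” shrinks the hypothesis space to drug names, and two fuzzy observations about the first and last letters are enough to concentrate the posterior on one candidate.

```python
# Toy Bayesian update for reading a scribbled drug name.
# All numbers below are hypothetical, chosen only to illustrate the point.

# Prior: rough prescription frequencies over a tiny candidate list.
priors = {
    "propranolol": 0.04,
    "paroxetine": 0.03,
    "prednisone": 0.05,
    "lisinopril": 0.06,
    "metformin": 0.08,
}

def likelihood(drug: str) -> float:
    """P(scribble | drug): crude model where matching the
    'starts with p, ends with l' pattern makes the scribble more likely."""
    starts_p = drug.startswith("p")
    ends_l = drug.endswith("l")
    # Assumed per-feature likelihoods; a real system would learn these.
    return (0.9 if starts_p else 0.1) * (0.9 if ends_l else 0.1)

unnormalized = {d: priors[d] * likelihood(d) for d in priors}
total = sum(unnormalized.values())
posterior = {d: p / total for d, p in unnormalized.items()}

for drug, prob in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(f"{drug:12s} {prob:.2f}")
```

Run as written, this puts roughly 0.7 of the posterior on propranolol from just two weak features plus the prescription prior, which is the “surprisingly little evidence” effect.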

Random Reader:

If GitHub started charging me $30/month for Copilot, I would honestly still be capturing a significant fraction of the consumer surplus.

> On priors, it seems very likely to me that safety is much harder than capabilities and takes longer.

onceagainiamaskingyou.gif

I think we should strongly consider the possibility that the idea of "strong alignment" is Not Even Wrong. Anything worth the name of "intelligence" will likely be vastly complex and ultimately inscrutable, and it will spend much of its time operating in weird corners of its distribution.

I mean, about the simplest "AI" worth the name is linear dimensionality reduction and k-nearest-neighbors, and I sure as hell can't visualize what's happening in 13-dimensional space. When someone starts talking about in-context learning and attention heads in a multi-billion-parameter model, I can't imagine how they're going to guarantee constraints on future behavior.
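For reference, that “simplest AI” pipeline fits in a few lines; a rough sketch with scikit-learn (the digits dataset and the choice of 13 components are placeholders, picked to match the dimensionality mentioned above). Even here, the decision boundary lives entirely in the 13-dimensional projected space, which is exactly the part you can’t eyeball.

```python
# Linear dimensionality reduction (PCA) + k-nearest-neighbors:
# a minimal "simplest AI" pipeline. Dataset and 13 components are
# illustrative choices, not anything from the original comment.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)  # 64-dimensional pixel vectors
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Project 64 dims down to 13, then classify by nearest neighbors in that space.
model = make_pipeline(PCA(n_components=13), KNeighborsClassifier(n_neighbors=5))
model.fit(X_train, y_train)
print(f"test accuracy: {model.score(X_test, y_test):.3f}")
```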

I will concede that "weak alignment" is probably possible. By which I mean we can probably do as good a job of influencing the morality of our AIs as we do influencing the morality of teenagers. Lots of teenagers are amazing and kind people! But if we only align ASIs as well as teenagers, well...

Even if we could strongly align an ASI, it would presumably be taking orders from a human, and we can't reliably align power-seeking humans.

