> Anyone who is fully funded and orders Domino’s Pizza is dangerously misaligned.

Agree, but he was clearly optimizing for nostalgia.

(Also, the SBF book review really is Zvi's best work, for those who've scrolled straight to the comments)

My kids actively prefer Domino's and it makes me sad :(

But on the other hand, Domino's is a very cheap way to make them happy, so *shrug emoji*

Yes it is.

Yikes, Eureka does sound terrifying. And it’s open source... what could go wrong?

Not happy about Eureka, but it is good to see more and more people being willing to shout it from the rooftops that we shouldn't kill ourselves.

What a concept, huh.

I suppose I'm not sure exactly what the human line is, but it seems to me that there's nearly always alpha in betting on AI news flying in the face of whatever Gary Marcus said most recently:

https://arxiv.org/pdf/2310.16045.pdf

Just want to say I am here for the radio-only-HHGTTG references

I was drinking coffee when I read it. :-)

Missed that one, care to point to it for us mere book-readers?

I wanted to add that you have provided me with so much value that I'm definitely upgrading to paid, and would suggest that to others as well.

Do you have any thoughts or information about AI-safety related charitable giving? Are there any places to give that aren't doing work that can easily become capabilities? Has there been any effort among EAs/rationalists to try and pool resources for something like this (I'm not including something like OpenAI, which I think is disqualified by the capabilities issue)?

author

EAs/rationalists contribute to many places that work for safety. Some are more dual use than others, all technical work is non-zero dual use unfortunately. If you want what I think is a relatively safe-from-backfire and also relatively high-EV play, I would suggest Orthogonal (https://orxl.org/). High chance of no effect, but low chance of big positive payoff and very low risk of backfire, and I have confirmed they are the real thing.

For ML-focused technical work there are many reasonable choices, but also if you're really good and I know about you, you're probably funded already. Doing good research on which other orgs are good is great.

Advocacy/policy/etc organizations are another option of course.

Query: was there any grassroots activism in support of the original SALT treaties? SAILT currently looks like a prerequisite for effective shut-it-all-down coordination.

I've created a market on when we'll enter an "AI Winter": https://manifold.markets/nsokolsky/in-which-year-would-zvi-confirm-tha

> Perhaps we should not build such AIs, then?

I think we are one specific medium-sized breakthrough away from having AIs that quite plausibly count as "alive" and "sentient", and with potential for "sapient" and "conscious" (whatever that means). I doubt I'm the only one who sees this. I think it's more likely to initially come from the open source community, rather than the big labs (although I don't know what they're doing internally). I expect that it will happen well before we get AIs that are powerful enough to be threats. And at that point, we will want to have been thinking for some time about the ethics of how to treat them.

From what you say here and in the last AI roundup, my guess is that you view this problem (which I'll call ETA, for Ethical Treatment of AI) as having substantial overlap with extinction risk, such that we're unlikely to encounter ETA without also encountering extinction risk, and thus we can largely ignore ETA because our survival as a species will be at stake. I think this is incorrect, and that we are likely to have to deal with ETA first, and that if the issue isn't addressed ahead of time, humans (and potentially AIs) may choose courses of action that make our extinction more likely.

I wonder about your blanket objection to the “superhuman persuasion isn’t a thing” objection. I think there’s a narrowish way in which it could make sense, and your examples of high-level human persuaders point to it. Dictators, cult leaders, founders of major religions: all of these categories, I suspect, contain a high number of people whose persuasive ability is *embodied* in a way we don’t fully understand. I’ve heard enough reports of what it’s like to be in a room with eg Bill Clinton, and have actually experienced something not dissimilar with at least one much less famous and powerful person, to be convinced that there really is some mysterious form of charisma that requires physical proximity to the person in question. I imagine Jesus and Mohammed would both have had this quality to a high degree. (Hitler had it, by all accounts, and his speeches, on film or radio, were famously *physically* performative -- you can practically see the endocrinal inputs to those gestures -- whereas Mein Kampf is notoriously unreadable.) And I don’t think any plausible near- to medium-term AI would be able to replicate it. This doesn’t mean that future AIs won’t be able to be superhuman at other forms of persuasion, or that weird movements in the “religion” space that we very much won’t like aren’t on the cards; but it does mean that the AI persuaders will lack a significant string to their bows.

Tldr: INT does not funge into CHA!

author

Even if it turns out physical presence is central to this, and VR/AR doesn't do it, and audio alone doesn't do it, they will simply... get physical presence, then? We will soon have the technology for that, too.

My guess is that other things will more than compensate for that even if various hacks/tricks/strategies I expect to work turn out not to work at all (and if they work, ho boy).

I’d respond that if we’re at the point where they can “simply” get physical presence in the sense I’m talking about, and they’re not robustly aligned, then persuasion is the least of our worries...

“Thompson is focused on military AI and consumer applications here, rather than thinking about foundation models, a sign that even the most insightful analysts have not caught up to the most important game in town” - this made me think once again of a question that I still haven’t seen a great answer to. How are we certain that AGI is coming soon or at all? How do we know that we are not on the verge of a plateau? I have mostly assumed that AGI will arrive soon, but I still don’t have any logical arguments as to why this is necessarily the case.

It's a fair question. I haven't seen any argument that I find convincing that says AGI will definitely come, on any particular near to mid term timeline. Many people have an *intuition* that this will occur, which I share, and have written up here: https://amistrongeryet.substack.com/p/get-ready-for-ai-to-outdo-us-at-everything. But I haven't seen anything I'd call a proof. The various "biological anchor" estimates are far from being proofs.

Edit to add: I think some people believe that current LLMs are nearly there, and another round or two of scaling is all that's needed to cross the line. I do not share this view, but people who act as if it's a settled point that AGI is very near are probably mostly working from this idea.

It's not necessarily the case, but since it's the explicit goal of the leading labs and there's no slowdown currently happening, it would seem extremely reckless to assume that a plateau will conveniently arrive in time!

I fully agree with you that such an assumption is dangerous

“The effect is still big but GPT-4 seems better at this than the other models tested.” Is this a result of simply more compute, or was GPT-4 fine-tuned/RLHF'd differently?

I wouldn't have generated the "persuasion > intelligence" claim on my own, but having seen it I find it immediately compelling. These are language models we're talking about, after all, and *language itself* is centrally a tool of persuasion, not intelligence. Words are for storing and transferring ideas, not generating them.

(There's a broader version of this observation that applies not just to AI but also to human rationality: "Reason" is more naturally framed as the mind's PR department, not its R&D department. For this reason we should model superhuman persuasiveness more like a grandstanding "public intellectual" and less like a wild-eyed cult leader.)

That's not to say that an AI trained on ~all text ever won't have to develop some shape-rotation capabilities to minimize its loss function, just that it seems inherently hard to get it to follow those capabilities consistently.

Jim Fan has now achieved definitive status as a horseman of the apocalypse.

Amazing validation of SBF’s “default to yup” conversational strategy in that Understanding Human Preference Data chart. Just agree and people like you, lmao

Am also incredibly disappointed that so few people have anything smart to say about the Techno-Optimist Manifesto. It has so many issues, but the culture-war commentariat has dumbed down the critique. Solana is usually great but took the easy route by mocking the irrelevant press reactions.

I quite appreciated the subsection links in the table of contents in previous editions — would be grateful to see them again for quicker navigation.
