There is much talk about so-called Responsible Scaling Policies, as in what we will do so that what we are doing can be considered responsible. Would that also result in actually responsible scaling? It would help. By themselves, in their current versions, no. The good scenario is that these policies are good starts and lay groundwork and momentum to get where we need to go. The bad scenario is that this becomes safetywashing, used as a justification for rapid and dangerous scaling of frontier models, a label that avoids any actual action or responsibility.
> Anyone who is fully funded and orders Domino’s Pizza is dangerously misaligned.
Agree, but he was clearly optimizing for nostalgia.
(Also, the SBF book review really is zvi's best work, for those who've scrolled straight to the comments)
Yikes, Eureka does sound terrifying. And it’s open source... what could go wrong?
Not happy about Eureka, but it is good to see more and more people being willing to shout it from the rooftops that we shouldn't kill ourselves.
What a concept, huh.
I suppose I'm not sure exactly what the human line is, but it seems to me that there's nearly always alpha in betting on AI news flying in the face of whatever Gary Marcus said most recently:
https://arxiv.org/pdf/2310.16045.pdf
Just want to say I am here for the radio-only-HHGTTG references
I wanted to add that you have provided me with so much value that I'm definitely upgrading to paid, and would suggest that to others as well.
Do you have any thoughts or information about AI-safety related charitable giving? Are there any places to give that aren't doing work that can easily become capabilities? Has there been any effort among EAs/rationalists to try and pool resources for something like this (I'm not including something like OpenAI, which I think is disqualified by the capabilities issue)?
I've created a market on when we'll enter an "AI Winter": https://manifold.markets/nsokolsky/in-which-year-would-zvi-confirm-tha
> Perhaps we should not build such AIs, then?
I think we are one specific medium-sized breakthrough away from having AIs that quite plausibly count as "alive" and "sentient", and with potential for "sapient" and "conscious" (whatever that means). I doubt I'm the only one who sees this. I think it's more likely to initially come from the open source community, rather than the big labs (although I don't know what they're doing internally). I expect that it will happen well before we get AIs that are powerful enough to be threats. And at that point, we will want to have been thinking for some time about the ethics of how to treat them.
From what you say here and in the last AI roundup, my guess is that you view this problem (which I'll call ETA, for Ethical Treatment of AI) as having substantial overlap with extinction risk, such that we're unlikely to encounter ETA without also encountering extinction risk, and thus we can largely ignore ETA because our survival as a species will be at stake. I think this is incorrect, and that we are likely to have to deal with ETA first, and that if the issue isn't addressed ahead of time, humans (and potentially AIs) may choose courses of action that make our extinction more likely.
I wonder about your blanket objection to the “superhuman persuasion isn’t a thing” objection. I think there’s a narrowish way in which it could make sense, and your examples of high-level human persuaders point to it. Dictators, cult leaders, founders of major religions: all of these categories, I suspect, contain a high number of people whose persuasive ability is *embodied* in a way we don’t fully understand. I’ve heard enough reports of what it’s like to be in a room with e.g. Bill Clinton, and have actually experienced something not dissimilar with at least one much less famous and powerful person, to be convinced that there really is some mysterious form of charisma that requires physical proximity to the person in question. I imagine Jesus and Mohammed would both have had this quality to a high degree. (Hitler had it, by all accounts, and his speeches, on film or radio, were famously *physically* performative -- you can practically see the endocrinal inputs to those gestures -- whereas Mein Kampf is notoriously unreadable.) And I don’t think any plausible near- to medium-term AI would be able to replicate it. This doesn’t mean that future AIs won’t be able to be superhuman at other forms of persuasion, or that weird movements in the “religion” space that we very much won’t like aren’t on the cards; but it does mean that the AI persuaders will lack a significant string to their bows.
“Thompson is focused on military AI and consumer applications here, rather than thinking about foundation models, a sign that even the most insightful analysts have not caught up to the most important game in town” - this made me think once again of a question that I still haven’t seen a great answer to. How are we certain that AGI is coming soon, or at all? How do we know that we are not on the verge of a plateau? I have mostly assumed AGI will arrive soon, but I still don't have any logical argument as to why this is necessarily the case.
“The effect is still big but GPT-4 seems better at this than the other models tested.” Is this a result of simply more compute, or was GPT-4 fine-tuned/RLHF'd differently?
I wouldn't have generated the "persuasion > intelligence" claim on my own, but having seen it I find it immediately compelling. These are language models we're talking about, after all, and *language itself* is centrally a tool of persuasion, not intelligence. Words are for storing and transferring ideas, not generating them.
(There's a broader version of this observation that applies not just to AI but also to human rationality: "Reason" is more naturally framed as the mind's PR department, not its R&D department. For this reason we should model superhuman persuasiveness more like a grandstanding "public intellectual" and less like a wild-eyed cult leader.)
That's not to say that an AI trained on ~all text ever won't have to develop some shape-rotation capabilities to minimize its loss function, just that it seems inherently hard to get it to follow those capabilities consistently.
Jim Fan has now achieved definitive status as a horseman of the apocalypse.
Amazing validation of SBF’s “default to yup” conversational strategy in that Understanding Human Preference Data chart. Just agree and people like you, lmao
Am also incredibly disappointed that so few people have anything smart to say about the Techno-Optimist Manifesto. It has so many issues, but the culture-war commentariat has dumbed down the critique. Solana is usually great, but took the easy route by mocking the irrelevant press reactions.
I quite appreciated the subsection links in the table of contents in previous editions — would be grateful to see them again for quicker navigation.