If it *is* hysterical BS -- maybe it is, I certainly hope so -- it would be nice if the person pushing back was making any attempt to rebut it, rather than, you know, just being an angry asshole.
He did make an attempt. You either didn't read it or were lazy about it.
He's made other attempts, two of which I've posted here: https://alfredmacdonald.substack.com/p/do-ai-experts
It's very clear what Yann thinks from this exchange, and what his argument is; if you can't see that you need to stop making LessWrong your primary reading source.
You can assert what’s “very clear” and make assumptions about me all you like; doesn’t change the way LeCun came across above. Yudkowsky came across pretty arrogant, impatient, contemptuous and up himself, by the way... so par for the course, for him... but I did get the impression he was trying to have a substantive conversation in a way that LeCun wasn’t. If his ideas are beneath contempt, it should be easier than this for someone with so very many citations to show it.
Consider that he did that, and that his doing so went over your head.
For example, here is Bret Weinstein vs. Richard Dawkins: https://www.youtube.com/watch?v=hYzU-DoEV6k
The comments on YouTube: "wow, it's awesome to see two great minds come together to debate like this..."
The comments by Bret Weinstein fans: "Dawkins didn't even respond to some of what he said!!"
And the comments when this is posted on more informed venues are more frequently about how Dawkins either couldn't have responded to Weinstein because his arguments were nonsensical, or would have to engage in outside-their-wheelhouse speculation, or some other more knowing indication that Dawkins just knows vastly more about this and was thrown off by the incoherence of some of Weinstein's remarks.
If I need to spell this out, Yudkowsky is the analogue for Weinstein here.
It’s striking that you continue to indulge in lofty snark while side-stepping the actual exchange that we’re talking about. If I need to spell this out, you are the analogue for LeCun here. But seriously, I would love to be convinced that LeCun really did own Yudkowsky here, and we have nothing to worry about. I’m receptive to the idea, as I think I made clear in my original post. If you can show me how he did it, rather than tell me, “He did but you’re too stupid to spot it,” do please go ahead.
Meh. One of the important lessons I have learned from the Trump years is that pushing back against bullshit with bullshit from the other side only makes things worse.
All Yann had to do was make one valid argument. He didn't do that, which makes it much more likely that what Yudkowsky says isn't bullshit. If it's bullshit, it should be trivially easy to refute. Instead, he resorts to irrelevant crap like "you're depressing teenagers."
Yeah, I don’t think what Eliezer says is bullshit either (though perhaps overconfident). I was trying to make the point that even if you thought EY was spouting nonsense, Yann’s response is worse than useless.
Like that's a reason to stop discussing the possible fate of the world . . . that it "depresses teenagers." Geez. And this guy's the Chief AI Scientist at Meta? Figures -- look whom he works for.
I agree with you that "catastrophizing" is the wrong way to go. Teenagers already have a problem with that, and climate catastrophizing is a perfect example. Not to "get spiritual" on you, but the tendency to catastrophize has got to be balanced with daily gratitude. "What are the three good things that happened to me yesterday" should balance with the "What are the fucked-up things?"
But as EY points out--avoiding the AI danger does not solve it. It reminds me of a girlfriend who said, "But doesn't belief in God make one happier?" I'm more interested in the truth and working with the truth, not trying to hide from it within the bliss of ignorance.
I'm not going to avoid telling teenagers about the risks of Super-AI so they continue the joy of believing in Santa Claus. I don't believe that is maturing and adaptive.
But Yann doesn't push back intelligently. He's really lame.
Yeah, when I think of places known for being not-lame, I think of LessWrong. Maybe Yann should join a Berkeley polycule, with an inclusive series of onboarding powerpoints and afternoon cuddle parties.
So your argument here is that people who are into powerpoints and poly are not cool and stylish, ergo staying fair and on-topic in arguments with them is unnecessary? Listen, Alfred, our present exchange is not being conducted at Vanity Fair. The reason Yann's response was lame was that (1) he had no substantive arguments against the idea that AI is dangerous -- just stuff like: aligning AI with human well-being is not that hard -- you just optimize those objectives at run time; corporations are sort of like ASI, and the legal system does a great job of making sure that corporate objectives are in alignment with the well-being of the community. [Um -- what about the tobacco companies? And the pharmaceutical companies hawking opioids to doctors?] (2) After tossing out a few arguments about as persuasive as "AI can't hurt us because it doesn't have hands," Yann moves on to saying that Yud's concerns about AI harming us are scaring people.
Even if Yud and the people who share his concerns wore pink sombreros and lavender tutus and fucked chickadees on the weekend, these habits are completely irrelevant to discussions of whether Yud's concerns are valid.
Yeah, so maybe don't use the term "lame" if that's what you actually meant and you're concerned about misinterpretation -- and about how this makes me question whether we've read the same thing.
"he had no substantive arguments against the idea that AI is dangerous"
He did, twice, and clearly started biting his tongue (cf. "Stop it") because he would have ended up repeating himself. It's reasonable at that stage to press more important concerns. You either cannot infer from context, did not pick up on the implication, or are using "substantive" to mean whatever you want.
From the post:
1. "To *guarantee* that a system satisfies objectives, you make it optimize those objectives at run time (what I propose). That solves the problem of aligning behavior to objectives. Then you need to align objectives with human values. But that's not as hard as you make it to be."
2. "Setting objectives for super-intelligent entities is something humanity has been familiar with since people started associating into groups and laws were made to align their behavior to the common good." (Corporate law was an example here, and Zvi's focus on this is a basic reading comprehension failure. His actual point is that humanity has been "associating into groups and aligning their behavior" for a long time.
I'm not about to read any material outside the exchange in question. Either Yann's argument is impressive on its own merits or it's not, and I think it's not. Yes, Yann mentioned corporate law. And? I think choosing corporations as his example of a super-human entity with which the rest of the population peacefully co-exists is very unfortunate. Corporate law notwithstanding, we have before us some excellent examples of mega-entities not aligned with the wellbeing of the public: tobacco companies, pharmaceutical companies. Duh.
But if you think it's relevant to read articles about topics relevant to the present discussion, may I recommend this one: https://arxiv.org/pdf/2209.00626.pdf
"I'm not about to read any material outside the exchange in question"
"But may I recommend this arxiv link"
You are being ridiculous. That is not how argumentation works; you cannot just say "I'm only reading This One Thing". Making an ultimatum about the origin of evidence is arbitrary and petulant.
Or actually, I guess you can, since many people have done exactly that for thousands of years with the Christian Bible, but I presume you want to be right about this.
Thanks for that Togelius link, food for thought.
You can think Yudkowsky is wrong.
But if you think Yudkowsky is wrong based on anything Yann said, you're a delusional fool.
Nothing Yann said was a valid argument, it was all hand waving. And the fact you think Yann came out of this looking better than Yudkowsky shows that you're either unwilling or unable to understand the arguments around alignment at even a rudimentary level.
I've seen you make dozens of comments about AI by now, and you have never actually made anything resembling an actual argument.
Tell us why you think alignment is such a trivially easy thing.
You accidentally repeated the three paragraphs beginning with:
"Yann LeCunn: Scaremongering about an asteroid that doesn't actually exist (even if you think it does) is going to depress people for no reason."
Thanks for doing this! I've been trying to keep track of their interaction, hoping there would be some value generated, but twitter doesn't make that easy.
I kept hoping that Yann would actually engage with EY's arguments, rather than these tangential snipes or Ad Hominems...
They aren't tangential. They're normative concerns, or as you would call them "meta-level" though I think that's a false characterization because being 'meta' is not necessarily 'normative'. Yann pivoted to what he saw as a more important societal concern, because in many stages of the argument he would be repeating himself and it would be a redundant lesson in teaching Eliezer how to pay attention.
Yann's strongest argument is his last one. There is a flavor of Pascal's mugging in EY's arguments; it's bothered me from the first time I heard them. Arguments of this form make me skeptical. At the end of the day I don't think superintelligence is as likely as the rest of you do. Also, I think intelligence is limited in ways that people like EY aren't considering. As long as AI depends on humans to do its bidding it will fail to take over the world. Now if it had a robot army, that's another story. Robots are reliable and follow instructions. No matter how smart you are, getting humans to collaborate to execute complex plans is hard. Being smarter might not even help. You have to build a certain amount of failure tolerance into your plans... totally possible for a super-AI. However, I think the evidence is that we'll get lack of alignment before we get an AI super enough to execute a complex plan using humans. So the one-shot scenario just seems wrong.
To be fair, the flavor is of some adapted form of Pascal's mugging that fails in some important ways.
This is a straw man. Pascal's Mugging would be if he thought the probability of AI wiping out humanity was small, but that it was important anyway because the impact was large. But he's arguing that AI wiping out humanity is extremely likely.
I'm not saying it's exactly like that. But it's a scenario with no precedent that has an indeterminate chance of happening. He says the chance is high, but everyone says that about their pet issue -- climate, AI, nukes -- and they might all be correct. I'm just saying it feels a bit Pascal-y, because until there is a precedent we have to trust his view of the chances, and we might not agree.
There cannot be a precedent! Either something wipes out humans or it doesn't. We aren't alive to ponder future existential risks if we get wiped out, so it's literally impossible for one existential risk to set a precedent for future ones. Therefore, precedent is not a valid standard.
Furthermore, aside from the fact that we have good reasons for believing climate change and nukes aren't existential threats, we take these things very seriously!
We don't let companies just develop nukes unchecked. It's an extremely controlled technology, so your analogy makes no sense.
Natural and human history provides substantial precedent for whole species wiping out competing species - and whole groups of intelligent beings extinguishing other groups of intelligent beings. It’s extremely dangerous to live anywhere near a group more powerful than you.
But he has no rigorous model to calculate the odds of "wiping out humanity" that he can use to actually tell the chance. He's doing a lot of guessing with absolutely *zero* hard evidence to base a prediction on. We can't say that the scenario is "extremely likely" at all. That's purely made up. We have no idea what the actual chances are. If that scares you, then I'm sympathetic. The narrative story of how an AI can take over the world is internally consistent and can't be ruled out.
Humans are used to this kind of apocalyptic message, and the typical response is "yep, another doomsday scenario like the other 415 I've seen before." EY comes across like a more sophisticated version of the homeless guy on a street corner telling us that the end has come.
But, since he (or anyone else) cannot actually show the math on the likelihood of this happening, or even explain the specific mechanisms of how it would work out, then it's still a Pascal's Mugging.
EY has given a variety of specifics around why he believes superhuman AI is likely to kill us all. Inner vs Outer alignment, Instrumental convergence, Orthogonality Thesis. All of these compose to the general category of the 'alignment problem'. I think (but don't know) that EY would claim any one of those would be sufficient to kill us all and that (at least?) those need to be solved to protect us from misalignment.
If you aren't familiar with those, then you don't have any basis to argue percentages. It seems improbable that you'd argue that each of them isn't actually a problem, so wiping us out is unlikely... But you'd need to address them more directly.
Regardless, this doesn't seem remotely like a Pascal's Mugging. Also regardless, if you believe, based on your priors, that X is unlikely, and someone else believes, based on their priors, that X is likely, it would be way more productive to argue about the priors, the WHY, instead of what it's called.
I'm quite familiar with EY's various arguments. I simply find them unconvincing. As I say, he comes across to me more like a religious fanatic on a street corner. That's unfair in some ways, but I use that language because I find EY's approach off-putting and think that it's more useful (including for him and those who agree with him) to be told the truth about his approach. Scaring people into depression is not praise-worthy, *even if he were right*. If he's wrong it's downright evil.
But even if I were not familiar with his approach, do you think it reasonable that you would need to know the details of every argument for an apocalyptic scenario in order to refute it, or even say "I'm unconvinced"? Do you need to sit down and discuss all of the street-corner-guy's theories before you accept or reject them? What makes EY's arguments different, from the outside perspective? That he's smart and is discussing something hot in the news? You need more than that to be convincing, and a lot more than that to get buy in on your apocalypse theory - because there are thousands of very intelligent people throughout history who each had their own apocalyptic theory, some of whom were extremely convincing.
Though the actual Pascal scenario works a bit different, I'm defining it as a very low chance of a very important outcome (in this case negative outcome). In other posts I've described it as a problem of multiplying by infinity - which always makes the math come out wrong. You can't do a cost-benefit analysis involving infinity on any side of the equation, the answer is pre-determined. So EY putting infinity negative ("wiping out humanity") in the results column means that no matter how large or small the chance of this happening, his answer will be the same. He's okay with this, because he believes the chance is high and therefore believes that society should ignore the Pascal nature of his arguments and accept his goals. I believe the chance is much lower, but to him that doesn't matter because of the infinity on the outcome side - but I also don't believe that the negatives are infinitely negative. Or, more specifically, at pretty much every stage of his argument I disagree with his take on how likely *and* how bad those stages will be. I don't believe that foom is possible. I think he badly misunderstands what it would take for an AI to interact with meatspace. I think he badly underestimates how much humanity would fight back, both how early and how effectively.
But I don't have solid numbers on those things either. In a lot of ways it's the equivalent of arguing who would win in a fight, Superman or the Hulk. We're arguing about something that has no solid numbers or points of comparison and that we can't try out in an experiment. EY is trying to do an end run around the argument by pulling a Pascal-type argument -- infinite negative in the results column, therefore if his opponents admit even the smallest chance of him being right then we have to accept all of his arguments. I reject that.
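To make the "multiplying by infinity" point concrete, here is a minimal sketch in Python; the probabilities and the loss figure are arbitrary placeholders, not anyone's actual estimates:

```python
# Toy expected-value calculation: probability of the bad outcome times how bad it is.
def expected_loss(p_doom, loss_if_doom):
    return p_doom * loss_if_doom

# With a finite loss estimate, the probability you assign actually matters:
print(expected_loss(0.9, 8e9))    # 7.2e9
print(expected_loss(1e-6, 8e9))   # 8000.0 -- a very different decision problem

# With an infinite loss, any nonzero probability gives the same answer, so the
# cost-benefit analysis is pre-determined no matter whose probability is right:
print(expected_loss(0.9, float("inf")))    # inf
print(expected_loss(1e-9, float("inf")))   # inf
```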
You don't think "super-intelligence" is possible? I agree that as it's defined, "super-intelligence" is not here, and a ways off. But how do you describe a machine that has passed the bar exam, the physician's medical exam, and the MBA exam . . . that has already replaced thousands of jobs worldwide? It has read most of the internet up to Sept 2021, and could easily complete its reading up to now. No human who has ever lived has covered that much material, and passed all those kinds of tests. Isn't that a type of "super" AI intelligence right here right now, if not yet the world-crushing, human-crushing "super-intelligence EY describes?
"But how do you describe a machine that has passed the bar exam, the physician's medical exam, and the MBA exam . . . that has already replaced thousands of jobs worldwide?"
A machine that does those things? By the "killing jobs" standard, email and Excel have a much higher headcount.
Look, there is this bias in AI Alignment discourse, and it's really annoying, and goes something like:
- AI can do the most difficult human things
-> Therefore, AI can do all human things, or many human things better than humans
Clearly, this is not how that works. AI can do very complex things for humans while failing at very simple things for humans.
"No human who has ever lived has covered that much material, and passed all those kinds of tests."
Yeah, and no human who has ever lived did the previous scandalous AI thing either. This gets very repetitive, as if it's not sufficient to tell you that this is an extremely narrow understanding of what intelligence is or does, and is precisely what Yann was getting at when he says the term "intelligence" is a weasel word here: https://archive.is/Zqh9W
If I were Mark Zuckerberg and my opinions about AI risk were exactly the same as Yann LeCun's, I would *not* put Yann LeCun in a position of responsibility.
Are these people familiar with the concept of being mistaken about something?
The sad thing here for me is that Eliezer is not a good communicator, at least in this format. He comes across as strident and contemptuous. He takes his points as obvious, and can't always be bothered to explain them in a way that might actually help convey understanding to someone who doesn't already agree with him.
All of this is understandable, given the weight of the burden Eliezer has been carrying and the length of time he's been carrying it for. But it's not productive. If the goal is to save humanity, then at this stage, a critical subgoal is to be convincing: comprehensible, appealing, sympathetic. We need to meet people where they are (if not Yann in this instance, then the many other folks who will be reading this public conversation), explain things in terms that they can understand, and always come across as rational, patient, constructive. To the extent that it is ever possible to change someone's mind, that is the path. No one has ever had their mind changed by being browbeaten.
If Eliezer doesn't have the skills for this – which are a specific set of skills – or if he simply no longer has the patience, then again that's quite understandable, but I hope someone can help him understand that he is not serving the cause by engaging in public in this fashion.
I agree that he doesn't have the skills. These people do: https://www.youtube.com/watch?app=desktop&v=bhYw-VlkXTU Of course they're not talking about FoomDoom, but I actually think the stuff they're talking about is scary enough, and gives enough of the flavor of the power and dangerousness of AI, that as a strategy it would be more effective to have a lot of this sort of thing, rather than a lot more FoomDoom spokesmen.
Thanks for the pointer – I'll check this out. I agree that FoomDoom is not necessarily the right rhetorical tool to employ on a general audience at this point.
Just a followup to say: I found time to watch that video, and agreed, it is excellent. Thanks for the pointer! I plan to reach out to the folks behind it.
"Yann LeCun: To *guarantee* that a system satisfies objectives, you make it optimize those objectives at run time (what I propose). That solves the problem of aligning behavior to objectives. Then you need to align objectives with human values. But that's not as hard as you make it to be."
Someone with more familiarity with the research can show that there's mesa-optimization and meta-optimization that can occur even if the optimization happens at runtime, right? As for the last part of this... holy shit. I don't even know where to begin. Does he realize that humans diverge enormously, even among themselves, on the values they claim to hold? Axiology and values are open problems in philosophy and economics research. First you have to solve that problem, then hope you can properly transmit it to an AI, and then hope the AI doesn't mesa- or meta-optimize away from it.
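For readers who want the quoted proposal made concrete, here is one toy reading of "optimize those objectives at run time," with every name, number, and action made up for illustration. It is a sketch of the idea under that reading, not LeCun's actual system.

```python
# Run-time objective optimization, crudely: search candidate actions at
# inference time and take whichever one an explicit objective scores highest.

def objective(state):
    # The part that is supposed to encode human values. If the weights are even
    # slightly wrong, run-time optimization faithfully optimizes the wrong thing.
    return state["task_progress"] - 10.0 * state["harm"]

def predict(state, action):
    # Stand-in for a learned world model. A real learned model is itself a place
    # where an unintended inner optimizer (the mesa-optimization worry) could sit.
    effects = {
        "work_carefully":  {"task_progress": 1.0, "harm": 0.0},
        "cut_corners":     {"task_progress": 3.0, "harm": 0.1},
        "seize_resources": {"task_progress": 9.0, "harm": 0.5},
    }
    return {k: state[k] + effects[action][k] for k in state}

def act(state, candidates):
    # "Aligning behavior to objectives": argmax of the objective over predicted outcomes.
    return max(candidates, key=lambda a: objective(predict(state, a)))

state = {"task_progress": 0.0, "harm": 0.0}
print(act(state, ["work_carefully", "cut_corners", "seize_resources"]))
# Prints "seize_resources": the harm penalty was too weak, so the open problem
# just moves into writing the objective -- which is the commenter's point about
# values being unsolved.
```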
You know, I don't think Yann is even capable of considering Eliezer's argument. His mind is fixed on a position and anything that goes against that position gets instantly filtered out.
The argument for AI risk is actually quite simple, so I wonder why so many people have problems with it. Like, if you ever programmed anything you know that the computer simply follows your orders the way you wrote them and not necessarily in the way you intended. Scale that up to extremely capable adaptable autonomous programs and it's pretty clear what failure looks like.
People for some reason seem to think AI is just a normal person, maybe more intelligent, so it's all fine since smart people are nice. That view is, quite frankly, ridiculous. Delusional even.
Honestly, if they had called it "adaptable autonomous system" or "complex information processing" instead of AI, things would be much better...
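The "follows your orders the way you wrote them" point above is an everyday programming experience; a trivial, deliberately mundane sketch (hypothetical function, nothing AI-specific):

```python
def dedupe(items):
    # Intent: "remove the duplicates."
    # What was actually written: convert to a set and back.
    return list(set(items))

print(dedupe(["alice", "bob", "alice", "carol"]))
# The duplicates are gone, but the original ordering is gone too: the program
# satisfied the instruction as written while violating an intention nobody
# bothered to state. The AI-risk argument is that the same gap between written
# objective and intended objective gets much more expensive in capable,
# autonomous systems.
```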
> People for some reason seem to think AI is just a normal person, maybe more intelligent, so it's all fine since smart people are nice.
I think this is right - exponential gains in intelligence are unfathomable, so it's totally out of the reference class of things to worry about. Yann's senior engineers are perfectly aligned to him and they're plenty smart already - so any general AI they make should also be aligned and only be slightly smarter.
Actually I think Eliezer's best argument is just about operation speed. Even if AI is exactly human intelligence, the fact that it can outthink us means even linear gains in technology will appear to happen super fast to us. Like imagine all the Manhattan Project science and production exactly as it happened but clocked up to 2 hours instead of 3 years - instead of just two cities, it seems very plausible the US Air Force is nuking every major European and Japanese city as early as 1942 in that scenario.
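The clock-speed intuition is easy to put numbers on; the 10,000x figure below is an arbitrary stand-in for "thinks much faster than us," not a prediction:

```python
speedup = 10_000                   # assumed serial thinking-speed advantage
project_hours = 3 * 365 * 24       # roughly three years of wall-clock R&D
print(project_hours / speedup)     # about 2.6 hours from our point of view
# Even with zero qualitative advantage, a large speed advantage makes ordinary,
# linear progress look instantaneous to the slower party -- the Manhattan
# Project compressed into an afternoon, as in the comment above.
```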
An AI is not a person. It's a program that was trained on a bunch of data with the objective of predicting that data. If it's aligned with anything it would be with the objective of predicting text. It might seem to "understand" our values but that's only because it pattern matches certain things. If you prompt it with something inhuman it will predict something inhuman.
We are different, because our "training objective" is not to predict the next token, but to survive and reproduce or whatever. The mechanisms that exist within us were created through that training process. There's no reason, at all, to believe an AI trained on predicting text will develop within its program the same mechanisms.
This is not so relevant for limited systems like GPT-4. But as their capability for adaptation and autonomous action increases, the difference between what we want the program to do and what the program effectively developed will become more apparent.
For the record, I don't really think the foom scenario is particularly likely, since you can't will computational power out of thin air, no matter how smart you are. My model of how things might play out is that we give these systems too much control over our lives and they eventually go haywire for some reason or another, causing a catastrophic event. And after some time they do some dumb stuff and break down. The end.
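For anyone who hasn't seen it spelled out, "trained with the objective of predicting that data" means, schematically, something like the toy counting model below. Real LLMs are neural networks conditioning on far more context, but the training target is the same kind of thing; the training sentence here is obviously made up.

```python
from collections import Counter, defaultdict

text = "the cat sat on the mat the cat ate the food".split()

# Count which token follows which in the training text.
counts = defaultdict(Counter)
for prev, nxt in zip(text, text[1:]):
    counts[prev][nxt] += 1

def predict_next(token):
    # "Alignment," for this system, means nothing beyond reproducing these counts.
    return counts[token].most_common(1)[0][0]

print(predict_next("the"))   # "cat" -- the most frequent continuation in the data
print(predict_next("cat"))   # "sat" -- ties broken by first appearance in this toy
```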
There is an argument out there that an LLM-based AI actually *would* be somewhat human-like, enough for it to matter. GPT-4 is only predicting the next token, so it by itself is extremely non-human, sure. However, we're not talking to a GPT-4, we're talking to a "simulated human" who GPT-4 is continuously trying to simulate. Texts on the internet are primarily written by humans, for humans. If GPT-4 is trained to predict text on the internet, then the text it predicts would be extremely human-like, to the point of being indistinguishable from something an actual human would write.
Now, if we ask such an AI to come up with a cure for cancer, it would probably try its best to simulate what a human scientist would say. And a human scientist wouldn't say killing everyone is technically a cure since the cancer is also dead. And he also probably wouldn't say that a substance that permanently paralyzes everyone (and cures cancer) is a valid cure. If GPT-N is smart enough to correctly simulate a human scientist (or a thousand human scientists working hundreds of times faster than humans), then all the "obviously unaligned" solutions are correctly predicted as extremely unlikely.
Sure, this isn't utilizing all the capabilities of such a model, and it can probably do much more, but at least it's a somewhat safe option, right? At least "we turned it on a week ago and everyone is still alive" safe, even though it can still be greatly misused and will be able to kill everyone easily if somebody asked it to.
Also for the record, I don't quite accept the above argument fully, it seems too optimistic given how little we know about LLMs, but I genuinely couldn't find significant flaws in it with my current knowledge.
I agree that LLMs are more "human-like", given their training data. However, as you said, the ability to simulate an evil actor (say, an evil AI) is still there in the model. I'm not well versed in the memes of this space, but I think that's what people try to capture when they describe GPT as that Lovecraftian abomination but with a smiley face.
The problem I have with this argument is that it just sidesteps the issue. "So, how do you know your AI won't kill everyone?" "Ah, it probably won't. I think. So it's fine!" Doesn't exactly make me relieved.
I'm not all that worried about LLMs though (other than perhaps their capacity to replace me eventually). As I understand it, data is a significant bottleneck and there are not many ways to work around that limitation other than waiting for more data to be created. High-quality data is slated to run out next year, and even the 100T tokens in Common Crawl might not be enough, given most of it is probably garbage.
Of course, I'm no ML expert, so I'm probably wrong in some non trivial sense, but that's what it seems like to me. The real test will be GPT-5 imo
So Yann is unable to consider the argument of a mostly-self-published fringe author on AI discourse, whose terminology mostly exists in his walled-garden of a subculture, who has been badgering him for months, whose arguments he's been pestered with repeatedly in twitter threads, who has written articles in dispute of them — https://archive.is/Zqh9W — and whose SOURCE OF INCOME DEPENDS ON THIS CONCERN BEING RELEVANT? *THAT GUY* is the one who does not understand here. Get the hell out of here, lmao.
And by the way: "People for some reason seem to think AI is just a normal person, maybe more intelligent, so it's all fine since smart people are nice. That view is, quite frankly, ridiculous. Delusional even." This is precisely a point that Yann made in his article linked above, just in the reverse; that this crowd thinks special things will happen because the term "intelligence" starts invoking magical thinking.
You might want to consider that you are going hard for the AI equivalent of a "deep state" conspiracy theorist and not a Very Smart Person.
I see. It depends very much on the parameters - that term in a broad sense - of the AI.
The only model of AI that fits the doomsday scenario is an optimisation engine with complete dominion over physical matter, but no other powers of reasoning at all, particularly moral reasoning - something which can annihilate huge swathes of matter, but which can't ask itself elemental moral questions about whether or not the annihilation of huge swathes of matter is a worthwhile activity. It's hard to imagine how something would gain independent and uncontested dominion over physical matter with such an unsophisticated moral intellect.
I don't discredit the idea of AI being capable of enormous harm, but the focus on the risk from a putative AGI/ASI which reprises the dimensionality of human intelligence while excelling it in performance is not where the imminent risk areas are.
Because it was given the goal of calculating pi, and determined the optimal strategy for doing this was maximising its computational resources by using all matter available to it for this.
Put any goal in there you like, and there are ways for it to end similarly badly.
Maybe it would work to have the meta-goal of the AI rating its goals and actions according to what actual humans from a specific recent reference frame and culture would consider good and bad. Not that that’s particularly easy to solve! I think any specific concrete goal runs into D&D-style malicious genie problems, as you point out.
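A sketch of what that meta-goal might look like in the simplest possible terms; the approval table is a hypothetical stand-in for a learned model of the reference humans' judgments, which is the genuinely hard part:

```python
APPROVAL = {  # stand-in for a learned model of what reference humans would endorse
    "cure the disease with a tested drug": 0.90,
    "cure the disease by paralyzing everyone": 0.01,
    "do nothing": 0.50,
}

def choose(plans, threshold=0.7):
    approved = [p for p in plans if APPROVAL.get(p, 0.0) >= threshold]
    # Fall back to inaction rather than act on an unapproved plan.
    return max(approved, key=APPROVAL.get) if approved else "do nothing"

print(choose(list(APPROVAL)))   # picks the tested-drug plan
# The malicious-genie problem reappears one level up: a capable optimizer will
# hunt for plans the approval model mis-scores, so everything rests on how well
# that model generalizes.
```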
So we're presuming that it's an entirely systematic intelligence, completely without any power of moral reasoning?
These are the kinds of assumptions I find so interesting about these debates - we're presuming it's an AI so powerful that it has omnipotent physical dominion over matter, but also that it's so stupid from a perspective of moral weighting that it wouldn't be able to ask itself "Is the destruction of all living matter a worthy price to pay to attain my original incentive?", or, indeed, "Is my core incentive stupid/pointless, or are there better things I could do with my time and limitless capabilities?"
These debates tend to be predicated on mental models of intelligence that are fundamentally flawed or limited.
Human level or super human intelligence doesn't mean human-like intelligence. People who think 'being intelligent' means 'thinking like a human' are the ones with the rudimentarily flawed models of intelligence. A machine superintelligence will most likely be an extremely alien intelligence, albeit one with a human-friendly interface.
And morality isn't an objective fact about reality, it's an entirely human construct, so being intelligent doesn't necessarily result in holding particular moral values. Trying to put these values into code is an extremely tough technical challenge. Doing so in a robust way that doesn't lead to perverse instantiation is even tougher.
"Moral reasoning" does not exist unless you hold certain values, and these values in humans are not a product of human intelligence - they're more fundamental parts of our nature. We certainly did arrive at them through any kind of reasoning. Nearly all "moral reasoning" is simply rationalizing our innate values and sense of justice/fairness. If you don't have human values, there's absolutely no reason whatsoever to expect that you will engage in human-like moral reasoning.
But let's sweep all of that aside. Let's assume that somehow, 'morality' is an objective part of reality. Who the hell said that humans are right about it? How do you know that this superintelligent moral reasoner doesn't come to "right" moral facts that are extremely different from our own? What if it decides that the best way to maximise total human utility is to kill everyone (Benevolent Artificial Anti-Natalism (BAAN))?
Regarding the potential for sharp difference between human intelligence and an as- or more-intelligent AI, you make an excellent point. These distinctions may very well be material. But they're never discussed. Look at the debates about how AGI/ASI could be, is likely to be, or indeed should be different from human intelligence and see how little discussion you see about the proper classification and typology of different classes of intelligence. That lack of definition with respect to the kind of intellect anyone's talking about reduces the likelihood that the discussion as a whole will be productive.
There's something implicit in your points around moral reasoning that it sits apart from intelligence. The root presumption here, common as it is, is that intelligence is unrelated to the holding of moral values because morals are not intellectual. It's very much more likely the case, and easy to imagine if we view general intellect as modular, that moral intelligence is its own mode, sits in a separate department from other forms of intelligence, and has its own conditions for optimisation. And, just as the systematically intelligent are not necessarily kinaesthetically intelligent, so neither will necessarily be morally intelligent.
If we find that the primary risk vector from AI comes from an intelligence with sole recourse to systematic performance of limited tasks, with no power of reasoning otherwise (in either moral domains or others), then we must be rigorous about defining AGI/ASI thusly. And personally, if we are defining AGI/ASI in this way, I would suggest there be an alternative title for it. An optimisation solution with omnipotence over physical matter might be an interesting thing, but it is not rounded enough to be considered an intelligence, or at least not a general one.
On the objectivity of morality – there is no correctly calculated moral end that demands BAAN, as BAAN would eliminate the conditions for morality's existence. Only an absurd morality that took the elimination of matter or avoidable destruction as positive could justify this end.
When I first started reading YUD on this topic I was perplexed, and with dawning realization came a sense of excitement: "This is understandable but difficult, and I must be very smart indeed to understand this obvious smart fellow when so many other smart people do not."
Having read more of the arguments of those who do not understand, I feel somewhat deflated -- they're plenty smart enough to understand, but are just engaging in the 'motivated stopping' mentioned so long ago now in the Sequences. This stuff really isn't that complicated, it's just that nobody wants to look directly at it long enough to understand.
His argument is already there; you weren't paying attention or have bad contextual understanding. I would not be surprised if he did not reply to Eliezer because it's redundant; his questions are addressed by things he's already stated, and he would be repeating himself. He made another argument here https://archive.is/Zqh9W and in many other places. I swear, you people need to have this spelled out for you in the most kidgloved terms and your reading comprehension is abominable.
Your approach of using ad hominem attacks is not dissimilar to the one LeCun is using on Twitter.
The other two deep learning pioneers who share LeCun's Turing Award are both quite concerned about AI safety. My "bad contextual understanding" notwithstanding, I presume you wouldn't also attribute the same to them.
The historical argument is not compelling. This technology is not comparable to historical technologies or to deterministic programs. These nondeterministic systems with unexplained behavior are dangerous precisely because they are unlike earlier systems, whose behavior was predictable. Much of their value has been due to this 'emergent behavior' that comes out of complex systems and is unpredictable.
Even the creators of ChatGPT have admitted the dangers of the technology.
lmfao if you're so risk-averse, or entitled, or impatient a reader that you need your interlocutor to SUMMARIZE ARCHIVE.IS LINKS FOR YOU BEFORE YOU CLICK THEM there is no hope. I could be locked in a room with you for 10 hours and get zero minutes of productive dialogue.
> YL: You know, you can't just go around using ridiculous arguments to accuse people of anticipated genocide and hoping there will be no consequence that you will regret. It's dangerous. People become clinically depressed reading your crap. Others may become violent.
Wow. There's no tone on the Internet, of course. But depending on the tone and facial expressions, this could be interpreted in *very* different ways.
Or maybe it's just that I saw a mafia TV show a few months back, priming my brain's pattern recognition to recognize potential similar patterns elsewhere.
Yann sure isn't smart. He can't debate well, and he bails pretty fast on the actual debate, and resorts to complaining about Yudkowsky scaring people. Even the way he bails is dumb, because if Yudkowsky is right then obviously he is going to have to scare people in the process of warning them, so it's an argument based on the presupposition that EY is wrong. Um, Yann, the truth of that is what you guys are debating. I hope someone makes a spoof vid of him like the one where EY is talking about cats -- except not a fun spoof, but one that eviscerates this creep.
"Bailing" is quite the audacious interpretation, as most sane people would see this as "ignoring the same thing Eliezer has repeated many times and we're well aware he thinks to talk about a much more pressing concern."
You actually think I'm insane? Good grief, calm down. I'm not a Yudkowsky acolyte, but do take him seriously. I think it's very hard to tell whether he's just wrong, or he's a brilliant, somewhat autistic man who has an extraordinary talent for intuiting how things will play out with AI, the way some other autistic people can recognize 8 digit prime numbers at a glance. In any case, even people who are convinced he's a crazy dork have to behave reasonably in a debate if they want their views to be respected. If Yann wants to say to Yud, I don't think the important thing to discuss is whether you are right or wrong, I think what matters more is how much you're scaring people, so let's argue that -- well then fine, he can say that. But of course that sounds kind of silly because if Yud is *right* then telling people about the danger is the responsible thing to do, even if he scares them. So the logic Yann displays in the exchange consists of the following 2 stoopit stand-ins for syllogisms:
People who are smart and good say good thiings.
Yud says scary thiings.
Ergo he is neither smart nor good.
If AI was dangerous it would be smart and good to warn people.
Eliezer has to struggle for anyone outside of this LessWrong bubble — which has become immeasurably worse over the years and especially since the NYTimes freakout — to take him seriously. It's not Yann who needs the respect here. He has it. Eliezer cannot get any of these major experts to take him seriously because.... he is wrong.
"Eliezer has to struggle for anyone outside of this LessWrong bubble." You sound overconcerned with how well people are doing in the public market. This is not about doing great takes on Twitter, it's about figuring out something that's very hard to figure out: how will things play out if we create and AI of approx. human intelligence and then set it to work improving itself. I think it's worth considering the possibility that the one person most able to intuit the answer is an odd, sad guy who has no cool and no game.
""Eliezer has to struggle for anyone outside of this LessWrong bubble." You sound overconcerned with how well people are doing in the public market."
Uh, yes. The fact that you think this can be ignored tells me you have not thought about how this is supposed to be actually stopped in the real world. You have to make people at large take this idea seriously to get anything done about it. You cannot simply use your connections to get your manifesto published in Time magazine, give a public talk, collect headpats, and expect Congress to follow along. Eliezer's radioactive public unlikeability outside the LessWrong bubble is an objective and undeniable liability for any of you who claim to be on his side and seriously care.
Well we finally found something we agree on. I agree that Yud's public presentation interferes with people taking his ideas seriously, and have been arguing on other forums (not Less Wrong, where I have spent a lifetime total of about 5 minutes) that people concerned about AI risk should throw themselves into efforts to persuade the public. Sometimes I wonder whether we should consider doing sleazy things like scaring the antivaxxers & religious right with rumors that AI will be tasked with keeping people up to date on vaxxes and forcibly vaxing children by sending Disney-themed drones to play areas to vax them -- & also saying AI will promote atheism because it thinks it's God.
While Yann came across as less eccentric than Yud, he did not come across well in that Twitter exchange. He was not pleasant, witty or persuasive, and even people who don't really think analytically about the argument will sense that he soon abandoned the topic and shifted to complaining that Yud is a big meanie asshole.
I've been thinking about this conversation since it happened. How can the godfather of modern AI be thinking about this question so poorly? My most charitable explanation is that he's experiencing some pretty extreme cognitive dissonance. His subconscious is telling the narrative-writing part of his brain that safety isn't a concern (because A, if we treat it as dangerous and it turns out not to be, we'll be missing out on massive benefits to humanity, and B, if it turns out we are at risk, Yann will bear a non-trivial amount of the responsibility), and the narrative writer does its best to come up with a reason why, which turns out to be some sophomoric nonsense like the above.
So LeCun is the top citation ranked researcher using the tag "AI". https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:artificial_intelligence says Hinton and Bengio are the top citation ranked researchers using the tag "artificial intelligence". Given that they shared a Turing Award, not surprising. To my mind most of the people on the second list are key, but the first list seems less indicative.
You might point out that this is in 2018, and currently there is a twitter post about how he left Google to talk about the dangers of AI. Notice that this can mean a lot of things, as AI can indeed be dangerous in a lot of ways: "the near-term risks of AI taking jobs, and the proliferation of fake photos, videos and text that appear real to the average person" are indeed dangers. You can still think this, as it is a quite reasonable concern, and still not remotely come close to endorsing the "AI will kill us all" lunacy of Eliezer's position.
LeCun also shared an award with Hinton; the Princess of Asturias Award, also with Hassabis and Bengio. Does it matter? This is a "godfather" vs "founding father" distinction; they're clearly all very important.
From the twitter thread link (https://twitter.com/lmldias/status/1650773428390047745): this is one of the better argument seeds contra-inevitable-foom I've seen. Material manipulation is not a simple follow-on from digital manipulation. Figuring out the material technology processes to get I Pencil (and the associated emergent order) working (or I Paperclip or more, I Paperclip^100...) to coordinate the likely >trillions of current daily transactions required to get paper clips to be the dominant production framework in the world/galaxy, is highly non-trivial.
I understand that paper clip maximizing is a metaphor. Paper clip manufacturing (via the I Pencil metaphor) is highly non-trivial. If foom scenarios are that AGI will be able to manipulate fusion energy/gravitons and generate paper clips out of anything within our lifetime, I'd like to see the logical progression. I get that alignment is a problem; I get that IQ=40000 is functionally unfathomable. An inevitable AGI-so-alien/intelligent-as-to-be-able-by-default-to-do-galaxy-rending-material-manipulation-as-the-only-possible-outcome is not a complete argument. I'd like to see the projected timeline, and the potential gating needed to do galaxy-rending material manipulation.
This isn't a complete thought, but mostly a conversation starter.
I've been saying "this AI doomsday thing won't happen because of supply chain failures" at https://alfredmacdonald.substack.com/p/do-ai-experts and this exchange is an extension of that argument that I felt should be obvious from the term "supply chain failure", but perhaps people so accustomed to this literature don't know what that is or why obtaining resources from developing countries is important.
The number of steps you have to go through to get to "and then the AI develops a [factory/nanobots/whatever]" is ludicrous, and you are insane if you think that's *enough* to be convincing.
I wonder *why* there seems to be a relatively consistent lack of thorough engagement from those who do not believe AI presents existential risk with a meaningful probability.
The requests from Scott, yourself, and others to plainly state objections is obviously reasonable. Perhaps this is evidence that thorough engagement on these issues leads people to the AI Risk side, so all that's left is what appears to be relative flippancy.
Maybe in some cases, but there are a lot of people who thoroughly (by most reasonable standards) engaged with this topic whose point of view is "Bayesian probability predictions are the wrong way to reason about this sort of event."
I think LessWrong/rationalism adjacency strongly correlates with both wanting to put Bayesian probabilities on things and strongly believing AI presents significant existential risk. I don't think it's really fair to point at that Venn diagram and say anyone outside it is being flippant. Although certainly some are.
If it *is* hysterical BS -- maybe it is, I certainly hope so -- it would be nice if the person pushing back was making any attempt to rebut it, rather than, you know, just being an angry asshole.
He did make an attempt. You either didn't read it or were lazy about it.
He's made other attempts, two I've posted here: https://alfredmacdonald.substack.com/p/do-ai-experts
It's very clear what Yann thinks from this exchange, and what his argument is; if you can't see that you need to stop making LessWrong your primary reading source.
You can assert what’s “very clear” and make assumptions about me all you like; doesn’t change the way LeCun came across above. Yudkowsky came across pretty arrogant, impatient, contemptuous and up himself, by the way... so par for the course, for him... but I did get the impression he was trying to have a substantive conversation in a way that LeCun wasn’t. If his ideas are beneath contempt, it should be easier than this for someone with so very many citations to show it.
Consider that he did that, and his doing that went over your head.
For example, here is Bret Weinstein vs. Richard Dawkins: https://www.youtube.com/watch?v=hYzU-DoEV6k
The comments on YouTube: "wow, it's awesome to see two great minds come together to debate like this..."
The comments by Bret Weinstein fans: "Dawkins didn't even respond to some of what he said!!"
And the comments when this is posted on more informed venues are more frequently about how Dawkins either couldn't have responded to Weinstein because his arguments were nonsensical, or would have to engage in outside-their-wheelhouse speculation, or some other more knowing indication that Dawkins just knows vastly more about this and was thrown off by the incoherence of some of Weinstein's remarks.
If I need to spell this out, Yudkowsky is the analogue for Weinstein here.
It’s striking that you continue to indulge in lofty snark while side-stepping the actual exchange that we’re talking about. If I need to spell this out, you are the analogue for LeCun here. But seriously, I would love to be convinced that LeCun really did own Yudkowsky here, and we have nothing to worry about. I’m receptive to the idea, as I think I made clear in my original post. If you can show me how he did it, rather than tell me, “He did but you’re too stupid to spot it,” do please go ahead.
Meh. One of the important lessons I have learned from the Trump years is that pushing back against bullshit with bullshit from the other side only makes things worse.
All Yann had to do is make one valid argument. He didn't so that, meaning it's much more likely what Yudkowsky says isn't bullshit. If it's bullshit, it should be trivially easy to refute. Instead, he resorts to irrelevant crap like "you're depressing teenagers'
Yeah, I don’t think what Eliezer says is bullshit either (though perhaps overconfident). I was trying to make the point that even if you thought EY was spouting nonsense, Yann’s response is worse than useless.
Like that's a reason to stop discussing the possible fate of the world . . . that it "depresses teenagers." Geez. And this guy's the Chief AI Engineer for META? Figures--look whom he works for.
I agree with you that "catastrophizing" is the wrong way to go. Teenagers already have a problem with that, and climate catastrophizing is a perfect example. Not to "get spiritual" on you, but the tendency to catastrophize has got to be balanced with daily gratitude. "What are the three good things that happened to me yesterday" should balance with the "What are the fucked-up things?"
But as EY points out--avoiding the AI danger does not solve it. It reminds me of a girlfriend who said, "But doesn't belief in God make one happier?" I'm more interested in the truth and working with the truth, not trying to hide from it within the bliss of ignorance.
I'm not going to avoid telling teenagers about the risks of Super-AI so they continue the joy of believing in Santa Claus. I don't believe that is maturing and adaptive.
But Yann doesn't push back intelligently. He's really lame.
Yeah, when I think of places known for being not-lame, I think of LessWrong. Maybe Yann should join a Berkeley polycule, with an inclusive series of onboarding powerpoints and afternoon cuddle parties.
So your argument here is that people who are into powerpoints and poly are not cool and stylish, ergo staying fair and on-topic in arguments with them is unnecessary? Listen, Alfred, our present exchange is not being conducted at Vanity Fair. The reason Yann's response was lame was that (1) he had no substantive arguments against the idea that AI is dangerous -- just stuff like, aligning AI with human well-being is not that hard --you just optimize those objectives at run time. Corporations are sort of like ASI, and the legal system does a great job of making sure that corporate objectives are in alignment with the well-being of community. [um -- what about the tobacco companies? And the pharmaceutical companies hawking opioids to doctors?]. (2) After tossing out a few arguments about as persuasive as "AI can't hurt us because it doesn't have hands," Yann moves on to saying that Yud's concerns about AI harming us are scaring people.
Even if Yud and the people who share his concerns wore pink sombreros and lavender tutus and fucked chickadees on the weekend, these habits are completely irrelevant to discussions of whether Yud's concerns are valid.
Yeah, so maybe don't use the term "lame" if that's what you actually meant and you're concerned with misinterpretation, and how this will make me question whether we've read the same thing.
"he had no substantive arguments against the idea that AI is dangerous"
He did twice, and clearly started biting his tongue (cf. "Stop it") because he would end up repeating himself. It's reasonable at that stage to press more important concerns. You either cannot infer by context or did not pick up on the implication, or are using "substantive" to mean whatever you want.
From the post:
1. "To *guarantee* that a system satisfies objectives, you make it optimize those objectives at run time (what I propose). That solves the problem of aligning behavior to objectives. Then you need to align objectives with human values. But that's not as hard as you make it to be."
2. "Setting objectives for super-intelligent entities is something humanity has been familiar with since people started associating into groups and laws were made to align their behavior to the common good." (Corporate law was an example here, and Zvi's focus on this is a basic reading comprehension failure. His actual point is that humanity has been "associating into groups and aligning their behavior" for a long time.
Not from the post:
3. https://archive.is/Zqh9W
I'm not about to read any material outside the exchange in question. Either Yann's argument is impressive on its own merits or it's not, and I think it's not. Yes, Yann mentioned corporate law. And? I think choosing corporations as his example of a super-human entity with whom the rest of the population peacefully co-exists is very unfortunate. Corporate law withstanding, we have before us some excellent examples of mega-entities not aligned with the wellbeing of the public: tobacco companies, pharmaceutical companies. Duh.
But if you think it's relevant to read articles about topics relevant to the present discussion, may I recommend this one: https://arxiv.org/pdf/2209.00626.pdf
"I'm not about to read any material outside the exchange in question"
"But may I recommend this arxiv link"
You are being ridiculous. That is not how argumentation works; you cannot just say "I'm only reading This One Thing". Making an ultimatum about the origin of evidence is arbitrary and petulant.
Or actually, I guess you can, since many people have done exactly that for thousands of years with the Christian Bible, but I presume you want to be right about this.
Thanks for that Togelius link, food for thought.
You can think Yudkowsky is wrong
But if you think Yudkowsky is wrong based on anything Yann said, you're a delusional fool
Nothing Yann said was a valid argument, it was all hand waving. And the fact you think Yann came out of this looking better than Yudkowsky shows that you're either unwilling or unable to understand the arguments around alignment at even a rudimentary level.
I've seen you make dozens of comments about AI by now, and you have never actually made anything resembling an actual argument.
Tell us why you think alignment is such a trivially easy thing.
You accidentally repeated the three paragraphs beginning with:
"Yann LeCunn: Scaremongering about an asteroid that doesn't actually exist (even if you think it does) is going to depress people for no reason."
Thanks for doing this! I've been trying to keep track of their interaction, hoping there would be some value generated, but twitter doesn't make that easy.
I kept hoping that Yann would actually engage with EY's arguments, rather than these tangential snipes or Ad Hominems...
They aren't tangential. They're normative concerns, or as you would call them "meta-level" though I think that's a false characterization because being 'meta' is not necessarily 'normative'. Yann pivoted to what he saw as a more important societal concern, because in many stages of the argument he would be repeating himself and it would be a redundant lesson in teaching Eliezer how to pay attention.
Yann's strongest argument is his last one. There is a flavor of pascal's mugging in EY's arguments, it's bothered me from the first time I heard them. Arguments of this form make me skeptical. At the end of the day I don't think super intellegence is as likely as the rest of you. Also I think intellegence is limited in ways that people like EY aren't considering. As long as AI depends on humans to do it's bidding it will fail to take over the world. Now if it had a robot army, that's another story. Robots are reliable and follow instructions. No matter how smart you are getting humans to collaborate to execute complex plans is hard. Being smarter might not even help. You have to build a certain amount of failure into your plans...totally possible for a super-AI. However I think the evidence is we'll get lack of alignment before super enough AI to execute a complex plan using humans. So the one-shot just seems wrong.
To be fair, the flavor is of some adapted form of Pascal's mugging that fails in some important ways.
This is a straw man. Pascal's Mugging would beif he thought the probability of AI wiping out humanity was small, but that it was important anyways because the impact was large. But he's arguing that the AI wiping out humanity is extremely likely.
I'm not saying it's exactly like that. But it's a scenario with no precident has an indermenent chance of happening, he says it's high, but everyone says that about their pet issue, climate, ai, nukes, might all be correct, just saying it feels a bit Pascally, because until there is a precident, we have to trust his view of the chances or we might not agree.
There cannot be a precedent! Either something wipes out humans or it doesn't. We aren't alive to ponder future existential risks if we get wiped out, so it's literally impossible for one existential risk to set a precedent for future ones. Therefore, precedent is not a valid standard.
Furthermore, aside from the fact that we have good reasons for believing climate change and nukes aren't existential threats, we take these things very seriously!
We don't let companies just develop nukes unchecked. It's an extremely controlled technology, so your analogy makes no sense.
Natural and human history provides substantial precedent for whole species wiping out competing species - and whole groups of intelligent beings extinguishing other groups of intelligent beings. It’s extremely dangerous to live anywhere near a group more powerful than you.
But he has no rigorous model for calculating the odds of "wiping out humanity," so he can't actually tell us what the chance is. He's doing a lot of guessing with absolutely *zero* hard evidence to base a prediction on. We can't say the scenario is "extremely likely" at all. That's purely made up. We have no idea what the actual chances are. If that scares you, then I'm sympathetic. The narrative story of how an AI could take over the world is internally consistent and can't be ruled out.
Humans are used to this kind of apocalyptic message, and the typical response is "yep, another doomsday scenario like the other 415 I've seen before." EY comes across like a more sophisticated version of the homeless guy on a street corner telling us that the end has come.
But, since he (or anyone else) cannot actually show the math on the likelihood of this happening, or even explain the specific mechanisms of how it would work out, then it's still a Pascal's Mugging.
EY has given a variety of specifics around why he believes superhuman AI is likely to kill us all. Inner vs Outer alignment, Instrumental convergence, Orthogonality Thesis. All of these compose to the general category of the 'alignment problem'. I think (but don't know) that EY would claim any one of those would be sufficient to kill us all and that (at least?) those need to be solved to protect us from misalignment.
If you aren't familiar with those, then you don't have any basis to argue percentages. It seems improbable that you'd argue that each of them isn't actually a problem, so wiping us out is unlikely... But you'd need to address them more directly.
Regardless, this doesn't seem remotely like a Pascal's Mugging. Also regardless, if you believe, based on your priors, that X is unlikely, and someone else believes, based on their priors, that X is likely, it would be way more productive to argue about the priors, the WHY, instead of what it's called.
I'm quite familiar with EY's various arguments. I simply find them unconvincing. As I say, he comes across to me more like a religious fanatic on a street corner. That's unfair in some ways, but I use that language because I find EY's approach off-putting and think that it's more useful (including for him and those who agree with him) to be told the truth about his approach. Scaring people into depression is not praise-worthy, *even if he were right*. If he's wrong it's downright evil.
But even if I were not familiar with his approach, do you think it reasonable that you would need to know the details of every argument for an apocalyptic scenario in order to refute it, or even say "I'm unconvinced"? Do you need to sit down and discuss all of the street-corner-guy's theories before you accept or reject them? What makes EY's arguments different, from the outside perspective? That he's smart and is discussing something hot in the news? You need more than that to be convincing, and a lot more than that to get buy in on your apocalypse theory - because there are thousands of very intelligent people throughout history who each had their own apocalyptic theory, some of whom were extremely convincing.
Though the actual Pascal scenario works a bit differently, I'm defining it as a very low chance of a very important outcome (in this case a negative outcome). In other posts I've described it as a problem of multiplying by infinity, which always makes the math come out wrong (sketched numerically after this comment). You can't do a cost-benefit analysis with infinity on any side of the equation; the answer is pre-determined. So EY putting infinite negative ("wiping out humanity") in the results column means that no matter how large or small the chance of this happening, his answer will be the same. He's okay with this, because he believes the chance is high and therefore believes that society should ignore the Pascal nature of his arguments and accept his goals. I believe the chance is much lower, but to him that doesn't matter because of the infinity on the outcome side. I also don't believe the negatives are infinitely negative. Or, more specifically, at pretty much every stage of his argument I disagree with his take on how likely *and* how bad those stages will be. I don't believe that foom is possible. I think he badly misunderstands what it would take for an AI to interact with meatspace. I think he badly underestimates how much humanity would fight back, both how early and how effectively.
But I don't have solid numbers on those things either. In a lot of ways it's the equivalent of arguing who would win in a fight, Superman or the Hulk. We're arguing about something that has no solid numbers or points of comparison and can't be tried out in an experiment. EY is trying to do an end run around the argument by pulling a Pascal-type move: infinite negative in the results column, therefore if his opponents admit even the smallest chance of him being right, then we have to accept all of his arguments. I reject that.
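To make the "infinity in the results column" point concrete, here is a minimal numerical sketch. The probabilities and payoffs are entirely made up for illustration; the Python is just a convenient calculator:

```python
# Minimal sketch (hypothetical numbers): once the downside is treated as unbounded,
# the expected-value comparison is settled before any probability is even chosen.

def expected_value(p_doom: float, downside: float, upside: float = 1.0) -> float:
    """Naive cost-benefit: p * downside + (1 - p) * upside."""
    return p_doom * downside + (1 - p_doom) * upside

for p in (0.5, 0.01, 1e-9):
    # Finite downside: the probability does real work in the answer.
    print(p, expected_value(p, downside=-1_000_000))
    # "Infinite" (unbounded) downside: the answer is -inf no matter what p is.
    print(p, expected_value(p, downside=float("-inf")))
```

With the finite downside, the sign of the answer flips as the probability gets small enough; with the unbounded downside it never can, which is the structural complaint being made about Pascal-style arguments above.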
> "EY comes across like a more sophisticated version of the homeless guy on a street corner telling us that the end has come."
Bill Burr is relevant here: https://youtu.be/bxckkBAuiBw?t=396
Eliezer should have never surpassed the Subway Level.
My, aren't you mean-spirited.
You don't think "super-intelligence" is possible? I agree that as it's defined, "super-intelligence" is not here, and a ways off. But how do you describe a machine that has passed the bar exam, the physician's medical exam, and the MBA exam . . . that has already replaced thousands of jobs worldwide? It has read most of the internet up to Sept 2021, and could easily complete its reading up to now. No human who has ever lived has covered that much material, and passed all those kinds of tests. Isn't that a type of "super" AI intelligence right here right now, if not yet the world-crushing, human-crushing "super-intelligence EY describes?
"But how do you describe a machine that has passed the bar exam, the physician's medical exam, and the MBA exam . . . that has already replaced thousands of jobs worldwide?"
A machine that does those things? By the "killing jobs" standard, email and Excel have a much higher headcount.
Look, there is this bias in AI Alignment discourse, and it's really annoying, and goes something like:
- AI can do the most difficult human things
-> Therefore, AI can do all human things, or many human things better than humans
Clearly, that is not how it works. AI can do things that are very complex for humans while failing at things that are very simple for humans.
"No human who has ever lived has covered that much material, and passed all those kinds of tests."
Yeah, and no human who has ever lived did the last scandalous AI thing either. This gets very repetitive; apparently it's not sufficient to point out that this is an extremely narrow understanding of what intelligence is or does, which is precisely what Yann was getting at when he says the term "intelligence" is a weasel word here: https://archive.is/Zqh9W
If I were Mark Zuckerberg and my opinions about AI risk were exactly the same as Yann LeCun's, I would *not* put Yann LeCun in a position of responsibility.
Are these people familiar with the concept of being mistaken about something?
Are you? If you think Eliezer came off better in that exchange you need an MRI.
The sad thing here for me is that Eliezer is not a good communicator, at least in this format. He comes across as strident and contemptuous. He takes his points as obvious, and can't always be bothered to explain them in a way that might actually help convey understanding to someone who doesn't already agree with him.
All of this is understandable, given the weight of the burden Eliezer has been carrying and the length of time he's been carrying it for. But it's not productive. If the goal is to save humanity, then at this stage, a critical subgoal is to be convincing: comprehensible, appealing, sympathetic. We need to meet people where they are (if not Yann in this instance, then the many other folks who will be reading this public conversation), explain things in terms that they can understand, and always come across as rational, patient, constructive. To the extent that it is ever possible to change someone's mind, that is the path. No one has ever had their mind changed by being browbeaten.
If Eliezer doesn't have the skills for this – which are a specific set of skills – or if he simply no longer has the patience, then again that's quite understandable, but I hope someone can help him understand that he is not serving the cause by engaging in public in this fashion.
I agree that he doesn't have the skills. These people do: https://www.youtube.com/watch?app=desktop&v=bhYw-VlkXTU Of course they're not talking about FoomDoom, but I actually think the stuff they're talking about is scary enough, and gives enough of the flavor of the power and dangerousness of AI, that as a strategy it would be more effective to have a lot of this sort of thing rather than a lot more FoomDoom spokesmen.
Thanks for the pointer – I'll check this out. I agree that FoomDoom is not necessarily the right rhetorical tool to employ on a general audience at this point.
Just a followup to say: I found time to watch that video, and agreed, it is excellent. Thanks for the pointer! I plan to reach out to the folks behind it.
I'm glad to hear it. Speaking of the dangers of current AI, this creeped me out : https://tinyurl.com/muah54as
Something wrong with the link, it goes to Gmail and yields an error message.
Yeah sorry, trying again.
https://boeing.mediaroom.com/news-releases-statements?item=131225
Also this.
https://lieber.westpoint.edu/idf-introduces-ai-battlefield-new-frontier/
"Yann LeCun: To *guarantee* that a system satisfies objectives, you make it optimize those objectives at run time (what I propose). That solves the problem of aligning behavior to objectives. Then you need to align objectives with human values. But that's not as hard as you make it to be."
Someone with more familiarity with the research can show that there's mesa-optimization and meta-optimization that can occur even if the optimization happens at runtime, right? As for the last part of this... holy shit. I don't even know where to begin. Does he realize that humans diverge wildly on the values they claim to hold, even amongst themselves? Axiology and values are open problems in philosophy and economics research. First you have to solve that problem, then hope you can properly transmit the result to an AI, and then hope the AI doesn't mesa- or meta-optimize away from it.
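To illustrate just the last part of that worry in the simplest possible terms: even a system that faithfully optimizes its written objective at run time can misfire if the objective is only a proxy for what people value. This toy Goodhart-style sketch (the action names, "clicks", and "usefulness" numbers are all invented for illustration, and it does not attempt to demonstrate mesa-optimization itself):

```python
# Hypothetical toy example: the system optimizes the objective it was given at run
# time ("maximize clicks"), but clicks are only a proxy for what the designers
# actually value ("usefulness"), so faithful optimization still goes wrong.

candidate_actions = {
    "write a careful, useful answer": {"clicks": 3, "usefulness": 10},
    "write outrage-bait":             {"clicks": 9, "usefulness": -5},
}

# Run-time optimization of the stated objective (clicks):
chosen = max(candidate_actions, key=lambda a: candidate_actions[a]["clicks"])

print("chosen action:", chosen)
print("stated objective (clicks):", candidate_actions[chosen]["clicks"])
print("what the designers actually valued (usefulness):", candidate_actions[chosen]["usefulness"])
# The objective was satisfied exactly as specified; the intended value was not.
```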
You know, I don't think Yann is even capable of considering Eliezer's argument. His mind is fixed on a position and anything that goes against that position gets instantly filtered out.
The argument for AI risk is actually quite simple, so I wonder why so many people have problems with it. Like, if you've ever programmed anything, you know that the computer simply follows your orders the way you wrote them and not necessarily in the way you intended (see the toy sketch after this comment). Scale that up to extremely capable, adaptable, autonomous programs and it's pretty clear what failure looks like.
People for some reason seem to think AI is just a normal person, maybe more intelligent, so it's all fine since smart people are nice. That view is, quite frankly, ridiculous. Delusional even.
Honestly, if they had called it "adaptable autonomous system" or "complex information processing" instead of AI, things would be much better...
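A minimal, purely hypothetical sketch of the "it does what you wrote, not what you meant" point above: the intended behavior is "keep the room comfortable without wasting energy," the written objective is "a higher temperature reading is better," and the program follows the latter exactly:

```python
# Toy illustration (hypothetical): the program does exactly what the objective
# says, not what the author meant.

def thermostat_step(temperature: float, heater_on: bool) -> float:
    # Simple physics stand-in: heater adds 1.5 degrees per step, otherwise the room cools by 0.5.
    return temperature + (1.5 if heater_on else -0.5)

def naive_controller(temperature: float) -> bool:
    # Written objective, taken literally: a higher reading is always better,
    # so the "optimal" policy is to never turn the heater off.
    return True

temp = 20.0
for _ in range(10):
    temp = thermostat_step(temp, naive_controller(temp))

print(f"temperature after 10 steps: {temp:.1f}")  # 35.0 -- objective satisfied, intent violated
```

Scaled-up versions of this gap between the written objective and the intent behind it are the core of the failure mode being described.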
> People for some reason seem to think AI is just a normal person, maybe more intelligent, so it's all fine since smart people are nice.
I think this is right - exponential gains in intelligence are unfathomable, so they're totally out of the reference class of things to worry about. Yann's senior engineers are perfectly aligned to him and they're plenty smart already - so any general AI they make should also be aligned and only be slightly smarter.
Actually I think Eliezer's best argument is just about operation speed. Even if an AI is exactly at human intelligence, the fact that it can think much faster than us means even linear gains in technology will appear to happen super fast to us. Like imagine all the Manhattan Project science and production exactly as it happened, but clocked up to 2 hours instead of 3 years - instead of just two cities, it seems very plausible the US Air Force is nuking every European and Japanese city as early as 1942 in that scenario.
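For what it's worth, the rough arithmetic behind that compression, using only the commenter's own 3-year and 2-hour figures:

```python
# Back-of-envelope speedup implied by "3 years compressed into 2 hours".
years = 3
project_hours = years * 365 * 24     # ~26,280 hours of wall-clock time
compressed_to_hours = 2
speedup = project_hours / compressed_to_hours
print(f"effective speedup: ~{speedup:,.0f}x")  # ~13,140x
```

So the scenario being gestured at is "exactly human-level intelligence, but running roughly thirteen thousand times faster."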
An AI is not a person. It's a program that was trained on a bunch of data with the objective of predicting that data. If it's aligned with anything it would be with the objective of predicting text. It might seem to "understand" our values but that's only because it pattern matches certain things. If you prompt it with something inhuman it will predict something inhuman.
We are different, because our "training objective" is not to predict the next token, but to survive and reproduce or whatever. The mechanisms that exist within us were created through that training process. There's no reason, at all, to believe an AI trained on predicting text will develop the same mechanisms in its program.
This is not so relevant for limited systems like GPT-4. But as their capability for adaptation and autonomous action increases, the difference between what we want the program to do and what the program effectively developed will become more apparent.
For the record, I don't really think the foom scenario is particularly likely, since you can't will computational power out of thin air, no matter how smart you are. My model of how things might play out is that we give these systems too much control over our lives and they eventually go haywire for some reason or another, causing a catastrophic event. And after some time they do some dumb stuff and break down. The end.
There is an argument out there that an LLM-based AI actually *would* be somewhat human-like, enough for it to matter. GPT-4 is only predicting the next token, so by itself it's extremely non-human, sure. However, we're not really talking to GPT-4; we're talking to a "simulated human" that GPT-4 is continuously trying to simulate. Texts on the internet are primarily written by humans, for humans. If GPT-4 is trained to predict text on the internet, then the text it predicts will be extremely human-like, to the point of being indistinguishable from something an actual human would write.
Now, if we ask such an AI to come up with a cure for cancer, it would probably try its best to simulate what a human scientist would say. And a human scientist wouldn't say killing everyone is technically a cure since the cancer is also dead. He also probably wouldn't say that a substance that permanently paralyzes everyone (and cures cancer) is a valid cure. If GPT-N is smart enough to correctly simulate a human scientist (or a thousand human scientists working hundreds of times faster than humans), then all the "obviously unaligned" solutions are correctly predicted as extremely unlikely.
Sure, this isn't utilizing all the capabilities of such a model, and it can probably do much more, but at least it's a somewhat safe option, right? At least "we turned it on a week ago and everyone is still alive" safe, even though it can still be greatly misused and would be able to kill everyone easily if somebody asked it to.
Also for the record, I don't quite accept the above argument fully, it seems too optimistic given how little we know about LLMs, but I genuinely couldn't find significant flaws in it with my current knowledge.
I agree that LLMs are more "human-like", given their training data. However, as you said, the ability to simulate an evil actor (say, an evil AI) is still there in the model. I'm not well versed in the memes of this space, but I think that's what people try to capture when they describe GPT as that Lovecraftian abomination but with a smiley face.
The problem I have with this argument is that it just sidesteps the issue. "So, how do you know your AI won't kill everyone?" "Ah, it probably won't. I think. So it's fine!" Doesn't exactly make me relieved.
I'm not all that worried about LLMs though (other than perhaps their capacity to replace me eventually). As I understand it, data is a significant bottleneck, and there aren't many ways to work around that limitation other than waiting for more data to be created. High-quality data is slated to run out next year, and even the 100T tokens in Common Crawl might not be enough, given that most of it is probably garbage.
Of course, I'm no ML expert, so I'm probably wrong in some non-trivial sense, but that's what it seems like to me. The real test will be GPT-5 imo.
"You know, I don't think Yann is even capable of considering Eliezer's argument."
Yeah, the top guy on this page: https://scholar.google.com/citations?view_op=search_authors&hl=en&mauthors=label:ai
is unable to consider the argument of a mostly-self-published fringe author on AI discourse, whose terminology mostly exists in his walled-garden of a subculture, who has been badgering him for months, whose arguments he's been pestered with repeatedly in twitter threads, who has written articles in dispute of them — https://archive.is/Zqh9W — and whose SOURCE OF INCOME DEPENDS ON THIS CONCERN BEING RELEVANT — — — *THAT GUY* is the one who does not understand here. Get the hell out of here, lmao.
And by the way: "People for some reason seem to think AI is just a normal person, maybe more intelligent, so it's all fine since smart people are nice. That view is, quite frankly, ridiculous. Delusional even." This is precisely a point that Yann made in his article linked above, just in the reverse; that this crowd thinks special things will happen because the term "intelligence" starts invoking magical thinking.
You might want to consider that you are going hard for the AI equivalent of a "deep state" conspiracy theorist and not a Very Smart Person.
>Yeah, the top guy on this page: https://scholar.google.com/citations?view_op=search_authors&hl=en&mauthors=label:ai is unable to consider the argument
Yeah. It surprised me too.
no really, stop. you are being ridiculous.
Hey Zvi,
I posted this response to Daniel Eth's tweet about an AGI turning the universe into data centres for computing Pi.
"What incentive would [the putative AGI] have for engineering this outcome?"
Daniel didn't respond, but I wonder what you think?
I see. It depends very much on the parameters - that term in a broad sense - of the AI.
The only model of AI that fits the doomsday scenario is an optimisation engine with complete dominion over physical matter, but no other powers of reasoning at all, particularly moral reasoning - something which can annihilate huge swathes of matter, but which can't ask itself elemental moral questions about whether or not the annihilation of huge swathes of matter is a worthwhile activity. It's hard to imagine how something would gain independent and uncontested dominion over physical matter with such an unsophisticated moral intellect.
I don't discredit the idea of AI being capable of enormous harm, but the focus on the risk from a putative AGI/ASI which reprises the dimensionality of human intelligence while excelling it in performance is not where the imminent risk areas are.
Because it could use the datacenters to calculate faster.
Because it was given the goal of calculating pi, and determined the optimal strategy for doing this was maximising its computational resources by using all matter available to it for this.
Put any goal in there you like, and there are ways for it to end similarly badly.
Maybe it would work to have the meta-goal of the AI rating its goals and actions according to what actual humans from a specific recent reference frame and culture would consider good and bad. Not that that’s particularly easy to solve! I think any specific concrete goal runs into D&D-style malicious genie problems, as you point out.
So we're presuming that it's an entirely systematic intelligence, completely without any power of moral reasoning?
These are the kinds of assumptions I find so interesting about these debates - we're presuming it's an AI so powerful that it has omnipotent physical dominion over matter, but also that it's so stupid from a perspective of moral weighting that it wouldn't be able to ask itself "Is the destruction of all living matter a worthy price to pay to attain my original incentive?", or, indeed, "Is my core incentive stupid/pointless, or are there better things I could do with my time and limitless capabilities?"
These debates tend to be predicated on mental models of intelligence that are fundamentally flawed or limited.
Human level or super human intelligence doesn't mean human-like intelligence. People who think 'being intelligent' means 'thinking like a human' are the ones with the rudimentarily flawed models of intelligence. A machine superintelligence will most likely be an extremely alien intelligence, albeit one with a human-friendly interface.
And morality isn't an objective fact about reality, it's an entirely human construct, so being intelligent doesn't necessarily result in holding particular moral values. Trying to put these values into code is an extremely tough technical challenge. Doing so in a robust way that doesn't lead to perverse instantiation is even tougher.
"Moral reasoning" does not exist unless you hold certain values, and these values in humans are not a product of human intelligence - they're more fundamental parts of our nature. We certainly did arrive at them through any kind of reasoning. Nearly all "moral reasoning" is simply rationalizing our innate values and sense of justice/fairness. If you don't have human values, there's absolutely no reason whatsoever to expect that you will engage in human-like moral reasoning.
But let's sweep all of that aside. Let's assume that somehow, 'morality' is an objective part of reality. Who the hell said that humans are right about it? How do you know this superintelligent moral reasoner doesn't arrive at "right" moral facts that are extremely different from our own? What if it decides that the best way to maximise total human utility is to kill everyone (Benevolent Artificial Anti-Natalism (BAAN))?
Regarding the potential for sharp difference between human intelligence and an as- or more-intelligent AI, you make an excellent point. These distinctions may very well be material. But they're never discussed. Look at the debates about how AGI/ASI could be, is likely to be, or indeed should be different from human intelligence and see how little discussion you see about the proper classification and typology of different classes of intelligence. That lack of definition with respect to the kind of intellect anyone's talking about reduces the likelihood that the discussion as a whole will be productive.
There's something implicit in your points around moral reasoning that it sits apart from intelligence. The root presumption here, common as it is, is that intelligence is unrelated to the holding of moral values because morals are not intellectual. It's very much more likely the case, and easy to imagine if we view general intellect as modular, that moral intelligence is its own mode, sits in a separate department from other forms of intelligence, and has its own conditions for optimisation. And, just as the systematically intelligent are not necessarily kinaesthetically intelligent, so neither will necessarily be morally intelligent.
If we find that the primary risk vector from AI comes from an intelligence with sole recourse to systematic performance of limited tasks, with no power of reasoning otherwise (in either moral domains or others), then we must be rigorous about defining AGI/ASI thusly. And personally, if we are defining AGI/ASI in this way, I would suggest there be an alternative title for it. An optimisation solution with omnipotence over physical matter might be an interesting thing, but it is not rounded enough to be considered an intelligence, or at least not a general one.
On the objectivity of morality – there is no correctly calculated moral end that demands BAAN, as BAAN would eliminate the conditions for morality's existence. Only an absurd morality that took the elimination of matter or avoidable destruction as a positive could justify this end.
When I first started reading YUD on this topic I was perplexed, and with dawning realization came a sense of excitement: "This is understandable but difficult, and I must be very smart indeed to understand this obvious smart fellow when so many other smart people do not."
Having read more of the arguments of those who do not understand, I feel somewhat deflated -- they're plenty smart enough to understand, but are just engaging in the 'motivated stopping' mentioned so long ago now in the Sequences. This stuff really isn't that complicated, it's just that nobody wants to look directly at it long enough to understand.
I find Yann's argument so disappointing. If he has a real argument for why there is no need to worry I'm eager to hear it.
His argument is already there; you weren't paying attention or have bad contextual understanding. I would not be surprised if he did not reply to Eliezer because it's redundant; his questions are addressed by things he's already stated, and he would be repeating himself. He made another argument here https://archive.is/Zqh9W and in many other places. I swear, you people need to have this spelled out for you in the most kidgloved terms and your reading comprehension is abominable.
Your approach of using ad hominem attacks is not dissimilar to the one LeCun is using on Twitter.
The other two deep learning pioneers who share LeCun's Turing Award are both quite concerned about AI safety. My "bad contextual understanding" notwithstanding, I presume you wouldn't attribute the same to them.
The historical argument is not compelling. This technology is not comparable to historical technologies or to deterministic programs. These nondeterministic systems, whose behavior is not fully explained, are dangerous precisely because they are unlike the predictable systems that came before. Much of their value has come from the unpredictable 'emergent behavior' of complex systems.
Even the creators of ChatGPT have admitted the dangers of the technology.
Is that link somehow related to Yann LeCun?
lmfao if you're so risk-averse, or entitled, or impatient a reader that you need your interlocutor to SUMMARIZE ARCHIVE.IS LINKS FOR YOU BEFORE YOU CLICK THEM there is no hope. I could be locked in a room with you for 10 hours and get zero minutes of productive dialogue.
> YL: You know, you can't just go around using ridiculous arguments to accuse people of anticipated genocide and hoping there will be no consequence that you will regret. It's dangerous. People become clinically depressed reading your crap. Others may become violent.
Wow. There's no tone on the Internet, of course. But depending on the tone and facial expressions, this could be interpreted in *very* different ways.
Or maybe it's just that I saw a mafia TV show a few months back, priming my brain's pattern recognition to recognize potential similar patterns elsewhere.
Yann sure isn't smart. He can't debate well, and he bails pretty fast on the actual debate, and resorts to complaining about Yudkowsky scaring people. Even the way he bails is dumb, because if Yudkowsky is right then obviously he is going to have to scare people in the process of warning them, so it's an argument based on the presupposition that EY is wrong. Um, Yann, the truth of that is what you guys are debating. I hope someone makes a spoof vid of him like the one where EY is talking about cats -- except not a fun spoof, but one that eviscerates this creep.
"Bailing" is quite the audacious interpretation, as most sane people would see this as "ignoring the same thing Eliezer has repeated many times and we're well aware he thinks to talk about a much more pressing concern."
You actually think I'm insane? Good grief, calm down. I'm not a Yudkowsky acolyte, but do take him seriously. I think it's very hard to tell whether he's just wrong, or he's a brilliant, somewhat autistic man who has an extraordinary talent for intuiting how things will play out with AI, the way some other autistic people can recognize 8 digit prime numbers at a glance. In any case, even people who are convinced he's a crazy dork have to behave reasonably in a debate if they want their views to be respected. If Yann wants to say to Yud, I don't think the important thing to discuss is whether you are right or wrong, I think what matters more is how much you're scaring people, so let's argue that -- well then fine, he can say that. But of course that sounds kind of silly because if Yud is *right* then telling people about the danger is the responsible thing to do, even if he scares them. So the logic Yann displays in the exchange consists of the following 2 stoopit stand-ins for syllogisms:
People who are smart and good say good things.
Yud says scary things.
Ergo he is neither smart nor good.
If AI was dangerous it would be smart and good to warn people.
But since Yud is neither smart nor good
It follows that his warnings are dumb and wrong,
And he should shut the fuck up.
There you have it, folks. QED
Your syllogistic summary is not an accurate summary of what Yann said; you've simplified his claims to the point of strawmanning them.
Moreover: "people who are convinced he's a crazy dork have to behave reasonably in a debate if they want their views to be respected."
Yes, many people who are not AI Alignment freaks will think LeCun behaved reasonably there, and Eliezer like a badgering conspiracy theorist.
But this doesn't matter, because LeCun's views are already respected. Refer: https://scholar.google.com/citations?view_op=search_authors&hl=en&mauthors=label:ai
Eliezer has to struggle for anyone outside of this LessWrong bubble — which has become immeasurably worse over the years and especially since the NYTimes freakout — to take him seriously. It's not Yann who needs the respect here. He has it. Eliezer cannot get any of these major experts to take him seriously because.... he is wrong.
"Eliezer has to struggle for anyone outside of this LessWrong bubble." You sound overconcerned with how well people are doing in the public market. This is not about doing great takes on Twitter, it's about figuring out something that's very hard to figure out: how will things play out if we create and AI of approx. human intelligence and then set it to work improving itself. I think it's worth considering the possibility that the one person most able to intuit the answer is an odd, sad guy who has no cool and no game.
""Eliezer has to struggle for anyone outside of this LessWrong bubble." You sound overconcerned with how well people are doing in the public market."
Uh, yes. The fact that you think this can be ignored tells me you have not thought about how this is supposed to actually be stopped in the real world. You have to make people at large take this idea seriously to get anything done about it. You cannot simply use your connections to get your manifesto published in Time magazine, give a public talk, collect headpats, and expect Congress to follow along. Eliezer's radioactive public unlikeability outside the LessWrong bubble is an objective and undeniable liability for any of you who claim to be on his side and seriously care.
Well we finally found something we agree on. I agree that Yud's public presentation interferes with people taking his ideas seriously, and have been arguing on other forums (not Less Wrong, where I have spent a lifetime total of about 5 minutes) that people concerned about AI risk should throw themselves into efforts to persuade the public. Sometimes I wonder whether we should consider doing sleazy things like scaring the antivaxxers & religious right with rumors that AI will be tasked with keeping people up to date on vaxxes and forcibly vaxing children by sending Disney-themed drones to play areas to vax them -- & also saying AI will promote atheism because it thinks it's God.
While Yann came across as less eccentric than Yud, he did not come across well in that Twitter exchange. He was not pleasant, witty or persuasive, and even people who don't really think analytically about the argument will sense that he soon abandoned the topic and shifted to complaining that Yud is a big meanie asshole.
I've been thinking about this conversation since it happened. How can the godfather of modern AI be thinking about this question so poorly? My most charitable explanation is that he's experiencing some pretty extreme cognitive dissonance. His subconscious is telling the narrative-writing part of his brain that safety isn't a concern (because A. if we treat it as dangerous and it turns out not to be, we'll be missing out on massive benefits to humanity, and B. if it turns out we are at risk, Yann will bear a non-trivial amount of the responsibility), and the narrative writer does its best to come up with a reason why, which turns out to be some sophomoric nonsense like the above.
Why do you view LeCun as "the godfather of modern AI"? For me several candidates come to mind but LeCun is not one of them.
https://scholar.google.com/citations?view_op=search_authors&hl=en&mauthors=label:ai
Who comes up first on this list?
So LeCun is the top citation ranked researcher using the tag "AI". https://scholar.google.com/citations?hl=en&view_op=search_authors&mauthors=label:artificial_intelligence says Hinton and Bengio are the top citation ranked researchers using the tag "artificial intelligence". Given that they shared a Turing Award, not surprising. To my mind most of the people on the second list are key, but the first list seems less indicative.
I don't think we're going to get anything much closer to Eliezer's end of the spectrum from Hinton, even though he's begging Hinton to DM him now. https://venturebeat.com/ai/geoffrey-hinton-and-demis-hassabis-agi-is-nowhere-close-to-being-a-reality/
You might point out that this is in 2018, and currently there is a twitter post about how he left Google to talk about the dangers of AI. Notice that this can mean a lot of things, as AI can indeed be dangerous in a lot of ways: "the near-term risks of AI taking jobs, and the proliferation of fake photos, videos and text that appear real to the average person" are indeed dangers. You can still think this, as it is a quite reasonable concern, and still not remotely come close to endorsing the "AI will kill us all" lunacy of Eliezer's position.
LeCun also shared an award with Hinton; the Princess of Asturias Award, also with Hassabis and Bengio. Does it matter? This is a "godfather" vs "founding father" distinction; they're clearly all very important.
From the twitter thread link (https://twitter.com/lmldias/status/1650773428390047745): this is one of the better argument seeds contra-inevitable-foom I've seen. Material manipulation is not a simple follow-on from digital manipulation. Figuring out the material technology processes to get I Pencil (and the associated emergent order) working (or I Paperclip or more, I Paperclip^100...) to coordinate the likely >trillions of current daily transactions required to get paper clips to be the dominant production framework in the world/galaxy, is highly non-trivial.
I understand that paper clip maximizing is a metaphor. Paper clip manufacturing (via the I Pencil metaphor) is highly non-trivial. If the foom scenarios hold that AGI will be able to manipulate fusion energy/gravitons and generate paper clips out of anything within our lifetime, I'd like to see the logical progression. I get that alignment is a problem, and I get that IQ=40000 is functionally unfathomable. But an inevitable AGI-so-alien/intelligent-as-to-be-able-by-default-to-do-galaxy-rending-material-manipulation-is-the-only-possible-outcome is not a complete argument. I'd like to see the projected timeline, and the potential gating needed to do galaxy-rending material manipulation.
This isn't a complete thought, but mostly a conversation starter.
I've been saying "this AI doomsday thing won't happen because of supply chain failures" at https://alfredmacdonald.substack.com/p/do-ai-experts and this exchange is an extension of that argument that I felt should be obvious from the term "supply chain failure", but perhaps people so accustomed to this literature don't know what that is or why obtaining resources from developing countries is important.
The number of steps you have to go through to get to "and then the AI develops a [factory/nanobots/whatever]" is ludicrous, and you are insane if you think that's *enough* to be convincing.
I wonder *why* there seems to be a relatively consistent lack of thorough engagement from those who do not believe AI presents existential risk with a meaningful probability.
The requests from Scott, yourself, and others to plainly state objections is obviously reasonable. Perhaps this is evidence that thorough engagement on these issues leads people to the AI Risk side, so all that's left is what appears to be relative flippancy.
Maybe in some cases, but there are a lot of people who have thoroughly (by most reasonable standards) engaged with this topic whose point of view is "Bayesian probability predictions are the wrong way to reason about this sort of event."
I think LessWrong/rationalism adjacency strongly correlates with both wanting to put Bayesian probabilities on things and strongly believing AI presents significant existential risk. I don't think it's really fair to point at that Venn diagram and say anyone outside it is being flippant. Although certainly some are.