37 Comments

"The cancer case is likely similar to the asthma case, where slow developing cancers lead to more other health care,"

I'm a great example of that: Prostate cancer saved my life, because the preoperative physical for the surgery discovered my lymphoma while it was still stage 1. And as lymphoma only really has a good cure rate when it's accidentally discovered early, most lymphoma survivors have similar stories.


> Terence Tao: These are extremely challenging. I think they will resist AIs for several years at least.

This is a pretty substantial claim. Tao is both a brilliant mathematician and someone who has put in real effort to engage with generative AI and evaluate its capabilities for research-level math. If AIs start doing well on this benchmark before "several years" have passed, that would be worth a significant update in AGI timelines.


I think Tao has reserved the right to change his mind if the time comes, but he also has the privilege of being high in the mountains as AI makes its ascent, which gives him particular insight into the hurdles it has to overcome.

On the other hand, AI meeting Tao's expectations within the next several years also means that AI would be expected to start performing in the top 1%, 0.1%, 0.01%, etc. over that span. Even being on pace would be significant progress.


> Exceeding human-level reasoning will require training methods beyond next-token prediction, such as reinforcement learning and self-play

So AGI will be brought on by...mental masturbation?


I'm a retired Computer Systems Engineer. Garbage in, garbage out has always been true. Any AI model built with input from "X" or a myriad of other misinformation sources is going to be susceptible to hallucinations and worse. Are there any at this time that have been carefully fed only the truth? Probably not. This is not a good bedrock to be building on.


Hallucinations aren't necessarily a product of misinformation being available. Sometimes there will be no information at all and the LLM will hallucinate to fill in the gaps. Even models like Claude that are fed curated datasets are prone to a decent amount of hallucination. Luckily there are many ways to circumvent these issues by providing additional instructions.

Garbage in, garbage out is true indeed, but it's not all garbage in, and there are steps you can take to filter the trash from the treasure.


I'm contemplating writing an article called "Hallucinations Will Kill The Dream Of AGI!" It's a deliberately provocative title, but the more I work with LLMs, the more amazed I am at how useless even the most advanced models are at so many tasks because of hallucinations. The form the hallucinations take is stunningly broad: from research articles to case law, historical weather reports to simple baseball statistics, LLMs will confidently provide the wrong answers. Even so-called specialized models and wrappers for research will return results (i.e., supposedly relevant articles) that are completely fabricated.

Most incredibly, even when the information is uploaded into the context window (which dramatically helps but renders models much less *generally* useful, since the user has to basically do his or her own initial research), models will still hallucinate the wrong answer even when the correct information is "in context" - i.e., immediately available to the model. As you say, additional instructions (prompt engineering) and chain-of-thought protocols can avoid some of the most egregious answers, but they make the models less useful and, more importantly, do not necessarily work.

From what I can tell, more compute and more data will not solve the hallucination problem, yet those topics get much more attention and coverage than the fact that LLMs will simply make up "facts" because they seem plausible, without even alerting the user to the errors. Most important work in the world requires factual accuracy. Put another way, AI that cannot provide and process factually accurate information even when the information is perfectly clear, readily available, and uncontroversial (I'm not talking about "facts" that reasonable humans might debate) will never be AGI.
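
As an illustration of the kind of "additional instructions" and in-context grounding discussed above, here is a minimal sketch in plain Python. It is not any particular vendor's API; the `call_llm` function it mentions is hypothetical, standing in for whatever model call you actually use, and the prompt wording is just one plausible variant.

```python
# Minimal sketch of a grounding prompt intended to reduce hallucinations.
# `call_llm(prompt) -> str` is a hypothetical stand-in for a real model call.

def build_grounded_prompt(question: str, context: str) -> str:
    """Wrap a question in instructions that restrict the model to the given context."""
    return (
        "Answer the question using ONLY the context below.\n"
        "If the answer is not stated in the context, reply exactly: I don't know.\n"
        "Quote the sentence from the context that supports your answer.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

if __name__ == "__main__":
    context = "The 1927 Yankees won 110 regular-season games."
    question = "How many games did the 1927 Yankees win?"
    prompt = build_grounded_prompt(question, context)
    print(prompt)
    # answer = call_llm(prompt)  # hypothetical model call
```

As the comment above notes, this kind of wrapper helps but does not guarantee fidelity; the model can still contradict text that is sitting right in front of it.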


One consideration - don't humans sometimes "hallucinate" facts in the same way? After being called out, they might say, "I thought I remembered X, but I must have been mistaken," but they don't realize they were wrong until the evidence is shown to them. Humans also regularly fail at reading comprehension.

Obviously AI hallucinations are a problem, but even if the issue can't be eliminated, that doesn't seem like it precludes the existence of AGI.


Not really. We may forget or misremember things, but if you ask someone how to operate an SSGN-class submarine, they'd probably say no instead of giving you a detailed and thorough list of instructions that sounds fine to anyone unfamiliar but is in reality completely made up.

You're definitely right that the existence of hallucinations doesn't preclude the possibility of AGI; if anything, it heightens the need for protocols that minimize hallucinations, since an AGI hallucinating could have disastrous consequences.

The other issue with hallucinations is liability, which corporations have largely shunted onto users. Legislation may rectify this, and there are ongoing lawsuits as well, but that all takes time.


I definitely agree with the first point. People do misremember and hallucinate things all the time, primarily in informal or low-stakes contexts, but they typically don't take important actions without confirming the accuracy of key facts. (Obviously, there are unfortunate exceptions.)

On the second point, I suppose the answer depends on the definition(s) of AGI. Lawyers won't be replaced by AI agents if the pleadings and briefs submitted to the courts are full of made-up citations to legal authority that doesn't exist (or even case facts that are simply hallucinated). Doctors won't be replaced by AI agents if the agents are providing treatment recommendations based on hallucinated studies (or medical history that is *mostly* true but sprinkled with randomly hallucinated facts about the patient). I agree that there can be powerful and powerfully useful AI even with hallucinations, but from talking to non-experts trying to implement and work with LLMs now, hallucinations are a fundamental and huge problem. In the AI community, the attitude seems to be "Eh, that's what the models do. We'll eventually solve it. Or not, but it won't really matter."


For sure everyone's got their own definition of AGI, but even if full AI agents aren't coming in the immediate future, the practical adoption of AI in industry is underway. AI writing can save lawyers plenty of time by drafting documents that are often grounded in templates and highly formulaic. AI is assisting doctors in treatment planning, diagnosing ailments in some cases, and doing paperwork (this is the biggest one). Obviously experts in the field can filter and verify the output, so hallucinations won't be the end of the world there (for now), but most fields are quite large, so AI will end up being relied on to fill in people's blind spots.

Altman's strap-a-dumpster-fire-to-a-rocket approach seems to be the go-to strategy now, given that Apple, Google, Microsoft, and Meta are juicing their product lines with AI; the data they collect from those features will probably be worth more to them than the value most users extract (those Apple Intelligence summaries are pretty wild). How the ordinary person will grapple with and use AI is an important question, and hallucinations become a much bigger problem if most people don't verify outputs, something I'm not optimistic about.


I don't want AI bird watching binoculars for bird watching, I want AI kudzu killing binoculars for kudzu killing.

https://preview.redd.it/dcfnmv0jz84d1.jpeg?auto=webp&s=4fac6de72e4cf9fec2c8b1145c182f1a57c7960f


> Wolfram goes full ‘your preferences are invalid and human extinction is good because what matters is computation?’

No, he didn’t. He asked for a clarification, along the lines of (not exact quotes) "when you say that humans being replaced by a successor, unaligned, artificial species is a bad thing, do you mean bad in a parochial sense, from the narrow point of view of humans, or do you have a more objective notion of bad?" Yudkowsky said it was the first case, and yes, Wolfram said something close to "this seems more spiritual and unscientific," but he also followed up with "but I agree that as a human I too would very much prefer it if we could make it."

Yes, there were a lot of unfortunate, not-very-productive-for-the-question-at-hand philosophical side debates (I feel the moderator should have intervened quite a few times, but both Yudkowsky and Wolfram made it very clear to him, on his first intervention, that he was out of his depth). Like "you say AGI can make better technology than us, but do we have a sufficiently good definition of technology that we can say for sure that one technology is better than another one" - c'mon, man.

I *think* I can pinpoint where Wolfram and Yudkowsky talked past each other (I’m pretty familiar with both of them, having avidly read all the Sequences from the latter and religiously followed most of the former's live streams while he was working on his Physics Project). Both agree that under the current paradigm, inner alignment is hard, in the sense that the outer objective is reached via completely alien and non-understandable sub-goals/proxies/heuristics. But for Wolfram those subgoals are like his simple cellular automata: while they can exhibit very complex behaviors, there is no reason to think that they have to have some kind of goal-directedness, which is the dangerous part. Yudkowsky did not respond to that because he did not get the point (to be fair, Wolfram did not make that point very clear), and that very important part of the debate was rushed into the last minutes of a long session, because so much damn time had been lost to philosophical rabbit holes.
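
For what it's worth, here is a minimal sketch (plain Python, purely illustrative) of the kind of object Wolfram has in mind: Rule 30, an elementary cellular automaton whose one-line update rule produces complex, hard-to-predict behavior while having nothing resembling a goal.

```python
# Rule 30 elementary cellular automaton: each cell's next state depends only
# on itself and its two neighbors. Complex-looking structure emerges from a
# trivial rule, with no goal-directedness anywhere in the system.

RULE = 30  # the update rule, encoded as an 8-bit lookup table

def step(cells: list[int]) -> list[int]:
    """Apply one Rule 30 update to a row of 0/1 cells (boundaries fixed at 0)."""
    padded = [0] + cells + [0]
    return [
        (RULE >> ((padded[i - 1] << 2) | (padded[i] << 1) | padded[i + 1])) & 1
        for i in range(1, len(padded) - 1)
    ]

if __name__ == "__main__":
    width, steps = 61, 30
    row = [0] * width
    row[width // 2] = 1  # single "on" cell in the middle
    for _ in range(steps):
        print("".join("#" if c else "." for c in row))
        row = step(row)
```

Whether LLM subgoals are really this kind of thing is, of course, exactly the point on which Wolfram and Yudkowsky talked past each other.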


I didn't listen myself, so - you're saying Wolfram didn't actively assert an alternative notion of preference or value, he simply went down a rabbit hole where he didn't accept the premise that human extinction was bad, without actively disagreeing with it?


Trying to capture the essence of the discussion on this specific point with a few excerpts:

https://www.youtube.com/watch?v=xjH2B_sE_RQ&t=1345s

Yudkowsky: "It’s not clear to me you can wipe out humanity and replace it with arbitrary stuff, and everything just gets better as a result"

Wolfram: "I don’t know what better means. Better is a very human concept"

(segues into a debate about consciousness and ethics)

https://www.youtube.com/watch?v=xjH2B_sE_RQ&t=35m10s

Wolfram: "I viscerally agree with you. Scientifically I have a bit of a hard time in a sense that feels like a very kind of spiritual statement, which is not necessarily bad, but it’s just worth understanding what kind of a thing it is. It is saying that there is something very kind of sacred about these attributes of humans and that we have perhaps even a higher purpose to which we don’t really know where it comes from. [...] As humans who like doing what we’re doing, it would be nice if we could go on doing that without all being killed by AIs"

https://www.youtube.com/watch?v=xjH2B_sE_RQ&t=37m15s

Wolfram: "One question is what’s the right thing to have happen? I don’t think there’s any abstract way to answer that. I think it’s a question of how we humans feel about it. And I think you and I seem to feel — I know I feel that — that preserving the kind of things humans do is a good thing"

My overall understanding is that it’s essentially "yes, we would prefer existing, but the AI would probably prefer existing too, and there’s no objective way to decide who’s right; it all boils down to subjective preference, and I agree that subjectively, as a human, I prefer for humans to exist".


I saw Pliny got one of his agents to successfully sign up for a Google account so that's probably going to continue to get weird.


AI lab people saying "AGI real soon now" is as overdetermined as Biden saying he was definitely running, and for similar reasons. Hype isn't enough to succeed in this space but any admission of weakness is a self-fulfilling prophecy given the scale of support needed. So I update on it only to the extent of "there's a chance".

I forget whether it was you or someone else who linked me to it, but I saw an analysis claiming that the scaling laws still technically hold; it's just that the dependent variable ("loss") isn't translating into big user-facing gains the way it was in the GPT-2-to-GPT-4 regime (see the toy sketch after this comment). Which could point to diminishing practical returns on the training data, or something more fundamental.

Yes, there's a lot of effort going into innovation and more breakthroughs are likely given how young the field is, but I don't think they're inevitable. This is a textbook anti-inductive domain-- "if I knew what was going to happen I'd already have made it happen"-- and if we try to take the historical outside view on progress we over-index on successes: yeah, Moore's Law etc., but we're only talking about that because we know post-facto that it held. There are plenty of other bombs out there that didn't go off.

So, bottom line, I'm inclined to take the wall seriously, including a chance-- under 10 percent, but live-- that a lot of the foundation model funding dries up in the next few years and the big labs pivot back toward applications.
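
To make the "scaling laws hold, gains flatten" point concrete, here is a toy sketch of a Chinchilla-style power law in parameters N and tokens D. The constants are illustrative placeholders, not a real published fit; the point is only that the predicted loss keeps falling while each 10x jump buys less than the last.

```python
# Toy Chinchilla-style scaling law: L(N, D) = E + A/N**alpha + B/D**beta.
# The constants below are illustrative placeholders, not a real published fit.
E, A, B, ALPHA, BETA = 1.7, 400.0, 410.0, 0.34, 0.28

def loss(n_params: float, n_tokens: float) -> float:
    """Predicted loss for a model with n_params parameters trained on n_tokens tokens."""
    return E + A / n_params**ALPHA + B / n_tokens**BETA

if __name__ == "__main__":
    for n in (1e9, 1e10, 1e11, 1e12):   # parameter counts
        d = 20 * n                      # a rough "compute-optimal-ish" token budget
        print(f"N={n:.0e}  D={d:.0e}  loss={loss(n, d):.3f}")
    # Each 10x jump in scale shaves less off the loss than the previous one:
    # the law still "holds" even as practical returns diminish toward E.
```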


Indeed, we should remember that the loss doesn't necessarily measure what's important.


I’m going to post some of the gated content from that Eric Schmidt interview, because I thought it was remarkable how little effort he seemed to be putting into thinking things through, and am interested to see if anyone can come up with a more (or less!) charitable interpretation. The main subject of the interview is AI systems in warfare.

>ES: There is a term called AGI or Artificial General Intelligence. It's a reference to our kind of intelligence. It is strategic, flexible, has creativity, that kind of stuff. The current systems are largely human driven. [...] Imagine if you gave it a goal ‘to be curious and seek power’. So the system learns everything, just naturally learning things. But it also aggregates power, which it only vaguely understands. So one day in this mythical scenario, it says, in order to seek power, I need to have some guns. I watched all those movies and in those the most powerful actors were those who had more guns. So it sets out to get guns. It's really important that we establish some lines as to how far these systems can go – and not going to weapons is an obvious one. No one disagrees on this. You don't want these systems randomly learning about guns and then being able to use them. For moral reasons but also for simply safety reasons, you have to put guardrails in place.

[Pretty handwavey, but ok, everyone agrees instrumental convergence might be a problem, good to know… but then, a little later, the interviewer Lawrence Freedman makes the obvious counterpoint re AI and guns:]

>LF: If you're after military power, then this is where you need the guns, and so you have to attach the AI to this. You've got to say, here are the drones, you can now work out what to do.

[is it me, or does Schmidt actually now respond to the “we are already on track to build the killer robots” point with “ah, but imagine a scenario where only the *good guys* have the killer robots. Pretty good, no?”]

>ES: War historically has been man against man. Whoever wins has to kill the other person. This is brutal. Now we can separate humans from the guns, so the wars are between robots. [...] Both sides now have flamethrower drones, and bomber drones, and so forth. Does all of this make destruction easier? That would be the cynic’s view. I would argue the inverse, that robotic war, had it been available to Ukraine, would have allowed them to destroy those tanks as they rolled into Ukraine from Russia and Belarus.


What on earth is going on here?


> I find the answers here unsatisfying, and am worried I would find an ASI’s answers unsatisfying as well

The ASI will send you a nanobot that will surgically tweak a few neural connections in your brain to change "unsatisfying" to "satisfying" or "worried" to "not worried". Boom, problem solved.


And then administer a survey asking you to rate your satisfaction with its customer support.


On Richard Ngo leaving... one thing I do not understand is... Let's say you genuinely and sincerely believe that OpenAI will bring the world closer to an end. You work in the OpenAI office, you interact with its key people every week, you believe OpenAI is evil. You genuinely do, right?

Quitting the job is certainly an option. But if one were to GENUINELY believe that OpenAI is expected to destroy at least a billion lives - oh boy, is quitting NOT the first approach that comes to mind. The only question is: did Richard not actually believe this, or did he believe it but lack the willpower for a more effective step?

The same question goes for Eliezer, who publicly believes it with all his heart but then takes selfies with the very people who, according to Eliezer, are about to destroy all that he holds dear. No genuine belief, or no courage?


This kind of thought had crossed my mind too. If you genuinely believe AGI will kill everyone, and you believe that a group of people will shortly be inventing it, what you have on your hands is a trolley problem.


I would love to read Eliezer's or Zvi's take on why that approach is not the right move. Why not do the pivotal act that saves the world all by yourself?


Think this through though - how can one person with role-limited access completely destroy the whole company's ability to progress towards AGI? They can't.

Even if they literally blew up the building, the models are distributed in various datacenters, most of the code is in cloud repositories, and so on.

There is no action available to any individual, even an insider, that could be said to stop progress towards AGI with any sort of reasonable likelihood. But any of those actions WILL ruin your own personal life, ruin your family's lives, hurt a lot of people you consider friends and colleagues, and so on.

You seem to think there's a clear "switch" you can throw to divert the trolley - I don't think that switch exists.


Right - almost certainly you yourself would not succeed in preventing AGI directly, so you would then have to ask whether or not your actions would make it more or less likely overall. It doesn't seem clear what the second-order effects would be, at least to me, so one possible outcome is that you would make things worse.


It's a special kind of trolley problem, one where you are not hovering above it but are actually on one of the lines, next to your children, family and everyone else you've ever loved. We are talking about a *genuine* and *urgent* belief that everyone will die.

From that perspective, you can imagine how the objection "but it will ruin your personal life!" might seem a little... thin.

It's called a sacrifice, and people take very dramatic actions for a little probability of change all the time. It's actually very common. It ranges from whistle-blowers like Snowden to outgunned legal battlers like Erin Brockovich and Earl Tennant/Robert Bilott (vs. DuPont) to, yes, people willing to take extreme action (typically labelled insane or terrorists, until we sometimes change our minds about that later).

I don't think there is a simplistic off switch, because there's no simplistic Skynet-style on switch. In writing "trolley problem" I suppose I did signal that I imagine someone can simply kill, say, 1,000 people in order to save 6 billion. It's obviously more complicated than that, particularly as there's a kind of "genie out of the bottle" feel to AI. Jordan B's point was about the moral imperative to act, and whether *not* acting is a type of cowardice before your own beliefs. I agree with that point.

But there are plenty of ways to fight the development of the field. Capital could be scared off, the population at large could be riled into action, laws could be passed, individuals could be inspired to extremism, AI researchers could be persuaded that they aren't the "good guys", etc. Our trolley-problem friend might act merely to try to influence any one of these factors, with little chance of success and at great personal cost - unless they are a coward or don't really believe that AGI will kill us all.


I think the point is that "lots of prominent people quitting" might be, to some of them, the most promising way to scare off capital, and political advocacy the way to get good laws. Illegal actions frequently trigger strong backlash.


Most "doomers" (by which I mean 95%-doomers) don't view OpenAI or other AI companies as the drivers of human extinction, and if they were destroyed then everything would be fine. Rather, they see them as more analogous to energy companies: destroying ExxonMobil won't solve global warming because it won't change the fact that fossil fuels are a cheap and profitable source of energy. Similarly, destroying OpenAI won't change the fact that (people currently believe) AI is or will be extremely profitable, and new companies will spring up to replace them, with better security against terrorism—that is what you're suggesting, right?—and that's before considering the knock-on effects, like AI legislation becoming much harder to pass or AI researchers and venture capitalists becoming more resistant to safety measures (already people accuse Eliezer of inciting violence and consider that an effective argument). Even a successful attack, in this view, would be very unlikely to save everyone, and it might even speed up our demise!

If one is considering extreme measures, it is important to ensure that those measures will indeed accomplish their goal rather than merely make one feel one is making progress. The more extreme they are, the more extreme their consequences and hard-to-predict knock-on effects (which in turn make predictions about the future, and about what are good courses of action, even more unreliable). Terrorism in particular is a mixed bag and many times has made its targets more willing to resist.

Eliezer has written on why he does not believe "the ends justify the means," e.g. the following: https://www.lesswrong.com/posts/K9ZaZXDnL3SEmYZqB/ends-don-t-justify-means-among-humans. His response to the trolley problem, mentioned below, is that because humans have a great deal of power-seeking and self-deceptive tendencies he can never be sure "the *only possible* way to save five innocent lives is to murder one innocent person, and this murder will *definitely* save the five lives," but if a friendly AI (designed without human biases) told him it was the best course of action, he would have no problem.

This is quite a pickle for him, given that he believes a "pivotal act" that permanently seizes control is the only way to avoid extinction, yet the would-be pivotal actors cannot trust their own judgment!

Another problem is the large uncertainty involved. We still don't know many things about how more advanced AIs will turn out (if they are even possible), and a seemingly-positive action today could have disastrous consequences later: "nothing ever ends, Adrian." Eliezer's stated position is roughly "probably by 2050, can't say any more," and he even refuses to give a p(doom). Others have longer timelines still and higher estimates of survival. How will blowing up OpenAI in the 2020s affect AI development in the 2050s? Will it even slow it down in the end? If you think you know the answer, you're probably wrong. What to do instead? I think the solution is to accept the loss of a small chance of survival (relative to the hypothetical world in which you did *everything* perfectly) and stick to the simple rule of "try to move humanity to a good, predictable path, and keep it there." Terrorism, inciting mobs, passing destructive laws, etc., even if they appear "worth it," make the future more unpredictable and dangerous in this model.

If someone believed humanity-ending AI to be much closer, by 2026, say, I would share your confusion as to why they didn't do anything against (what they considered) an imminent threat. But that kind of certainty is lacking, and most "doomers" see it as much farther away, and therefore their failure to take extreme actions, whose consequences are unclear, doesn't make them either insincere or cowards (though doubtless some are).

As for Eliezer "tak[ing] selfies with the very people who... are about to destroy all that he holds dear," why is this a problem? It's not clear to me how taking selfies affects AI progress in any measurable way, and people can be friends with (or at least friendly to) those with different opinions. I can't speak for him, but I am highly confident that Eliezer does not consider everyone at OpenAI to be "evil." "Misguided" perhaps, but not evil.


Wow, this turned out much longer than expected! Here's a summary of each paragraph:

Targeting specific AI companies is unlikely to do anything useful.

Extreme actions have hard-to-predict consequences, and even if they appear net-positive or negative, side effects not considered could push them the other way. Be careful when advocating extreme measures.

Eliezer has written that humans have biases that make extreme measures more attractive than they should be.

The future of AI has many surprises in store for us, and most "doomers" view extinction-level AI as something that will not happen within the next few years but at a (highly-uncertain) time within the coming decades. As such, trying to predict the world better and steering the state of play to a "good place" are better things to do.

If your timelines are far shorter, they would then justify extreme gambles, but the fact that no extreme actions have been taken is not evidence that "doomers" as a whole are insincere or cowards.

I don't think there's a problem with Eliezer taking selfies with people working on AI. Politics should not be all-consuming.


One last thing: believing that AI is likely to kill everyone does not necessitate acting against it. Most people are selfish to some extent and value their current well-being and happiness over reducing the chance of the future death of everyone on the planet. After all, I am not spending my weekends protesting in front of OpenAI headquarters, writing letters to Congress, telling anyone and everyone about my concerns, etc.; that makes me selfish, not insincere or cowardly.


I would absolutely rather talk than type.

I - new to AI - am actually seriously thinking of making AI my listening secretary - like Niels Bohr!

If this works, my efficiency in retirement will go up 10x, I think.


1) There are a number of occasions on which I'd rather talk than type; I think better on the move, and find it difficult to type while walking. The editing and clean-up happen sitting down, of course, but ideation goes much better walking in the woods.

2) I laughed out loud at the kids/French Revolution quip (3-year-old and 1-year-old here, maximum chaos zone)

3) ChatGPT already has contempt for humans; this is a genuine recent quote from when it was asked to critique some writing, including assessing whether it looked like it was written by an LLM:

"This certainly has some elements characteristic of AI-assisted text, such as the organised structure, clear language, and concise explanations of complex experiences."

That is, 'Seems pretty smart, probably not a human.'


On scaling hitting a wall - is that just the improvements from data scaling hitting a wall? Is compute scaling still producing results?


Re: the claim that AI will only get better - I also expect as much, but I wonder: is that a given?

Theoretically yes; in practice, though, the industry is heavily subsidized by investment, and end users usually pay only a fraction of the actual cost. What happens if progress stalls and investment runs dry?

What happens to "progress" and "improvement" when AI is only used for slop and advertising, because those are the most profitable use cases?

Also, from now on I'm using "and viola" in lieu of "and voilà", nice baroque twist.
