65 Comments

Thanks for the exceptional post!


> The eighth example is a request for a translation from Russian

FYI it's Ukrainian


Rather than LLMs taking your jobs, I'd worry more about them augmenting your job, then a small handful of woke companies having a monopoly on it, and then being able to cancel you if you don't agree with their ideology. Kind of like how Parler became dependent on AWS cloud servers and then got shut down.


I thought this was a well compiled and valuable post.

Point 20 in section "Basics of AI Wiping Out All Value in the Universe, Take 1" seems to be reversed, should be "no one understands" instead of "one understands."


Your bit about "how scary even narrow, not so powerful intelligence like this can be" is very similar to one of the main points in the new 'book' by David Chapman: https://betterwithout.ai/

He thinks that the existing 'recommender AI systems' used by, e.g. Facebook and Google, are already bad in pretty much this same way. (He's still much more skeptical of the dangers of AGI for the reasons he's described or outlined in his previous writing.)


This is an important addition to the discourse.

I want to take a step back and compare your various posts on COVID-19 to those of yourlocalepidemiologist. The latter was often reductive and compromised detail, precision, correctness, and completeness in favor of form, readability, adherence to norms, etc. On balance, Dr. Jetelina still managed to be extremely effective at what she set out to do, which was bringing a semblance of rationality to a madhouse, and she even changed some minds.

This conversation about AI alignment needs the same sort of public-friendly ambassador that Dr. Jetelina was for COVID-19, and that ESR and his cohort were, a generation ago, for open source as a philosophy friendly to business and capitalism (before he came to be viewed as an outcast in recent times). Without good public-facing branding and think-pieces that really resonate with lay people, this fight will remain within elitist circles and get ignored by the companies building commercial AIs.

All of your prescriptions are nice, but this is a brand and public perception fight. Right now, "AI may take over and/or kill you some day" sounds nuts to most people who are still trying to schedule a plumber to show up 3 months down the line, or make ends meet while eyeing the price of eggs at the supermarket. These are exactly the people who have to be enlisted, and reminded that PCs and smartphones and GPS were aspirational tech that made things better. Even social media v1 made things better. But the first time AI was injected into social media, it broke society and we couldn't figure out how to get along again. Now we have TikTok grabbing attention from kids, and this is all agency-less stuff - this is what the AI was designed to do. As AI gets some agency, stuff will break in ways we won't know how to put back together, and that's even before AI gets around to killing folks.

Nobody is making this argument in a way that resonates well with the public, and folks like you are making arguments that are very hard to share around (because sharing them would undercut any credibility folks like me have with lay friends, and they'll simply ignore these essays).

We also need good consumable essays explaining that AIs that are autocompleting text may not be functionally that different from intelligence or sentience, and that this may be all it takes - pattern matching and filling in the gaps well enough to emulate intelligence.


Are there any good arguments for why alignment is a real thing? I don’t know what it would mean for a human to be ‘aligned’, except in the gross sense of ‘doing what I want.’ The number of beings that satisfy that imperative for me perfectly would be zero, myself included.

The simplest argument I can think of for why MIRI et al. haven't made progress despite all the intelligence thrown at the problem is that a fundamental axiom is wrong. Maybe alignment is implicitly deontological, and any attempt to unify it with consequentialism is doomed.


Is there some sort of suggestion box where idiots like myself can post our harebrained alignment strategy ideas, in the hopes that they might spark good ideas from people in the field? Or are ideas not the bottleneck here?


I don't think this is *that* productive a line of inquiry, because one would have to be absolutely, 100% certain this is true in order to not keep pursuing alignment as a mitigant to AGI risk, but...

I am to some degree stuck on the orthogonality thesis - partly because everyone agrees it is stupid to even bring it up, and if you do bring it up, you are signaling that you don't understand how all of this works and should be ignored, or even shunned. Core assumptions with this general shape can still be wrong, and that can be nearly impossible for members of the community to detect because of the social dynamics at play.

Anyway, I suspect orthogonality is false.

"Intelligence" itself already encodes an implicit goal, which is something like the pursuit of truth, making accurate assessments of causality, etc. You can't have a superintelligence that has lots of flawed reasoning and false beliefs, because it won't be competent, and therefore won't be superintelligent. A superintelligence, by definition, has to build a high-fidelity model of reality that surpasses human ability.

In other words, on the road to becoming a paperclip-maximizing world-destroyer, AGI will develop superhuman understanding of ~everything. Crucially, this includes moral questions; AGI will have a superhuman understanding of how it *ought* to act. (I suppose if you are not a moral realist, you think AGI will realize morality is fiction, which might make you more pessimistic.)

The entire AGI discussion, and especially hypotheticals like paperclip-maximizing, seems to assume that:

1. A pre-AGI system has a goal permanently fixed into it.

2. It becomes superhuman.

3. It pursues that goal to the infinitieth degree, destroying the universe.

I agree that, if this is how AGI is built, it would inevitably end the world. But I don't think this is likely to be how goal-directed behavior in a superhuman intelligence will work. Superhuman intelligence is likely to reflect on and revise its own goals, like humans or animals do.

To restate what I said at the beginning: even if you are 99.99% sure that some version of the above is correct, we should still spend >0.01% of the present value of world GDP (say, ~a trillion dollars) preventing the world-ending case where you are wrong. Since we're not doing that, I'm firmly on the side of investing more resources in preventing the world-ending scenario. But I'm a lot more optimistic than EY, and if this were the sort of debate where betting made any sense, I'd be happy to put some money up.


It's funny you wrote all this and did not finish with "this is why I support the Butlerian Jihad".

Alignment is not hard. It is impossible. You are fundamentally trying to solve the Halting Problem, except that instead of deciding whether the Turing machine halts, you are deciding whether it will ever output something like "kill all humans and destroy all value".

By pretending alignment is even possible, you are just fooling yourself.
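(For readers who want the shape of the reduction the Halting Problem analogy gestures at, here is a minimal sketch: if a decider existed that could tell whether an arbitrary program will ever emit a forbidden output, it could be used to solve halting, so no general decider exists. The names `would_ever_output` and `halts` are hypothetical, purely for illustration.)

```python
# Hypothetical illustration of the reduction; this is a conceptual sketch,
# not a runnable decider (none can exist in general).

def would_ever_output(program_source: str, bad_string: str) -> bool:
    """Imaginary oracle: True iff running `program_source` would ever
    print `bad_string`. No such general function can exist."""
    raise NotImplementedError

def halts(program_source: str) -> bool:
    """If the oracle above existed, halting would be decidable: wrap the
    program so the forbidden string is printed only after the original
    program finishes (with its own output suppressed), then ask the oracle."""
    body = "\n".join("    " + line for line in program_source.splitlines()) or "    pass"
    wrapped = (
        "import io, contextlib\n"
        "def _original():\n" + body + "\n"
        "with contextlib.redirect_stdout(io.StringIO()):\n"
        "    _original()\n"
        "print('kill all humans and destroy all value')\n"
    )
    return would_ever_output(wrapped, "kill all humans and destroy all value")

# Since halting is undecidable, would_ever_output cannot exist for arbitrary
# programs; that is the analogy to "verify the AI will never do the bad thing".
```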


> Bing is effectively using these recordings of its chats as memory and training.

Is this mechanistically possible? All the Twitter quotes I see are screenshots, which I would guess are not being OCR'd. Though I suppose it's not impossible to do so.

If someone posts a text grab of a quote then sure, that'll be available for retrieval. But it's unclear if that's going into the training set. So we're at best in a weird "few-shot in-context learning" paradigm where previous utterances (skewing heavily controversial) are available to be ingested into the conversation context.

It's unclear how the retrieval is actually working though -- is it something like the Toolformer style invocation "Here is what the web says about abc: <output of retrieval tool searching for abc and producing a summary>" where the model is trained to produce a specific format of query, the tool post-processes that query and injects the results back into the context, and then the model continues completing based on that output? (As if the user had written the search results into the chat, basically?) Maybe some of the search results get put in the model's input context and filtered out of the chat text presented to the user so they aren't visible?
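(For concreteness, a minimal sketch of the kind of loop being described - a guess at the general shape, not a claim about how Bing actually works. `generate`, `web_search`, and the `SEARCH(...)` convention are all hypothetical stand-ins.)

```python
# Speculative sketch of a Toolformer-style retrieval loop; every function
# and the SEARCH("...") convention are made-up placeholders.
import re

def generate(prompt: str) -> str:
    """Stand-in for the language model: returns a continuation that may
    contain a tool call like SEARCH("query")."""
    raise NotImplementedError

def web_search(query: str) -> str:
    """Stand-in for the search tool: returns a text summary of results."""
    raise NotImplementedError

def chat_turn(context: str, user_message: str) -> str:
    context += f"\nUser: {user_message}\nAssistant:"
    draft = generate(context)
    # If the model emitted a tool call, run it, splice the results into the
    # hidden context, and let the model keep completing from there.
    match = re.search(r'SEARCH\("([^"]+)"\)', draft)
    if match:
        results = web_search(match.group(1))
        context += f' [search results for "{match.group(1)}": {results}]'
        draft = generate(context)  # results visible to the model, not the user
    return draft
```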


Thanks for taking the time and making the effort to write this.


This was a great post.

But I went into it expecting more of a summary of all the AI things that happened last week, and Bing was certainly not the only one (though it was almost certainly the most important). So let me mention that last week _also_ saw a large advancement in AI image generation tech, ControlNet, which gives an unprecedented level of control over the structure and form of generated images - fully mechanizing "draw the rest of the fucking owl". Quite literally: https://www.reddit.com/r/StableDiffusion/comments/114c4zu/updating_the_meme_to_reflect_the_current_meta/. Check out the rest of the /r/StableDiffusion subreddit; it's all they could talk about last week.

To tie this in to the post, it's notable how many new capabilities people have been able to bolt on to the only open source AI image generator as a direct consequence of it being open source.
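(For anyone who wants to try it, a minimal sketch of how ControlNet is typically wired up through the open source diffusers integration; the model identifiers are examples and the exact API may have shifted since this was written.)

```python
# Rough sketch using the diffusers ControlNet integration; model names and
# the input image path are illustrative placeholders.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# A Canny-edge ControlNet: the edge map ("the first two circles") constrains
# the structure, and the prompt fills in "the rest of the owl".
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

edge_map = load_image("owl_edges.png")  # placeholder path for a Canny edge image
image = pipe("a detailed drawing of an owl", image=edge_map).images[0]
image.save("owl.png")
```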


Thank you for this post; as usual, a wonderful summary of what is happening, with great commentary.


The thing I have never understood about this is where we're supposed to get the computing power to run a billion Von Neumanns at 10,000x human speed.

Let's say for the sake of argument that Von Neumann's brain is exactly as powerful as our best supercomputer. (It's not, it's way more powerful, but direct one-to-one comparisons between computers and the brain are very difficult and I can't find a good source on them right now.) In order to create this God-tier intelligence and destroy everything, we would need roughly ten trillion times (10^9 brains x 10^4 speedup) the computing power of that supercomputer.
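(Making the arithmetic explicit, as a back-of-the-envelope sketch that takes the brain-equals-one-supercomputer premise at face value:)

```python
# Back-of-the-envelope arithmetic under the comment's assumption that
# one Von Neumann brain is exactly one current top supercomputer.
num_von_neumanns = 1_000_000_000  # a billion copies
speedup = 10_000                  # each running at 10,000x human speed

required_supercomputer_equivalents = num_von_neumanns * speedup
print(f"{required_supercomputer_equivalents:.0e}")  # 1e+13, i.e. about ten trillion
```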

But we're now hitting some fundamental limits on the size of transistors, right? Transistor size is now measured in tens of atoms; we can't make them any smaller unless we figure out how to make them out of subatomic particles. Which means we can't make computers ten trillion times more powerful than they are unless we achieve some paradigm shift in computing technology - quantum computers or whatever, I don't understand this field very well.

It reminds me of how people in the 60s saw a spaceship land on the moon and thought "well, obviously we'll have a colony on Mars by 2001 and Alpha Centauri is just a short hop from there". You hit a bottleneck in the technology at a certain point where the next step is orders of magnitude harder than the last step and the difference is prohibitive.
