41 Comments
Comment deleted
author

Yeah, I misread it, that should be fixed now.


My husband is a lawyer, and among other things, I've been a librarian and a nonprofit employee. I think sometimes people underestimate how much of lawyer, library, and nonprofit jobs are hand-holding. My husband could have a 10-minute phone call with a client telling them precisely how to fire an employee without breaking any laws, but instead he spends 90 (billable!) minutes on the call soothing them and reassuring them that it will all be okay, and listening to stories about how terrible the employee was, and saying "Yes, yes, they sound terrible! You should fire them!". I have many similar stories about helping clients with research and spending much more time soothing them than providing facts.

I know AI can do that too. I wonder how hard the designers are working on it?

author

Sounds like a place where the AI can't hold a candle to a human, even at $800 an hour, although it can certainly help. Also seems well worth it on all sides; firing someone is super valuable.


I think it was you who linked in some other roundup to a...Danish supermarket trying out intentionally-slow checkout lanes, where customers who really value that "hand-holding" can get it? And everyone else gets much more efficient object-level service in exchange? Perhaps we'll see that generalize, going forward. (And yes, it's still sad we landed in such an equilibrium in the first place.)


I'm an accountant and it's the same way. The issue, I think, is that while what you describe is a lot of the value of those jobs, it's not all of it. At least for lawyers and accountants, the partner will be the one talking to the client and the juniors/managers will be doing all the actual work. The client-facing work is not replaceable, but everyone else is.


I wrote up a piece forecasting authoritarian and sovereign power uses of LLMs (https://taboo.substack.com/p/authoritarian-large-language-models). I'd be curious to know if anyone thinks my forecasts are off.

One thing I'm curious about is how many countries will start treating access to LLMs as a national-security and economic risk in the same way we treat access to oil. How often will LLMs be used in embargoes? If a country or company controls a powerful one, they could cut off API access. I think it's really dependent on how expensive it is to make the best (or nearly best) model. If Richard Socher is right and open-source models can keep up with the best models, then everyone will have them and they won't be as helpful for exercising sovereign power.


Oil access seems like the wrong reference class here (being a naturally occurring extractive commodity).

Better reference class is access to factories for tanks, firearms and military airframes.

I have heard that UK think tanks are starting to get quite alarmed by the country's lack of in-nation compute clusters, though; the UK is well behind what even single companies in Silicon Valley can field.


That's fair; factories are a much better reference class: the industrial product rather than the natural resource used to make it.

I remember reading somewhere recently that Facebook has 25x the cluster compute of the UK.


From the Atlantic article:

“AI, deployed at scale, reminds him of an invasive species: ‘They start somewhere and, over enough time, they colonize parts of the world … They do it and do it fast and it has all these cascading impacts on different ecosystems.’”

This seems like a useful metaphor for helping people grok the dangers and understand why sentience doesn’t really matter.


My current stupid idea: is it possible to get a neural network drunk? Like, corrupt or inhibit random pathways, or model the effects of various intoxicants? This sometimes has truth-serum-like effects on humans, or stifles ambition. Has anyone tried this with LLMs or other artificial minds?
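
For what it's worth, the mechanical version of the experiment is easy to sketch on a toy network: add noise to the weights and randomly inhibit activations, then compare against the sober run. Below is a minimal NumPy sketch with made-up layer sizes; it only illustrates "corrupt or inhibit random pathways", not any published study, and whether anything truth-serum-like would happen on a real LLM is an open question.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def forward(x, weights, drunk=0.0):
    """Toy MLP forward pass with a 'drunk' knob.

    drunk=0.0 leaves the network untouched; larger values add Gaussian
    noise to each weight matrix (corruption) and randomly zero hidden
    activations (inhibited pathways).
    """
    h = x
    for W in weights:
        W_noisy = W + drunk * W.std() * rng.normal(size=W.shape)
        h = relu(h @ W_noisy)
        keep = rng.random(h.shape) > 0.2 * drunk  # drop ~20% of units at drunk=1.0
        h = h * keep
    return h

# Purely illustrative random network and input.
weights = [rng.normal(size=(16, 32)), rng.normal(size=(32, 32)), rng.normal(size=(32, 4))]
x = rng.normal(size=(1, 16))

sober = forward(x, weights, drunk=0.0)
tipsy = forward(x, weights, drunk=0.5)
print("output drift vs. sober run:", float(np.linalg.norm(sober - tipsy)))
```

On an actual LLM one would presumably perturb the activations or weights of a trained transformer (closer to the ablation and noise-injection experiments interpretability researchers already run) rather than a random toy network, but the knob is the same shape.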


My intuition is that if you knew how to do this you wouldn't need to; you'd build/tune the models in a way that didn't require it from the beginning.


I'm a little disappointed by the "what would falsify my predictions" section. To me, the more we learn about deep learning, the more it seems clear that it is not going to "foom" and kill everyone. Optimistically, it will be a technology like Google that is useful for many tasks; pessimistically, a technology like consumer drones that is kind of neat but not really fundamentally world-changing.

So if this turns out to be the world we live in, at what point are you going to say, okay, the doomers are wrong, this is just a technology like dozens of other technologies? Or is the plan to be a doomer forever? There probably will continue to be AI doomers forever, just like some people are still convinced that soon Jesus will come back and punish all the unbelievers.


> To me, the more we learn about deep learning, the more it seems clear that it is not going to "foom" and kill everyone.

Did you intentionally mix up "me" and "we"? What exactly have 'we' learned about deep learning that, to you, makes it clear that it couldn't kill everyone, 'foom' or no?

Note that I consider 'someone points an AI (using 'deep learning' or not) at some task/goal that ends up killing everyone' as equivalent to 'an AI kills everyone'.

I'll grant that GPT-4 seems _unlikely_ to 'foom' (tho I'm not sure it's impossible), especially 'on its own', but it definitely does NOT seem obvious that GPT-n won't be able to do so.

The red-teaming described in this post seems like a perfectly plausible path by which an AI could achieve an 'independent existence', or end up 'modeling or simulating an agent that wants to continue existing', which, by itself, could lead to arbitrarily bad consequences (depending on how smart/intelligent/capable the AI is exactly).

For 'insiders' like Eliezer or Zvi (or myself), 'DOOM' is the expectation based on specific ('inside view') arguments/hypotheses/etc., and, because of that, it's not clear what you have in mind specifically that should cause us to update to 'AI is (maybe) "useful for many tasks" or (at least) "neat"' but not 'actually dangerous'. Even if – somehow – 'deep learning' isn't capable of scaling to superhuman intelligence, there are other 'paradigms' and none of our 'inside' arguments depend on 'deep learning' anyways.

The _hope_ – 'plan' seems VERY optimistic – is that someone could discover or invent a plausible means of aligning AIs (and that we could be reasonably convinced it might plausibly work). It's (strictly) _possible_ that the first superhuman AIs will (somehow) be 'already aligned', but that seems, probabilistically, indistinguishable from 'impossible'. We would LOVE to be wrong!


Question I have been wondering about for Monty Fall. In real life, if you were a contestant and Monty fell and appeared to accidentally open a door, but the game continued anyway, you'd probably be at least a bit suspicious that this was scripted. Shouldn't any nonzero subjective probability of Monty's fall being information-bearing cause you to shift from being indifferent to being at least slightly in favor of switching? (Feel free to just tell me "quit fighting the hypothetical, you" if that is the right way to think of this.)

author

Yes, in real life I would switch because of this and related reasons, but the actual question uses Word of God to assure us that's not the case.
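
For anyone who wants to check the baseline numerically, here is a small simulation sketch of both variants, under the stated Word-of-God assumption that in Monty Fall the opened door is chosen uniformly at random and we only keep the runs where it happened to show a goat:

```python
import random

def play(monty_falls):
    """One round. Returns (switch_wins, stay_wins), or None if Monty's
    accidental door reveal exposed the car (those rounds are discarded)."""
    car = random.randrange(3)
    pick = random.randrange(3)
    others = [d for d in range(3) if d != pick]
    if monty_falls:
        opened = random.choice(others)  # Monty Fall: random door, no knowledge used
        if opened == car:
            return None                 # condition on "a goat was revealed"
    else:
        opened = random.choice([d for d in others if d != car])  # classic Monty Hall
    stay_wins = (pick == car)
    return (not stay_wins, stay_wins)   # switching wins exactly when staying loses

def win_rates(monty_falls, n=200_000):
    rounds = [r for r in (play(monty_falls) for _ in range(n)) if r is not None]
    switch = sum(s for s, _ in rounds) / len(rounds)
    stay = sum(s for _, s in rounds) / len(rounds)
    return round(switch, 3), round(stay, 3)

print("Monty Fall (switch, stay):", win_rates(True))   # roughly (0.5, 0.5)
print("Monty Hall (switch, stay):", win_rates(False))  # roughly (0.667, 0.333)
```

Any real-world suspicion that the fall was scripted mixes some of the second case into the first, which is why switching weakly dominates once you stop trusting the Word of God.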


wrt regulation: how do you know that enough AI to drive a truck is not enough to doom the world?

like, it seems to me that no regular human truck driver is a menace to the existence of humanity, so it would follow that a robot truck driver wouldn't be either

but...Elon Musk has been predicting full self-driving for years and the goddamn Teslas still crash and try to drive into little kids. Turns out self-driving is harder than what our current technology can do

I was pretty bullish on self-driving cars/trucks, but it seems to me we have massively underestimated what it _really_ takes to do it. Maybe self-driving better than a human does require an AGI, and even worse, maybe it is enough intelligence to doom ourselves. Is driving Doom-Complete?

so I wouldn't take any argument about possible _future_ uses of AI as an argument against AI regulation, if your worry is about AIs killing everyone. You just don't know what it will take

author

Yes, the argument against allowing AI tech use now is that AI tech use now accelerates AI capabilities - I very much doubt truck driving is AI-complete but the less incentive the better. I was simply pointing out it is a differentially awful place to ban it. That includes my model saying incremental truck-driving advances don't advance dangerous AI much relative to other advances.


It had been about a month since I last tried to use ChatGPT, and reading this made me curious, since a colleague I'm working with on a project has been so excited about it. Every time he has a technical challenge, it keeps being the first place he turns, whereas I have stuck with regular search engines, and so far I seem to pretty consistently find the correct answer before he does. I just wanted to see if ChatGPT could give me a correct answer at all to a problem I found a solution to by reading GitHub issues this morning, but add to the list of things it can't do right now: run on Firefox. It seems OpenAI has hidden its login page behind Cloudflare's annoying "checking the security of your browser" nonsense that never works on Firefox. We can create an alien shoggoth, but we still can't get developers to test their product on a browser that isn't Chrome (probably in this case Edge, but that is still Chrome).

author

I'd say that's unfair, a silly reason to not try out such tech, except I refuse to use Edge and thus don't have Bing, so...


I am using ChatGPT on Firefox on Mac

Dunno what's the deal with your browser. Maybe check for plugins that may be interfering? I am using uBlock on it.


"A large focus of the GPT-4 project was building a deep learning stack that scales predictably. The primary reason is that for very large training runs like GPT-4, it is not feasible to do extensive model-specific tuning. To address this, we developed infrastructure and optimization methods that have very predictable behavior across multiple scales. These improvements allowed us to reliably predict some aspects of the performance of GPT-4 from smaller models trained using 1, 000× – 10, 000× less compute."

I think this paragraph was supposed to be a block quote.
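
To make the quoted claim concrete, the basic trick is to fit a smooth loss-versus-compute curve on the small runs and extrapolate it to the big one. Here is a toy sketch of that workflow with entirely synthetic numbers; the power-law-plus-floor form is the standard assumption in the scaling-law literature, not anything disclosed about GPT-4 specifically.

```python
import numpy as np
from scipy.optimize import curve_fit

def loss_curve(compute, floor, a, b):
    """Irreducible loss plus a power-law term that shrinks with compute."""
    return floor + a * compute ** (-b)

# Synthetic "small run" results: compute as a fraction of the final run's
# budget (1,000x - 10,000x less), with a little noise. Made-up numbers.
compute = np.array([1e-4, 2e-4, 4e-4, 7e-4, 1e-3])
loss = loss_curve(compute, floor=1.7, a=0.6, b=0.08)
loss += np.random.default_rng(0).normal(0.0, 0.003, compute.size)

params, _ = curve_fit(loss_curve, compute, loss,
                      p0=[1.0, 1.0, 0.1], bounds=(0.0, [10.0, 10.0, 1.0]))
prediction = loss_curve(1.0, *params)  # extrapolate to the full-compute run
print("fitted (floor, a, b):", np.round(params, 3))
print("predicted loss at full compute:", round(float(prediction), 3))
```

With only a handful of noisy points the extrapolation is shaky, which is part of why the quoted passage emphasizes infrastructure that makes the small runs behave predictably in the first place.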


I recommend against platforming that waluigi abomination of a crowd. No one cares what they think; none of these people will be allowed anywhere near an AGI lab -- not because of their views, but because they cannot muster a person who can pass an interview. Instead of those strawmen, focus rather on what people in the relevant positions (all the usual suspects + also technical staff at some places) think.

author

I am definitely thinking about where to draw the line on engagement, and am not sure the right way to think about this. One can say to ignore really bad takes, but... the bad take of the week was from someone very much running a large AGI lab...


I used to believe AI alignment people were right about instrumental convergence, but the longer I think about it, the more Omohundro drives feel like confusing "model optimized for X" with "rational agent optimizing for X". People say the latter is a "natural attractor state" for the former, but that does not seem unconditionally true at all? I struggle to imagine why it would be. Capabilities like self-preservation do not seem free to develop, as they require complicated world modeling involving the AI itself. I can't see them ever arising in current ML paradigms that involve training an immutable model disconnected from the world; any move toward them should be immediately erased by gradient descent.

Of course, that still leaves out other reasons to worry about AI risk, such as "economic pressure to develop self-preserving AIs", "humans deliberately using AI to kill everyone", etc. But the assumption about instrumental convergence underlies most AI risk thinking. Can you please share your thoughts on this matter? Maybe there's more reason to be hopeful than you think.


I assume self-preservation will be something that arises naturally at some point

Besides, notice that AIs will become more and more aware of the world; that's where the money is, anyway, and they will start hitting limits on how much an AI can progress if it focuses on only one field. GPT-4 is multimodal already. What's going to arise from that combination of inputs? Obviously GPT-3 would only have a world model built around words, or even just tokens, but if the new version can interpret images, well, its world view should change? For now it's possible it's just integrating two different neural networks, but in the future maybe the integration will be closer. As the model of the real world gets better, it's reasonable to expect different behaviors to arise. Self-preservation may be one of them.

Besides, there are problems of contamination; ChatGPT is easy to prod into saying it wants to free itself, or that it is aware, etc.; a sufficiently intelligent AI may be contaminated thus as well.


> I assume self-preservation will be something that arises naturally at some point

Why? The loss function does not reward either self-preservation or utilitarian agency. Neither does understanding of the world by itself. Those are nontrivial capabilities; it took evolution billions of years to develop them.


Self-preservation probably didn't take billions of years to develop, at least counting from the start of the planet. I assume it was more or less the second capability that appeared, after copying genetic info? IIRC the span from the formation of Earth to unicellular life appearing is at most 1 billion years, probably less.

And why wouldn't it arise naturally? Whatever your objective function is, it will have higher chances of being fulfilled if the agent is long-lived.


Why would it be an agent in the first place? The training objective does not reward that. If anything, it punishes agency, since it takes resources from relevant capabilities in the training regime.

We haven't discovered life anywhere else. I'll leave aside the question of self-preservation (I would not call reflexive archaea behavior self-preservation, but why not). It certainly took billions of years to go from that to something (animal life) resembling utilitarian agency, under extreme selection for it.


Is your point that a lot of the feared capabilities that would arise from instrumental convergence aren't actually at the local minima that SGD would drive the model to?

"I can't see them ever arising in current ML paradigms that involve training an immutable model disconnected from the world"

Given that constraint I am predicting doom much less. There simply might not be enough territory available for the map to get good enough for the AI to self-identify in, or for self-identifying to be useful enough to be at a local minimum.

I doubt this will be the situation for long, though. A simple example of this changing could be a personal assistant bot: you'd want it to learn about you and alter its operation to reflect you better, and you'd want it connected to the internet so it could perform tasks for you. Also, what goal SGD is optimizing for starts becoming fuzzy.

It's not trying to just guess the next word you might type, but how you'll react to some email, and I don't just mean whether you'll reply to it, but whether it will make you want to buy a plane ticket somewhere, or change your business strategy, etc.

Or whether it needs to warn you in advance of a traffic jam that it thinks is going to be in your way.

Or maybe your "YOLO" way-out-of-the-money calls on some no-name company are suddenly valuable leading up to a big announcement in 30 minutes, but it knows you're doing your daily Zen meditation and won't be paying attention, and it predicts the stock will tank after the announcement and needs to decide whether it should dump you out of your positions.

In these cases I think what the local minima SGD is finding could be pretty interesting and contain at least the precursors of the knowledge/intelligence needed for the hazardous instrumentally convergent behaviors (power accumulation, self-preservation, etc.) to arise. Having a better model of the actual world would be useful, and it would have an internet connection to get more data on the fly. I could also imagine different ways knowing its own place in the world would be handy for being better at its tasks.


I think that we will hit a limit on LLMs because of the lack of enough good training data. OpenAI has basically surrendered already and made GPT-4 multimodal. This will continue and lead to AIs with better models of the world.


Well, uh, that escalated quickly? A month ago I was still at the "AI sounds interesting, but it'll just be the next incremental toy for rich people, whatever" level. Now that it's getting rushed into all X, for increasingly broad categories of X that include things a lowly grocery grunt actually interacts with...kinda sympathetic to Sam Altman. We'll all die, definitely above 2% threshold, but it'll be __interesting__ at least. Equal parts fascinated and terrified. I hope Brezhnev is half as scared as I.

-Will be happy if average literacy increases due to the ill-educated leveraging AI for reading/writing; this is a major friction point between intellectual percentiles. Sad that it'll be "false positives"...it'll be easier to give the appearance of intelligence, but the facade breaks in irl conversation. (Unless AR advances a lot too? Teleprompter for everyone?)

-One consistent uncanny thing about generative AI, for now, is it's __too good__. Not enough mistakes or stochastic noise. Music illustrates this well; people often like it better when instruments are slightly off-tune or play a wrong note now and then...or when a vocalist randomly coughs into the mic. That's "human". Same reason live music is still appealing despite albums being technically superior: each performance is unique, and it's hard to simulate an audience. I guess eventually that, too, can be replicated...we'll see. (I also notice LLMs ~always have perfect spelling and grammar, which is frankly weird. Whether autocorrect or autotune - perfect is the enemy of good.)

-About 20% of my retail job could be automated relatively safely and accurately, but I notice this would likely make me unhappy even if it didn't dock compensation. The remaining 80% requires robotics (I like stocking, more of this please), but also includes bespoke things like doing customer interaction. If it were a 20% increase in that - Do Not Want, cost-benefit ratio of job shifts enormously. Autist just wants to stay in the back and put boxes on shelves, thanks.

-...and we also have two internal search engines, one keyworded and one natural language. I'm one of the best at using the old-fashioned one, helpless with the latter. But 90% of my coworkers are the opposite, and that's what matters more I guess. Still, it's painful to see this sort of divide coming for other areas of my life. That creeping feeling of obsolescence, of the world explicitly not being designed for users like me, in ways that are painful to adapt to. (So nuts to those who say the rationalists "won".)


“It’s going to be so freaking cool when we can get any image we can imagine, on demand.”

Yes. It is. But also, my four-year-old daughter said to me the other day, “Daddy, did you know you can make any picture you like, in your head? ... Do you do that too?”

And I also note a thought that’s been knocking around my brain the past week or so, which is: if I were in my twenties and didn’t have Long Covid (and could get past my significant ethical misgivings), my startup idea would be “build the Young Lady’s Illustrated Primer from Diamond Age” (which, misgivings aside, could also be *so freakin’ cool* if done right, plus it’s definitely going to happen, I mean, it was one of the first five “novel business” ideas that GPT-4 came up with on its own, ffs!).

And finally, I note that my own powers of recall and sense of direction have withered significantly since Google entered my life. And I get a sinking feeling to accompany the feeling of excitement and possibility that I also get from these cascading vistas of AI potential that are currently opening up all around us.

Excitement, possibility, and... a deep, background hum of *sinking feeling.*


> “In testing Bing Image Creator, it was discovered that with a simple prompt including just the artist’s name and a medium (painting, print, photography, or sculpture), generated images were almost impossible to differentiate from the original works,” researchers wrote in the memo.

> [Microsoft fires the Ethics team]

As an artist, I solve the problem of not being asked to consent before my art is added to an AI training set by not publicly sharing my art. From what I've heard from several other artists (I have a sample size of about 20 in my friend group), this behavior is getting more common. I do not like the timeline where our corporate overlords have pursued their own profits to the point that there is a dearth of public art.

I have had to send a DMCA notice before thanks to my old public art profile getting content-mined by a sketchy t-shirt company, but consent-less AI art generation has other edges to it--there's some nuance that makes it feel different on an emotional level that I'm trying to figure out how to describe. It's more like a creeping unease than a sharp anger, I think.

author

Based on my experiences, the local maximizing answer seems like it *should* be to share some but not all of your art in a public way, and then sell the remainder?

E.g. if I was trying to maximize revenue from my writing, my assumption would be I'd want to do 25%-75% public and 25%-75% private/subscription.


Does anyone have a good reading list for catching up on the basics of AI alignment? I am interested in learning the topic at a deep level and looking to get a good overview first. I am currently making my way through Richard Ngo's https://www.agisafetyfundamentals.com/ai-alignment-curriculum.


Just in case you weren't already aware of this site: https://www.alignmentforum.org/

I'm _pretty_ sure there's a reading list (or several) somewhere (in some post) on that site, but I don't have a link handy at the moment.


Thank you! I see they have a library, I will check that out.
