The need for a humanity defense organization to develop defensive technologies is ever increasing, and yet, there isn't the funding for it.
I love this quote from the repository’s readme:
“Note: Caution! This codebase will execute LLM-written code. There are various risks and challenges associated with this autonomy. This includes e.g. the use of potentially dangerous packages, web access, and potential spawning of processes. Use at your own discretion. Please make sure to containerize and restrict web access appropriately.”
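A minimal sketch of what that containment might look like, assuming Docker is available; the script name experiment.py is purely illustrative, not anything from the repo:

    # Run LLM-generated code in a throwaway container with no network
    # access and hard resource caps. experiment.py is a placeholder name.
    import os
    import subprocess

    result = subprocess.run(
        [
            "docker", "run",
            "--rm",                # delete the container when it exits
            "--network", "none",   # no web access
            "--memory", "1g",      # cap memory
            "--cpus", "1",         # cap CPU
            "-v", f"{os.getcwd()}/experiment.py:/work/experiment.py:ro",
            "python:3.11-slim",
            "python", "/work/experiment.py",
        ],
        capture_output=True,
        text=True,
        timeout=300,               # kill runaway runs
    )
    print(result.stdout)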
Does Sakana mean anything else in some other language? At least Llama fits in with animal-themed naming.
It's fish in Japanese, see also https://jisho.org/search/%E9%AD%9A
I'm not sure whether the Kanji in the original thread is a mistake or whether the translation is...
No, it's just that "sakana" sounds the same as the Hebrew word for "danger". It's an insane comparison. This whole post is a joke anyway.
It doesn't mean anything in Portuguese, but it sounds the same as 'bastard' and is spelled in a similar way (with a k instead of a c). Make of that what you will.
I hope if/when something happens, there's enough time to drag people before Congress or whatever. It'll be cathartic to see the narrative change from "oh that's in loads of fiction, it's not real" to "that was in loads of fiction, how could you not see it coming?"
Sandboxing a program to restrict what it can do is pretty much a solved problem.
So all these potentially self-modifying AIs should be sandboxed. Obviously.
Which would just leave us with more exotic possibilities, like the AI inventing a zero-day exploit against your sandboxing system and using it to escape.
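For the mundane part, a rough sketch of OS-level restrictions (POSIX-only; untrusted.py is a hypothetical name, and this is a partial measure rather than a full sandbox):

    # Apply resource limits in the child process before it executes the
    # untrusted script: bounds CPU time, memory, open files, and process
    # spawning, but does not block network or filesystem access on its own.
    import resource
    import subprocess

    def limit_resources():
        resource.setrlimit(resource.RLIMIT_CPU, (60, 60))                   # 60 s of CPU time
        resource.setrlimit(resource.RLIMIT_AS, (512 * 2**20, 512 * 2**20))  # 512 MiB address space
        resource.setrlimit(resource.RLIMIT_NOFILE, (64, 64))                # few open files
        resource.setrlimit(resource.RLIMIT_NPROC, (16, 16))                 # limit process spawning

    subprocess.run(
        ["python3", "untrusted.py"],
        preexec_fn=limit_resources,  # runs in the child before exec
        timeout=120,
    )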
[Why, yes, we do have formal machine-verified (HOL4) security proofs for the CHERI capability system...]
Of course, you don't have to use CHERI, there are other ways to do sandboxing.
But, if someone who is building an enormous number of AI data centres were to tell certain chip manufacturers that they want to buy like a gazillion CPUs with CHERI extensions, I personally will be celebrating...
It does not sound particularly exotic to me, since those things happen regularly in real life. Even systems whose sole purpose is sandboxing (Intel SGX, the Chrome sandbox) get compromised regularly.
We have our warnings. Now, let's convince our politicians to act on them.
It's so nice that you still care about all this. There's a ravening alien god coming to destroy everything I care about, and I'm not even interested enough to keep an eye on it any more. Pathetic.
Is it time to say "I told you so, you fucking fools" yet? I can't imagine there'll be enough time to say it by the time it becomes obvious enough that the fools see it.
Well, there is still time to try to build d/acc. Perhaps we might survive.
Fwiw: the LessWrong community had some good replies to this. So it might be less "instrumental convergence" and more "AI makes a mistake", but notably it doesn't really matter whether the AI good-naturedly and mistakenly tries to grab resources or deceptively tries to get them, because either way the AI is power-seeking.
https://www.lesswrong.com/posts/ppafWk6YCeXYr4XpH/danger-ai-scientist-danger