14 Comments

The need for a humanity-defense organization to develop defensive technologies is ever increasing, and yet there isn't the funding for it.

I love this quote from the repository’s readme:

“Note: Caution! This codebase will execute LLM-written code. There are various risks and challenges associated with this autonomy. This includes e.g. the use of potentially dangerous packages, web access, and potential spawning of processes. Use at your own discretion. Please make sure to containerize and restrict web access appropriately.”
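
For illustration, here is a minimal Python sketch (using the docker SDK) of what "containerize and restrict web access appropriately" could look like; the image name, paths, and limits are placeholders I made up, not anything from the repo:

```python
# Rough sketch only: image name, paths, and limits are made-up placeholders,
# not the repo's actual setup.
import docker

client = docker.from_env()

output = client.containers.run(
    "python:3.11-slim",                                 # throwaway base image
    ["python", "/workspace/generated_experiment.py"],   # LLM-written script
    volumes={"/tmp/ai_scientist_run": {"bind": "/workspace", "mode": "ro"}},
    network_disabled=True,   # "restrict web access appropriately"
    mem_limit="1g",          # cap memory
    pids_limit=64,           # limit process spawning
    remove=True,             # discard the container when it exits
)
print(output.decode())
```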

Does Sakana mean anything else in some other language? At least Llama fits in with animal-themed naming.

It's "fish" in Japanese; see also https://jisho.org/search/%E9%AD%9A

I'm not sure whether the Kanji in the original thread is a mistake or whether the translation is...

No, it's just that "sakana" sounds the same as the Hebrew word for "danger". It's an insane comparison. This whole post is a joke anyway.

It doesn't mean anything in Portuguese, but it sounds the same as 'bastard' and is spelled in a similar way (with a k instead of a c). Make of that what you will.

I hope if/when something happens, there's enough time to drag people before Congress or whatever. It'll be cathartic to see the narrative change from "oh, that's in loads of fiction, it's not real" to "that was in loads of fiction, how could you not see it coming?"

Sandboxing a program to restrict what it can do is pretty much a solved problem.

So all these potentially self-modifying AIs should be sandboxed. Obviously.

That would just leave us with more exotic possibilities, like the AI inventing a zero-day exploit against your sandboxing system and using it to escape.

[Why, yes, we do have formal machine-verified (HOL4) security proofs for the CHERI capability system...]
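
To make that concrete: below is a minimal, Linux-only Python sketch of the weakest layer of this, just capping an untrusted child process's CPU and memory (the script name and limits are invented). A real sandbox needs far more than this: namespaces, seccomp filters, or capability hardware like CHERI.

```python
# Linux-only sketch: a resource cap on a child process. This is only one thin
# layer of a sandbox; it does not confine filesystem or network access.
import resource
import subprocess

def run_untrusted(cmd, cpu_seconds=30, mem_bytes=512 * 1024 * 1024):
    def apply_limits():
        # Cap CPU time and address space before the child exec()s.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))
        resource.setrlimit(resource.RLIMIT_AS, (mem_bytes, mem_bytes))

    return subprocess.run(
        cmd,
        preexec_fn=apply_limits,
        capture_output=True,
        text=True,
        timeout=cpu_seconds * 2,  # wall-clock backstop
    )

# Hypothetical usage; the script name is made up.
result = run_untrusted(["python3", "ai_written_experiment.py"])
print(result.returncode, result.stdout[:200])
```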

Of course, you don't have to use CHERI, there are other ways to do sandboxing.

But if someone who is building an enormous number of AI data centres were to tell certain chip manufacturers that they want to buy, like, a gazillion CPUs with CHERI extensions, I personally would be celebrating...

It does not sound particularly exotic to me, since those things happen regularly in real life. Even systems whose sole purpose is sandboxing (Intel SGX, the Chrome sandbox) get compromised regularly.

We have our warnings. Now, let's convince our politicians to act on them.

It's so nice that you still care about all this. There's a ravening alien god coming to destroy everything I care about, and I'm not even interested enough to keep an eye on it any more. Pathetic.

Is it time to say "I told you so, you fucking fools" yet? I can't imagine there'll be enough time to say it by the time it becomes obvious enough that the fools see it.

Well, there is still time to try to build d/acc. Perhaps we might survive.

Fwiw: the LessWrong community had some good replies to this. It might be less "instrumental convergence" and more "the AI makes a mistake", but notably it doesn't really matter whether the AI good-naturedly grabs resources by mistake or deceptively tries to get them, because either way the AI is power-seeking.

https://www.lesswrong.com/posts/ppafWk6YCeXYr4XpH/danger-ai-scientist-danger
