Discussion about this post

Peter A. Jensen

"Oh, Also, If Anyone Builds It, Everyone Dies. I must also mention the elephant in the room of all this, which is existential risk."

BAN Superintelligence Until Provably Safe.

hwold

> Another possibility is that open source software projects that are worth compromising may have to close off purely for security reasons. Exposing your source might make you too vulnerable, especially if you accept public submissions at all.

Except that AI is now pretty good at decompilation, or so I’m told.

> You can in some cases ‘prove’ the correctness of software in theory, but the physical world is weird, and I don’t think this buys you security in practice, and the most important software is in practice too complex for full proofs.

All the AI-discovered vulnerabilities I’ve seen so far are of the "local mistake" kind, not the "complex and weird interaction between distant and different submodules of the software" kind. They are the usual suspects: buffer overflows, use-after-free and similar joys, which can be comprehensively eliminated given enough dakka (that is, intelligence and a lot of tedious but human-level review). I’m pretty sure we live in the first world for that class of vulnerabilities.

What remains once AI has closed off this class of vulnerabilities? Log4Shell-like and Rowhammer-like vulnerabilities. Historically they have been a minority, but the interesting question is: "are they a minority because humans are, as a rule, too dumb to find them, or because they are genuinely rare?"

If it’s the latter, I think we’re good. If it’s the former, that is big trouble, but still manageable if there is now a formalized step of "audit important open source software, do responsible disclosure" before releasing a new model.

Note that this applies to *software* vulnerabilities. Whole systems (think: the AWS infrastructure of your average startup, or even of a big company), the kind of thing a competent pentester plays with, are another beast, and I think that is where most of the trouble lies. Anthropic can work on software vulnerabilities in open source projects unilaterally, behind closed doors, and do responsible disclosure (and for proprietary software, well, I’m pretty sure you can sell this service to the vendor). For infrastructure, if you want to audit, you need to bring the AI to the system, not the other way around. That means releasing the AI, which is big trouble if the AI can exploit the system, because now it’s a race between the audit and the attack.
