I recently served as a recommender deciding how to distribute a bunch of money to charities. This is my report. Read it if and only if it is relevant to your interests!
To be honest, I'm relieved that you and the other recommenders all found too few good places to put money for the ultra-long-term. Almost all such charities I've seen smell. It's a daunting scale to optimize for, and concrete proposals seem few and far between. There's no way to show that 'raising awareness' and 'investigating possibilities' and whatnot are actually productive in any way, and so I generally assume they're not. On the other hand, most of the organizations you mention by name are new to me, so I'm clearly uninformed about a lot here.
The one that particularly interested me was ALLFED, though. Preparation for life after an apocalypse seems like an underexplored X-risk approach; it allows your research time to be helpful in mitigating many different X-risks, as well as in major non-existential crises (like nuclear war, or an outsized solar flare burning out much of what's connected to electric lines). I'm looking forward to doing some research on them.
This is worrying to read, and confirms the only really big worry I have about the EA space. I'd hope concerns like this were taken at least somewhat seriously! The *colossal* long-term incentive problems with things like EA orgs recommending/funding each other (beyond the cases in which one is just a sort of brand subsidiary of another), "movement-building" and the like are a real concern to me. It doesn't strike me as that much different from the standard "well, our group of powerful unaccountable people making decisions for the good of all would just *not become corrupt*" failure mode of every other revolutionary leftist I talk to.
I'm nervous about how quickly EA is growing and worried it'll just get corrupted and become just one more group that directs money based on in-group signalling. If it does, it'll probably still end up a lot better than the current elite charity space that works exactly that way, because the signals required will probably align more closely with actually doing good, but I'm worried even that couldn't last long.
It's also a fear of mine around the "patient philanthropy" idea of saving up tonnes of money, growing it through investments, then spending it all on doing more good later. I can't see a way that isn't massively vulnerable to getting corrupted over time - it gives people so long to worm their way in and corrupt things. Either that, or it'll become too dangerous a power base and someone will just expropriate it, possibly out of (some mix of real and pretended) fears that it is already fatally corrupted or is at too much risk of becoming so.
I understand turning down money and interest and enthusiasm is very hard, but I think the small size of EA is one of its strengths, and if it grows too fast it risks both corruption from the inside and attack from the outside before it's really ready to deal with those threats.
Yep, doing things at scale and moving lots of money without becoming corrupted or having your motivations change over time is really hard, and predictions are difficult especially about the future, and all that. Hopefully it doesn't fall on deaf ears where it counts most.
In the spirit of this sort of thing: Is there any meaningful grant-giving for red teaming, currently?
Concretely: If the authors of "Universal and Transferable Adversarial Attacks on Aligned Language Models" were inclined to pioneer further attacks on LLMs, as a way of demonstrating that they are fundamentally unsafe in as many situations as possible, to whom would they apply for grants, and would those grants be likely to be approved? Let us assume, for the sake of argument, that absent such a grant the authors would do something completely different or non-technical.
One expects that (of course) someone could make more money in private industry than they could from grants. But having an incentive to go into private industry instead of producing useful research is different from having no financial incentive to produce useful research.
My expectation is that the usual suspects, including SFF and OpenPhil, would be excited by red teaming work, and happy to fund it if the terms were reasonable. As you note, that does not mean industry-competitive salaries, but a good proposal would likely go quite smoothly.
It is bizarre to me, if they consider that a good goal, that they do not yet have bounties up for it.
Thanks for the insight.
People don't do things!
There are a ton of things that fall into this category of 'no one did it, and also no one bothered putting up a bounty, but people would be excited to see someone do it.'