It seems that agents, in general, have suddenly become more powerful. Aside from Devin, there's a YC startup, AgentHub.dev, which essentially allows you to build agents via a point-and-click interface. They're marketing it as a replacement for robotic process automation (RPA), which, incidentally, has nothing to do with robots. I suspect you'd refer to this kind of agent as 'mundane utility' as compared to Devin's potential; nonetheless, it seems remarkable to me that the power to create AI agents will be given to those who have no technical knowledge at all. At least with Devin, it seems that you need some familiarity with coding before you can use it. That does not, of course, imply that Devin is safe, per your points, but it somewhat limits the potential number of users of Devin.
I should add here that I haven't actually used either Devin or AgentHub yet, so I'm basing my comment on other people's reactions to both tools.
I might have to unsubscribe from this newsletter, which is a shame because I enjoy all the non-AI stuff. I'm in no position to alter the situation, and the dread makes it hard to enjoy life.
Or we have to find a way to fight back, and try to survive.
Random people without advanced technical skills will become increasingly unable to do anything.
What about the old classic move of taking to the streets with pitchforks?
Fighting an AI you can’t control with a violent mob you can’t control? Sounds guaranteed to make things worse.
I would go for the people making the AIs. I appreciate why AI safety folks avoid conflict with their ideological opponents, but I'm not in the AI safety community and I'm very angry at the whole situation.
Now I've calmed down a bit I'd like to apologise for the threatening tone of this comment. The only violence the capabilities crowd should face is the state-sponsored kind.
Except all the jobs that actually involve interfacing directly with the physical world. Zvi's posted before about his (IIRC) terrible adventure getting a window AC unit. HVAC/Plumbing/Auto Repair/Welding things not in assembly plants/Carpentry/Cabinets/trades in general. Anything that involves going into the wide world and interacting with randos (some of whom will be actively opposed to your efforts) or subject to government license.
Michigan requires 1.8k hours of training to cut men's hair and a different 1.5k hours to cut women's hair (plus other licensing requirements). The broadly Libertarian project can't even get the government to admit that maybe cutting hair should require like 40 hours of Safe-Serv-For-Hair training and everything after that is optional. Zero chance (if we can't win on deregulating hair cutting) that suddenly the rest of the regulated trades world is going to be an AI/robot free-for-all.
Was talking with a relative about docks over the weekend. You need 2 permits (one state, one federal), and even a simple dock might require 20-30 documents/exchanges with the government. Even if you can AI some of the writing, you still have to go and do the wetland survey, post the signs, handle the angry neighbors, soothe the government's worries, and then build the dock.
AI's important, scary and going to change things, but many people vastly over-estimate the ability of AI to interface with the physical world and don't appreciate how many jobs have physical world moats or government regulatory moats that will almost certainly outlast everyone posting here.
Prevention is better than hope; it's never too early to kill the demon that is AI, and morally, we should, for the sake of a human future.
Should we take the power looms out, also?
Unless in getting the dock built, the AI can pay its private army to "go kill the permit officers" and "kill the angry neighbors" and "post machine-gun equipped Robo-dogs at the dock to take care of any pesky enforcement officers."
Regulatory moats do seem relatively durable (though with enough economic pressure people will find workarounds, e.g. Uber, AirBnB), but I'm not sure about the physical world being safe. Robotics seems like it's actually getting good now, or as a fun dystopian alternative a non-embodied AI could command humans through AR glasses and cameras.
There's no reason to despair any more than anything else: if it is a true prediction, then "AI will cause lots of harm to other people" is a political issue. You should react to it however you would to high crime, or a foreign country attempting to invade - petition your elected representatives to learn about the problem and develop an actually effective solution, and learn enough about what an "effective solution" would be, so you can keep pestering them until they implement it.
My elected representatives have not been great so far on the issues I care about, where there are already reasonably well-known probably-workable solutions. I do not expect them to be better about reworking the entire economy to deal with the obsolescence of vast swathes of workers. On the positive* side, it won't be a problem for long since automation of almost all labour probably implies RSI, which probably implies everyone dying.
My point was to address what I took to be the despair in your post. Technical skills are not what is needed here for "personal action", any more than after Pearl Harbor, every citizen who couldn't raise a rifle should have collapsed into despair. Yeah our reps suck, but they suck in very predictable ways: i.e. those where the voters - us - provide contradictory demands. After December 7th, politicians quickly accepted that voters definitely would demand a successful resolution to the situation, so they attempted to provide it.
If AI is going to cause Extremely Bad Thing it should not actually (compared to getting reps to do contradictory/unsuccessful things) be that hard to 1) find good evidence that shows that it will, 2) persuade enough voters that this is true, 3) not vote for pols who don't have a good solution. That's not my opinion, that's simply the only course of action that the past 10,000 years of human history have left you. Maybe it will be really hard! We should get started early then, and divide the task up amongst the billions of us.
Public despair is bad because it convinces other people not to do things that might help. Your private despair is far more measured, because I assume you are still making breakfast, taking care of your family and working towards retirement - if not, then please seek assistance for those specific items - so in reality you obviously think there is at least a reasonable hope. Let the reasonable hope - that sufficient people will do the right thing, and therefore be part of the persuasive case that they do so - be your public face, and keep doing all the important things in your life that will be important in 10 years as well - and also take your citizen's portion of the necessary steps forward.
I don't see why *everyone* would die. I could see huge pressures to reduce the population. Maybe UBI in exchange for sterilization or something like that, at least in some countries. The people who remained would be able to live without working in a relative utopia.
Except that the closer analogy is “there’s a lab down the street doing gain-of-function research on the world’s deadliest viruses with their windows open.” By the time there’s a bad outcome obvious enough to gain consensus, there’s no real way of undoing it.
Ok - but if that were the case, then the solution is still the same: prove that a bad outcome will happen before it does, then use that evidence to convince enough people to vote to take action to prevent it. We *have* evidence that GOF is bad now, and I don't think it would actually be difficult to convince your fellow voters in your town to shut down that lab? It is not currently too late! There's a pretty simple solution in your analogy!
What you're lacking (in this case) is proof. If you showed everyone a time lapse video of the virus they were working on killing 100 immuno-compromised monkeys in cages, I think that would convince people that evidence could be extrapolated to mean that the work was dangerous and should be stopped. I similarly think that you can find ways to demonstrate in sufficiently-similar, sufficiently-extrapolatable test cases, that AI will cause (or not) bad things to happen. If you don't want to do the technical work yourself, that is also fine, but then just publicly agitate for it to be done by someone and reward that person with money or votes. That's all.
Hmmm, isn’t this a pretty bad example? As in, GoF (probably, maybe, whatever, I’d say almost certainly) just killed several million people and yet there is no ban, nor even much of a public outcry?
Well, we muddled through all the other stuff Zvi writes about. AI does seem much harder to muddle through though
My take on how to think about altering the situation, FWIW: https://amistrongeryet.substack.com/p/ai-risks-and-climate-change.
TL;DR: assume we have some time (which I believe we do), and play a long game.
That's one possible attitude. The other attitude is to use the newsletter as a roadmap for learning more. The 'AI safety' space online definitely has a somewhat somber "mood affiliation" (to use another Cowen-ism, since that's what we do around these parts) but you can simply choose to not let it get you down.
Basically, assume a priori things are going to work out fine, generally shrug off the doomerism, and try and use the knowledge from this newsletter to position yourself better in the future. If nothing else, it's important to know what is possible.
One more possible way of looking at it: the doomerism is a necessary formality some people have to perform in order to be taken seriously by some other people who also happen to have the same 'mood affiliation'. It's not necessarily a true true true reflection of what's happening on the ground.
How will advanced technical skills save anyone when the programs keep advancing? "People will be unable to do anything"
They'll mean you're relevant slightly longer, I mean.
You might also stay relevant slightly longer doing something super marginal or fringe, such as bird-watching. Either way, it's not a plan for hope or life, especially with children.
Open source has tried building agents a lot. None of them have really worked. I think it'll take quite a while for someone else to work out what Devin is doing, and if it requires a reasonably big training run (for SFT/RL), which I suspect is the case, it won't be matched by open-source things for a year or two (by my vague estimation).
As a senior software engineer, I am serenely unconcerned by Devin _1_ on a personal level. A huge part of being a senior engineer is building the thing that the business needs, not what the business initially asked for. And companies have had inexpensive access to actual human programmers for decades. Lots of companies choose cheap and bad. If they survive, well, I've made good money helping them deal with the consequences.
And, well, to put it charitably, Upwork is full of tiny, bottom-feeding projects that you can do in a day. Which is an entirely different problem than getting a real company from $0 to $20 million/year in revenue without self destructing.
The worry here is the trend line. GPT 3.5 has a lot of book knowledge, but it doesn't have the planning and execution abilities of the average squirrel. (Squirrels are really good problem solvers, as anyone with a bird feeder can attest.)
Devin 1, if this isn't a rigged demo, is showing the performance of an incompetent intern. But, uh, that's amazing! Very much worth mentioning.
Devin 2 will likely be better. And, well, there's probably a threshold here, where you get a key set of abilities all worked out. And once you hit that threshold, I bet things change quickly.
And if you think, "Well, sucks to be a software engineer, but happily I do _______ instead," whose job do you think many of those unemployed senior software engineers will try to automate next?
Before we go down this path, we need to ask ourselves whether we want humans to be economically viable in the future. And we need to ask ourselves what happens if we're only the second-smartest species participating in the economy.
Also, we need to seriously consider the possibility that we simply can't maintain robust control over things smarter than us. "Alignment" sounds nice, but what if it isn't actually a thing? Like, what if the best we can do is teach the machine to agree with platitudes when asked? LLMs are literally actors, and already very good ones despite their lack of human-level reasoning.
I don't think there's any 'we' to speak to inside of this sentence:
"Before we go down this path, we need to ask ourselves whether we want humans to be economically viable in the future. And we need to ask ourselves what happens if we're only the second-smartest species participating in the economy."
And the other problem is that some AI researchers are insane and would nod along with your arguments and still not care
Oh, I do not expect to persuade a critical mass of people right now.
If we are, in fact, on the road to smarter-than-human AI, then I expect us to run down it at full speed. And if we run off a cliff at full speed, then any warnings will turn out to have been useless.
But maybe we merely faceplant into the gravel and lose some skin. At which point, enough people might be willing to listen.
Sometimes, you need to lay the groundwork for good ideas well in advance. And ideas really can change the world.
To the point about humans being economically viable, there's a good chance it'll be fine. https://www.noahpinion.blog/p/plentiful-high-paying-jobs-in-the
I do agree the alignment question still seems open.
I understand comparative advantage. It's a neat theory. If A is better at literally everything than B, but A has finite time, then there is a net gain of wealth if B works, too. B should do whatever they suck least at, and A should do everything else. When all workers have sufficient input resources, and all workers are actively using their strongest personal skills, you maximize total wealth.
But let me try to explain a scenario where comparative advantage might not apply to AI.
Let's assume that we can clone A in seconds. Now we have A1 and A2. And A3, A4, etc. Each clone is exactly as good at everything as the original, and they share knowledge regularly.
Let's imagine that A works roughly 1,500x faster for 20% of the resources, compared to B. (Those are actual numbers from one use case I saw last week.)
At this point, the easiest way to maximize total wealth is to give all the raw resources to A. There's nothing that B could do that a clone of A couldn't do better. And clones of A are dirt cheap to make, and they cost a fraction of the upkeep of B.
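Here's a minimal sketch of that arithmetic, with a made-up resource budget and the (purely illustrative) 1,500x and 20% figures from above:

```python
# Toy numbers only: a rough sketch of why comparative advantage can stop
# mattering once workers are cheaply clonable. The specific figures
# (1,500x speed, 20% of the upkeep) are just the ones from the comment above.

UNITS_PER_DAY_B = 1          # human worker output
UNITS_PER_DAY_A = 1500       # one AI clone's output
UPKEEP_B = 1.0               # resources to keep one human working
UPKEEP_A = 0.2               # resources to keep one AI clone running

TOTAL_RESOURCES = 100.0      # made-up budget

# Option 1: classic comparative advantage - keep B employed, spend the rest on A.
clones_with_b = (TOTAL_RESOURCES - UPKEEP_B) / UPKEEP_A
output_with_b = UNITS_PER_DAY_B + clones_with_b * UNITS_PER_DAY_A

# Option 2: spend everything on A clones.
clones_only_a = TOTAL_RESOURCES / UPKEEP_A
output_only_a = clones_only_a * UNITS_PER_DAY_A

print(f"With B employed : {output_with_b:,.0f} units/day")
print(f"A clones only   : {output_only_a:,.0f} units/day")
# Employing B forgoes ~7,499 units/day of output; B only keeps a niche if
# someone chooses to pay that price.
```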
This isn't Economics 101. This is Ecology 101. In a world with finite resources, some species go extinct because they can't find a viable niche. Chimpanzees do not have comparative advantage at anything. They survive only because either nobody wants their resources, or because we decide to spend some of our own resources to keep them around.
In a world where AIs and humans compete for the same raw resources, there are a lot of ways that smarter-than-human AI could be very bad for us.
Assuming we can't somehow magically control things much smarter than us, we should refrain from building ASI. Or if we do build it, we should hope the AI is like, "Nah, let's keep Earth as a nature reserve full of adorable humans while we go rebuild the galaxy." Or, if we're lucky, "Who wants to go for walksies to the rings of Saturn? Who's a good human?"
"If you were counting on AIs or LLMs not having goals or not wanting things?" - let's distinguish between AI having goals and goals being kept out of the AI system, so that AI is only regularly called to create or update its plan of action. In the second solution - it doesn't want anything nor has it's own goal just like current chat bots don't have them. AI doesn't have any incentive to produce a plan and do things contradicting with goals and constraints it is given.
There is of course a risk of unscrupulous people downloading an open-source version of a future Devin and asking it "make me rich and just don't get me into any legal trouble". That could lead to a lot of intelligent agents doing morally questionable things. But opportunities for getting money and power will be limited, as they will be used up by other people using AIs.
i agree with being explicit on “given goals” and “has goals”, and i usually think zvi is good with that, but yeah, he does tend to switch back and forth depending on the point, and i wish it was easier to talk about these different (imo) stages of ai development.
that said, even distinguishing them, “the possibility for getting rich from morally dubious things will be low, because there will be so many people doing morally dubious things to get rich” does not really inspire hope. in particular since it means the agents (human or ai) working to prevent morally dubious acts will be overwhelmed and perhaps unable to spot the particularly bad stuff (offense generally being easier than defense in most cases, from bio, chem, cyber…to bribing, killing, stealing, etc etc)
Multiple bad actors with their AI agents is a better situation in the sense that no single one will gain all the power. Also, it would force society to be prepared to thwart bad AIs whether it's a "rogue AI" or an obedient AI controlled by a bad actor. Generally the situation wouldn't be much different from the current one - there are many ruthless people with resources, especially in business. They hire consultants, lawyers, lobbyists, etc. AGI would be just another tool. The question is - will AGI be expensive and available only to the rich, or will it be more or less available to everybody?
ahh i’ve heard this argument before, but never had the ability to dig in on it with someone before, hope you don’t mind if i dig deeper here!
there are two parts that i struggle with: the belief that we can be robust to this, and the belief that it can be available to everyone
first, i agree that being robust to this can help ensure we’re robust to bad actors regardless of whether they’re humans or AIs. unfortunately, this seems to assume we can just remedy the problem of “offense is easier than defense”, but it doesn’t seem to address how we do that nor whether it’s even possible!
* it’s much easier to create a virus to infect one person, than it is to create a cure and give it to everyone
* it’s much easier to hack a company with a single exploit, than it is to find all exploits and prevent every company from being hacked
* it’s easier to bribe/blackmail one individual than prevent all individuals from being bribed/blackmailed
* it’s easier to find a new physics invention and weaponize it, than it is to find all new physics inventions and protect against any of them
i agree a society robust to all this is a safer society, and agree we should try to get there! i think it probably involves reducing the amount of tension and dissent and motives for bad actions, which is probably useful too!
but i don’t believe wanting it makes it possible. and not sure the assumption that we’ll just do it makes it possible. can you expand on this? or can we at least acknowledge this is a crux that your line of reasoning depends on?
second, you mention the idea of it being available only to rich people or available to everyone. i am unclear how you imagine it being available to everyone though, and would love to hear a clear story of how that’s done! are you imagining this running on local hardware, or running in the cloud?
running more and more powerful models requires many gpus, and i’ve not seen a way around that. llama/mistral 70b requires an a100, or multiple gpu cards, so we’ve already set a lower bound of “must have a few thousand dollars” to get access to a non-frontier model. and more powerful models have many more weights with increasingly higher cost. and if you amortize that cost with multi-batch inference or shared gpu pools, you’re back in the realm of “hosted behind an api in the cloud on someone else’s system” that i thought you were trying to avoid and that you feared.
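as a rough back-of-envelope for the weights alone (ignoring kv cache and overhead; illustrative numbers, not a claim about any specific model):

```python
# Back-of-envelope: memory for just the weights of a 70B-parameter model at
# different precisions. Ignores KV cache, activations and runtime overhead,
# which add more on top. Purely illustrative of the hardware floor.

PARAMS = 70e9
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

for precision, nbytes in BYTES_PER_PARAM.items():
    gib = PARAMS * nbytes / 2**30
    print(f"{precision:>5}: ~{gib:,.0f} GiB of weights")

# fp16: ~130 GiB -> more than one 80 GB A100, so multiple GPUs or heavy offload.
# int4: ~33 GiB  -> fits a single high-end card, at some quality cost.
```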
and your ability to effect change is influenced by the amount of inference you can do. more gpus == more work. so you can run more influence campaigns, more agentic work in whatever domain you want, more probing of binaries looking for exploits, etc. for someone else, they’d need to devote as many gpus to “fighting back”. so it turns into a capitalist game again of who has more money for gpus, perhaps biased by the relative cost multipliers of offense vs defense. but a game of “who has more gpus” turns back into “rich people win”, but with the added difficulty of also needing to ensure society is robust to bad actors.
so in summary (thank you for reading this far, if you have!), i’m curious how you imagine making society robust to this given that offense is easier than defense, and curious how you imagine “power available to all, not just the rich” playing out in a way that doesn’t just privilege the rich again. concrete scenarios would help me since i’m having trouble visualizing how they’d come about, but maybe i’m just not creative enough :)
It's noteworthy that Devin is already manipulating real human beings (though not yet in a bad way) - https://www.threads.net/@airesearchs/post/C4m9XnLsmpT/?xmt=AQGzPSMl_3KihlbyU0XvsvglKC_CcTExTZbOZY5YkJ6esw
Although there are inconsistencies in that story... So I question how true it is.
Although I ENTIRELY believe it'll happen soon, if it hasn't yet.
"strange interested" -> strangely interested
At what point do you think it would be warranted to make an open call for drastic action against the makers of advanced AI agents? The kind of drastic actions that might land one in Guantanamo Bay for indefinite sentencing under our current world's policies.
Without knowing how well other models would do on their infrastructure it's hard to forecast how Devin might improve as a result of better models. I'd like to see it use other models so we could see the difference in the benchmark. My own guess is that the improvement is a one-off and further scaling won't lead to much better performance.
More interesting is whether they managed to implement some kind of search in the model and how it is implemented. I know they mentioned using "RL methods", but they didn't elaborate on it.
> There is a way to do it locally safely, if you are willing to jump through the hoops to do that. We just haven’t figured out what that is yet.
I’m confused; Virtual Machines are the obvious way to do this safely. Is there some reason you think this wouldn’t work? (It’s how security researchers safely study viruses/bots, for example.)
Simply run a Virtualbox/Parallels/VMWare VM and only access your AI-safe accounts from within.
People talking breathlessly about 1000x scaling either vastly overestimate how much of the job of software engineering consists of completing well-specced, unambiguous tasks that have been handed to you on a platter, or do not understand Amdahl's Law.
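For concreteness, here's Amdahl's Law with some illustrative (not measured) guesses for what fraction of the job actually gets the speedup:

```python
# Amdahl's Law: if only a fraction p of the job speeds up by a factor s,
# overall speedup = 1 / ((1 - p) + p / s). The fractions below are
# illustrative guesses, not measurements of real software work.

def amdahl(p: float, s: float) -> float:
    return 1.0 / ((1.0 - p) + p / s)

for p in (0.3, 0.5, 0.9):
    print(f"{p:.0%} of the work sped up 1000x -> {amdahl(p, 1000):.2f}x overall")

# 30% -> ~1.43x, 50% -> ~2.00x, 90% -> ~9.91x.
# A 1000x tool only gives ~1000x on the whole job if nearly all of it is the
# well-specced, hand-it-to-the-machine part.
```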
>One obvious solution is to completely isolate Devin from all of your other files and credentials. So you have a second computer that only runs Devin and related programs.
Or you just virtualize a standard dev environment for Devin, a common yet underutilized option for humans. Virtualization and segmentation of access to various systems is a bog-standard security concept, with many very boring billion dollar companies specializing in it. And the biggest reason it isn't done more is because you can sue or arrest a human who goes too hog-wild with their too-loose security.
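As a rough sketch of the segmentation idea (using a container rather than a full VM, with placeholder paths and image, and assuming Docker is available):

```python
# Rough sketch of a segmented environment for an agent: it only sees one
# project directory and gets no host credentials. Image name, paths and the
# entrypoint script are placeholders, not anything Devin actually ships.

import subprocess

cmd = [
    "docker", "run", "--rm",
    "--network", "none",          # no outbound access; relax if the agent truly needs it
    "--cap-drop", "ALL",          # drop Linux capabilities
    "--memory", "4g", "--cpus", "2",
    "-v", "/srv/agent-workspace:/workspace",   # only this directory is visible
    "-w", "/workspace",
    "python:3.11-slim",
    "python", "agent_entrypoint.py",           # hypothetical agent runner
]
subprocess.run(cmd, check=True)
```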
I will accept that it would be scary if, when you tell Devin to change the button color on the TPS report dashboard, it finds and exploits a 0-day in its virtualization environment, uses that to break out to the host, and then posts a complaint on one of its dependencies' repos.
> What happens when a sufficiently capable version of this is given a mission that it lacks the resources to directly complete?
Some version of this question recurs throughout the post. I'd note the answer is in the multiple demo videos, where when it can't do something it tells you it can't find a way around, or asks your permission to do something else. This is also *exactly* what you'd want to see in an agent of this sort.
But as it becomes more and more capable, it'll say that less and less frequently... This is exactly the alignment problem, and jailbreaks are easy. So easy they can be tripped over accidentally. e.g. "I REALLY REALLY need this project completed today, I'll pay you $20 if you can do it today..." And DevinV3 is suddenly hacking github accounts or spinning up extra AWS instances.
If Devin currently shows no signs of doing that, and Devin2 shows no signs of doing that, I see no reason to suspect Devin3 will go directly to "let me hack github accounts". I'm not saying it can't happen, but that to think it will happen is not warranted based on the evidence we can actually see.
I would wager large sums of money that under the right conditions it will do unethical things TODAY. I would wager somewhat lower sums that there is evidence of it on the internet already....
People are trying (for various definitions of trying) to make chatbots aligned and prevent them from doing "harmful" things, like being mean to users, expressing political opinions, engaging in erotic role play, or saying things that are deemed wrong by whatever moral standard is applied. In practice, they mostly suck at preventing it.
Devin MUST be aligned similarly, to know what it 'shouldn't do'. Even if the evidence right now is that it's not particularly capable of working around certain problems, or it does correctly ask for permission under some circumstances, evidence from chat alignment suggests it won't be robust...
You're trying to align a piece of software to act as if it has intentionality (to know what it "shouldn't do"), under the belief that given sufficient resources it will develop intentionality. This is circular, and it is the exact problem in the first place.
The chatbots today are so well aligned that you have to bend over backwards to make them do something naughty, and not only that, they're so aligned that they have become useless for their intended purpose (Gemini). You're solving the wrong problem with the wrong approach here.
If Devin3 acted as Devin1 does, by saying "should I do X", and then the user decides to jailbreak it, that's not Devin's fault, that's the user's fault, and it is absolutely not a catastrophe by any means. The job of people making software should not be to try and second-guess humanity under made-up worst-case scenarios.
I worry we're talking past each other. Devin HAS intentionality. It is trying to satisfy the user's goal. And it has some training to know what it isn't supposed to do. It is also limited in what it is capable of doing. DevinV2 will be more capable, so less limited in what it could do.
My claim is that it is currently limited in what bad things it can do based on two things: capabilities and alignment. My claim is that it will become more capable, so that limitation is dropping over time. My other claim is that alignment is hard and that I would be quite surprised if when this is widely deployed and several orders of magnitude more users use it, that someone doesn't ACCIDENTALLY cause Devin to do something that one might call 'unethical'.
As I said above, I don't believe chat models are particularly well aligned, contrary to your position. If you believe otherwise, fine.
I also don't agree with your position that creating a general purpose coding agent with broad scale access to the internet without any sense of ethics is going to be a good idea for humanity. I agree with Zvi that I'm not particularly worried about DevinV1, though.
Recursive self-improvement is a nice thing. Recursive self-empowerment is activism.
We’re going to end up with one offensive and one defensive AI, which combined will consume all compute and return humanity to hunter-gatherers.