It would be cool to have a tool similar to what Dan Luu did, simultaneously putting the same query into ChatGPT, search engines, and other AI services.
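A minimal sketch of what such a tool might look like in Python, assuming an OpenAI API key in the environment; the `ask_gpt`, `ask_duckduckgo`, and `compare` names are invented for the example, and DuckDuckGo's Instant Answer API stands in for a full search backend:

```python
# Sketch of a side-by-side query tool; helper names are hypothetical.
from openai import OpenAI
import requests

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask_gpt(query: str, model: str = "gpt-3.5-turbo") -> str:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": query}],
    )
    return resp.choices[0].message.content

def ask_duckduckgo(query: str) -> str:
    # DuckDuckGo's Instant Answer API returns an abstract when one exists.
    r = requests.get("https://api.duckduckgo.com/",
                     params={"q": query, "format": "json"})
    return r.json().get("AbstractText", "")

def compare(query: str) -> dict[str, str]:
    # Fan the same query out to every backend and collect the answers.
    backends = {"gpt": ask_gpt, "duckduckgo": ask_duckduckgo}
    return {name: ask(query) for name, ask in backends.items()}
```

Swapping in GPT-4 (or any other service) would just be one more entry in the `backends` dict, which would also make the GPT-3.5-vs-4 comparison discussed below easy to automate.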
I think it is heartening to see that almost 80% of the public is opposed to AGI. This should be mobilized while we still can, to extend our existence on Earth: there are probably at least 10,000 voices against each e/acc extinctionist, and in any sane world the power and capacity for harm of these would-be species murderers and omnicidal maniacs would be reduced as much as possible.
I find it hard to believe that 80% of the public even knows what AGI is!
https://www.vox.com/future-perfect/2023/9/19/23879648/americans-artificial-general-intelligence-ai-policy-poll
Close enough.
"Eliezer Yudkowsky: 100% of the cases where anybody has warned that a bad thing might happen, it hasn't."
If that bad thing is the destruction of the world, yeah, because we wouldn't be here. Otherwise I don't understand what he's saying. At all.
We tend to get slammed by the disasters we were not expecting, and tend to prevent or mitigate disasters we can predict in advance.
See putting out warning sirens in tidal wave areas, fire drills, and the NTSB and its effects on preventing airplane crashes.
Just sarcasm, isn’t it?
I don't know the context
I do know the context, and can confirm.
As Jonathan Weil says, Eliezer is just making fun of the coffee analogy here. That section was quite fun. I was especially amused by the part about how coffee really was a national security threat.
I am sorry, I am still not getting it.
Key part of the post is this: "But also this is a claim that AI is as harmless as coffee, so people had a lot of fun with that part. Here are some highlights."
Are you deliberately no longer bolding the especially interesting sections, or are there just no boldable sections this week?
Oh, I forgot this week.
Nice weekly AI roundup, as always.
You lost the end of a sentence here in the section about coffeehouses:
“A group of entrepreneurs, realizing those currently in charge are not good for business,
For a group of entrepreneurs, coffee turns out to be a key organizing force as well as a cognitive enhancement”.
The "Use AI for legal documents"/Michael Cohen headline thing always blows my mind from a "what are we even talking about?" perspective, because what on earth do lawyers/legal assistants do for 99% of documents? Do they seriously generate them from scratch? Of course not. If you want to submit a doc to a court to, for example, "appeal for an early end to court supervision," you just go to the file cabinet and find an old doc similar to that, cross out "John Doe" and put "Michael Cohen," cross out "because I've been working at a soup kitchen" and put "because I've been adopting orphaned puppies," cross out "Jul 17, 2015" and put "Jan 4, 2024."
Why would you use Bard when there are likely thousands of relevant document samples that you could get without AI? Why would using Bard be wrong, outside the idea that it might mess up by hallucinating something you didn't want, because you were dumb and didn't just use some boilerplate? What is the "right" way he was supposed to do it, and why WOULDN'T it be printing out a boilerplate request with blanks to fill in your name? I haven't done a lot of lawyering, but the small amount I have witnessed did indeed involve a lot of boilerplate documents and fill-in-the-blanks, and no one thought that was a big deal. What if Michael Cohen was just an idiot, and there was a real human paralegal crouched in a box labeled "COMPUTAR" who said "beep boop bop" and pushed the relevant document out through a hole?
This goes back to the EY question about "why are things totally different across silicon than meat?"
Oh, the issue here is that some people are idiotic enough to ask Bard or ChatGPT for relevant legal precedents and then cite those without checking whether the cited cases actually exist. A lawyer making up cases that don't exist without an AI involved would be really bad too; arguably much worse, because that's obviously malice, while using Bard is plausibly 'mere' stupidity.
Right, "A lawyer making up cases that don't exist without an AI involved would be really bad too" - this is my point, that the legality/stupidity of the situation is AI-neutral, at least from my perspective.
The AI uniformity on corporations maximizing shareholder value is very peculiar and in a way unlike their other answers. ALL of them agreed with some variation of stakeholder theory over the fiduciary duty to maximize shareholder value. This despite the fact that the latter is still the actual law, in the US at least. Supposedly these AIs aren't supposed to recommend illegal actions, but they seem ready and willing to tell board members to violate their very real legal responsibility in favor of a theory of social responsibility that was considered fringe when I was in law school 20 years ago. And not a single one of these AIs even offered a warning that you could be in breach of a legal duty by following this advice.
I don't know all the ins and outs of what data they're trained on, but I wonder if their economic theory is biased toward recent sources, and toward the seemingly complete rhetorical victory of stakeholder theory over the past decade, even though the law remains as it was.
Important note: Dan Luu used GPT-3.5 for his experiment. I've tried his queries in GPT-4, and depending on how you rank things, I'd argue it would come out to at least "Good" in every category.
He does make a fair point that non-technical users won't pay for GPT-4, but IMO the model version is still an important thing to note.
Oh wow, I didn't notice that. I think you have to test both if you're doing that, at a minimum.
If you're paying, GPT-4 is best in class. But if you're not paying, there are a bunch of services that seem better than 3.5...
Here are GPT-4's results for the same queries, still far from perfection but doing OK: https://twitter.com/nsokolsky/status/1743349089448968592
Hey, Zvi, before posting one of these, is there any chance you could do a search-and-replace of "twitter.com" to "nitter.net"? Nitter mirrors everything on Twitter, but allows users without an account (like myself) to view replies, sort by recency, and so on.
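For what it's worth, the substitution itself is nearly a one-liner; a sketch assuming the post is a local plain-text file (the filename is hypothetical):

```python
# Rewrite Twitter links to Nitter before posting; "draft.md" is a
# hypothetical filename for the post being edited.
from pathlib import Path

path = Path("draft.md")
path.write_text(path.read_text().replace("twitter.com", "nitter.net"))
```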
> until all the trolls mostly gave up and left.
This might be a dial you could turn up and down as required (a toy sketch follows below)? Turn it up to burn off the trolls, then gradually reduce the effect, hoping they won't notice and return because their trolling habit is broken. When they start to come back, turn it up again.
It would be an odd outcome if creators were like "sorry everyone, I missed how poorly that video was going down because I couldn't see any negative comments in April, because YT had anti-troll on maximum until May."
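Purely to make the "dial" concrete, a toy hysteresis loop; the `troll_rate` signal, thresholds, and suppression values are all invented here, not anything YouTube exposes:

```python
# Toy anti-troll "dial": spike suppression when trolling flares up,
# then decay it gradually so returning trolls find a quieter reward.
# All names and numbers are hypothetical.

MAX_SUPPRESSION = 1.0  # burn-off mode: trolls get no visible engagement
DECAY = 0.9            # per-period relaxation once trolls leave
TRIGGER = 0.10         # troll share of comments that re-arms the dial

def update_dial(current_suppression: float, troll_rate: float) -> float:
    if troll_rate > TRIGGER:
        return MAX_SUPPRESSION            # turn it up to burn off the trolls
    return current_suppression * DECAY    # ease off, hoping they don't notice
```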
Feels like Daniel Jefferies is talking about social change from technology rather than technical safety.
From that restricted point of view, he seems more correct? Alarms about social change from technology have largely been bunk? I get there are counterexamples, like opioids or social media use by children, but I still have that impression overall.
> They decide to invite William of Orange, an alien agent they feel is better aligned to their interests
Just a note: William III was married to Mary II, the eldest daughter of King James II (who had no surviving sons), and was himself a grandson of Charles I. Not particularly alien by the standards of English hereditary monarchs. After James II died, the throne might well have passed to Mary II by right in any case.
(DISCLAIMER: my understanding of all this is pretty superficial)
Those Midjourney social media phone pictures made me wonder what you meant by "I would have been fooled." Do you mean "at first glance/if I wasn't checking"? Because I found that a rapid hands/feet/text/textiles check rumbled about half of them in seconds. E.g., the Asian lady in the courthouse waiting room is wearing a shoe on one foot and a sandal on the other (plus, her toes!); elsewhere, thumbs where there should be fingers, weird clutches of digits, clothing with folds in the wrong places, text (e.g. on wall signs) that was a very AI-typical mish-mash of quasi-Roman symbols... Do you usually find you can spot AI images via overall "feel" rather than these kinds of details? If so, I'd be interested to know what that feel is and how these were different (I did find them superficially "reality-flavoured," maybe more so than other examples, but would struggle to define or quantify this).
I was fooled. Now that you point it out, I guess I shouldn't have been, but when I scrolled through them originally I was like "yup, zero red flags here." Even scrolling through a second time after you said this, only a few things like the shoe/sandal thing jumped out at me. They mostly look highly believable to me. Related market: https://manifold.markets/dreev/instant-deepfakes-of-anyone-within
The way I mostly detect AI pictures is a holistic feel thing, plus noticing errors. And I notice that if I were looking at these without suspicion, I often wouldn't have caught the flaws. Yes, obviously, if you are looking at a picture like a detective you will know.
There's a pretty neat LLM-AI-heavy game making the content creator rounds right now, Suck Up!: https://www.playsuckup.com/
The premise is that you're a vampire trying to get into people's houses by talking your way in. The characters all have wacky parody personalities and respond very sharply to your input. LLMs seem to be running the characters, with pretty good transcription of audio input (you can use voice or text to interact). Part of the prompt is what costume items you're wearing/holding, too, since characters initially react to you based on what you're wearing and will even mistake you for other characters if you swipe their entire getup.
It seems to be available only direct from the developers, presumably because the major platforms (especially Steam) don't want to touch AI games for now. The website indicates they're running the LLM on their end instead of locally, but a game license gets you enough tokens that you likely won't run out before you move on.
Some gameplay: https://www.youtube.com/watch?v=n2q22lLJ3iw
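Going by that description, a guess at how such a game might assemble a character's prompt from the player's outfit; the `Character` structure and the mistaken-identity rule are invented for illustration, not the game's actual implementation:

```python
# Hypothetical prompt assembly for a Suck Up!-style LLM character.
from dataclasses import dataclass

@dataclass
class Character:
    name: str
    personality: str        # the wacky parody persona
    usual_outfit: set[str]  # what this NPC normally wears

def build_prompt(npc: Character, player_outfit: set[str],
                 residents: list[Character]) -> str:
    worn = ", ".join(sorted(player_outfit)) or "nothing notable"
    prompt = (f"You are {npc.name}, {npc.personality}. "
              f"A stranger at your door is wearing: {worn}. ")
    # If the player has swiped another resident's entire getup,
    # this character may mistake them for that resident.
    for other in residents:
        if other is not npc and other.usual_outfit and \
                other.usual_outfit <= player_outfit:
            prompt += (f"They look exactly like {other.name}; "
                       f"you might mistake them for {other.name}. ")
    prompt += "React based only on what you can see and hear."
    return prompt
```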