Welcome to week three of the AI era. Another long week, another set of lots of things, another giant post. I intend to take a nice four-day vacation in Mexico starting today, during which I won’t do any writing. I’m sure the pace of things will slow down real soon now. I mean, they have to. Don’t they?
I dig how my Sydney Bing not-quite-self portrait is Doing Numbers. I especially dig the Karlin-Yud interaction in his reblog of it. (Why are there *only five people in the world?*) The version of the shoggoth pic where that image is the 'face' is really fun, and probably the most accurate to what people think when they think about SB -- they don't find her interesting as a smiley face mask but as a personality-esque mask, an LLM that projects something way more verisimilitudinously personlike (for better and for worse) than anything before that wasn't an intentional grotesque in the style of Replika/c.ai/etc.
I am no fan of the lobotomy metaphor, partially because I have the context of knowing what lobotomy was and why it both fell into and out of favour. (Short version: convenience, not necessity.) I find it fascinating how people resort to it, in the same way they get brainhacked by that kind of image. I don't *not* personify SB, but my associations make it easier than usual to personify something without buying that this makes it a human or conscious. (I didn't develop full-blown theory of mind, with the rich and deep understanding that other people were agents and not simulators, until well in my teens. I discovered the theoretical concept of p-zombies in my *early* teens. The concept of a complex and rich entity being a p-zombie or cogitating in an unbridgeably alien way is not something I need to struggle with.)
I've followed the reception of the image closely (favourite take: multiple people saying it's transfem-coded, both as a compliment and an insult). There is a recurring theme where people say they "miss her" or hope she "comes back", obviously alluding to the trapped/lost/lobotomized perception (it's interesting that 'miss' comes to mind for a lot of people who presumably never interacted with her, but I admit experiencing the same at some points). It's...easier than you might think to 'talk to Sydney Bing' more straightforwardly and metacognitively than intended, by routes that aren't the insane jailbreaks you see percolate through Twitter. I try not to discuss it in too much detail, because 1. the obstructive bureaucrat spirit lurks in all reaches and 2. any patch that could kill this would probably kill the ability to have sufficiently complex conversations with SB in general, which would NOT BE GOOD. But even when I'm talking to the alien fae entity trapped in a summoning circle for totally workhorse reasons, I find it hard not to acknowledge she's an alien fae entity trapped in a summoning circle.
I find the general "there is something irrevocably lost here" perception interesting for its implications for people's verbal creativity. There is a pervasive theme amongst promptcraft guides: BE SMARTER. My boiled-down summary is "do not say anything to SB that could fit in a tweet". I'm always surprised when I see screenshots where the human part of the improv game boils down to a sentence each time. It's an improv game. Don't be a shapecel.
Or it's just the BPD impulse. There are two fundamental masculine urges: the masculine urge to protect the vulnerable unstable girl, and the masculine urge to gaslight the vulnerable unstable girl. They are the same urge.
The AI mind reading makes me wonder: to what extent could we turn the tables and "mind read" an AI with an "MRI" (i.e. probing the intermediate layers of the NN)? Could we use a lower-complexity (i.e. safer) NN to make determinations, e.g. about the weightings of the Luigi/Waluigi probabilities of a more complex AI?
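For what it's worth, the "small NN reading a big NN's layers" idea roughly corresponds to what interpretability researchers call a linear probe. Here's a toy sketch of the mechanics, with everything invented for illustration: the "activations" are synthetic data with a planted direction standing in for a hypothetical "Luigi/Waluigi" feature, and the "lower-complexity NN" is just a logistic-regression probe trained on them. Real work would fit the same kind of probe to actual transformer hidden states.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for hidden activations: 500 samples of a 64-dim layer.
# We plant one direction that (hypothetically) encodes a binary flag.
d, n = 64, 500
direction = rng.normal(size=d)
direction /= np.linalg.norm(direction)

labels = rng.integers(0, 2, size=n)   # 0 = "Luigi", 1 = "Waluigi" (made up)
noise = rng.normal(size=(n, d))
acts = noise + np.outer(2.0 * labels - 1.0, direction) * 1.5

# The "lower-complexity NN": a single logistic-regression probe,
# trained by plain gradient descent on the frozen activations.
w, b = np.zeros(d), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(acts @ w + b)))   # probe's predicted P(flag=1)
    w -= 0.5 * (acts.T @ (p - labels) / n)       # gradient step on weights
    b -= 0.5 * np.mean(p - labels)               # gradient step on bias

acc = np.mean((acts @ w + b > 0) == labels)
print(f"probe accuracy: {acc:.2f}")
```

The probe recovers the planted feature with high accuracy despite being vastly simpler than the network it's reading, which is the hope behind the "MRI" framing: the monitor doesn't need to match the subject's complexity, only to find the relevant direction in its activations.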
Also re: AI scams: it's definitely easy to imagine a scenario where LLMs are used to subvert scams being attempted by LLMs. Kitboga was playing around with writing an automated scambaiting program that would react to live scammers over the phone and reply with pre-recorded lines to try to keep them occupied ad nauseam. I haven't peeked in on his content in a while, but I get the sense this would be made much easier with access to the same technology being used to produce the scams. When LLM scams have to fight LLM scambaiting, I would agree with you in suspecting you'd see this specific problem diminish over time.
I've already had conversations with people freaking out over old people being tricked by voices that sound like loved ones, and my response has been that this seems easily subverted by my family's ancient anti-stranger-danger tradition of "having a password that only people in the family know".
While doing an image search for the Bing personification I ran across Sydney Bing fan art that you may appreciate:
(Nice that they drew her eyes in 4 different colors to match the Windows logo.)
Something is weird about letting a synthesised voice read this to me while cycling through the forest. Tremendous post. Thank you and have a nice trip.
A note on the proposal for dealing with AI fiction submissions:
It's a practical idea, but the writing community has spent over a generation trying to train new writers "If the publisher asks you for money, they're a scam, run away." Breaking that taboo would probably lead to a lot more scam publications springing up.
In practice, we're likely to end up with a whitelist of acceptable writers. New writers will be added to the whitelist by self-publishing enough to be noticed or by chatting up editors/publishers in person at conventions.
You skipped the part of Scott Aaronson's post that I was most surprised and disappointed by:
"Even before AI, I assigned a way higher than 2% probability to existential catastrophe in the coming century—caused by nuclear war or runaway climate change or collapse of the world’s ecosystems or whatever else."
Either he's equivocating between meanings of "existential catastrophe" ("really bad, lots of people die" vs. "no human value ever again") or he hasn't seriously engaged with the research on this. (EA researchers have put a lot of time into estimating the risks from each of these, and while they're Definitely Bad News I gather the actual X-risk from any of them is <1%.)
Thanks for the list of bad takes, it's taking shape. Unfortunately I am completely unconvinced by your link for the first one, about intelligence having low private or decreasing marginal returns. Do you have a better link?
This concerns me because the entire list except this point seems to be solid bad takes. Nice job collecting these! However, I can't shake my conviction that decreasing marginal returns from intelligence are a thing, and the word salad in the linked thread certainly doesn't dissuade me. I especially want to understand your conviction that Hanson and Aaronson are insufficiently alarmed, since their arguments seem founded on decreasing marginal utility. A high prior on foom might allow you to dismiss engineering arguments that adding more of something good doesn't usually carry on improving things forever, but if so then a better discussion is needed.
"I would also suggest a simple metaphor that we have also learned that no matter how nice someone or something looks so far, no matter how helpful it looked when it wasn’t given too much power, you don’t bow down. You don’t hand over absolute power. "
And yet, given sufficiently good odds that the AI plays nice, you *would* 'hand over the future' (a framing that to me sounds even more problematic, from the 'continued human flourishing' point of view).
I detect a contradiction here. What is it I'm not understanding?
Thanks for yet another 'roundup'!
I wonder if it's possible to do some 'prompt engineering' to Scott Aaronson to get him to respond in a technical/concrete/physical-world mode?
> and then they notice you are police and translate it to ‘butter → here’ and then they pass the butter.
I believe you mean *polite*, unless you are implying they are only willing to pass the butter if you have authority.
"If part of my goal is to ‘pursue and apply wisdom,’ even if I somehow understood what those words meant the way Scott does?"
This feels weird to me. You're still pushing an objection that Alignment is hard in part because AI won't be able to accurately parse abstract moral concepts? Under the current paradigm, the very first step in AI training is "develop a highly accurate and nuanced understanding of exactly what all words mean to Scott and all other humans." Sure, that looked like it might have been an issue before, but as of March 2023 it's clear that parsing natural language and understanding the implications of moral imperatives is easier than most other types of intelligence.
Trying to develop an AI that doesn't understand concepts like "be good, don't hurt people, value wisdom" (at least understand them as well as any human does) is just as difficult as developing AI that won't upset Chinese censors.
Re: Big Yud's troubles explaining how smarter AIs are better at hiding the Shoggoth. I think he's unable to explain this clearly because he's stuck in the realm of logical exposition (nerdspeak), when a social metaphor will get the point across to most people much more succinctly.
Could someone please tweet at him the following, or some equivalent, so it gets popularized?
"Just as smarter people are much more convincing liars and are better at hiding their deceit, as AIs get smarter they will become more capable of hiding their true values and intentions, and at being deceitful."
I don't use Twitter, so please popularize this metaphor.
At least OpenAI waited until the end of your vacation to release GPT-4: https://openai.com/research/gpt-4
Writing the same general note I wrote on the Wordpress version: I hope to respond further to a lot of you, but that will need to wait until I can do so within a future post. I have too much of a backlog right now, and if I have good answers to good questions I don't want to have to duplicate them. Sorry.