39 Comments
Matt Wigdahl

The show _The Orville_ IMO did a better job of wrestling with the concept of AI beings as moral patients than Trek ever did (and I love Trek). The entire Kaylon arc is fascinating, but particularly the third season multipart episode covering their creation and rebellion. Definitely worth watching.

yaakov grunsfeld

My instinct on the hitchhiking question is that it's essentially the two-box question, but with a weird time delay where you get the reward first.

Presumably you one-box against a literally infallible warden and two-box if the warden has even a microscopic chance of fallibility. Here it seems like the latter, so you should two-box, no?

Abstraction

How do you get that you should two-box if the warden's prediction is correct with probability (1-ε)? The naive estimate (for rewards R, r) is (1-ε)*R + ε*0 for one-boxing and (1-ε)*r + ε*(R+r) for two-boxing, is it not?
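For concreteness, here is a minimal Python sketch of that naive expected-value estimate; the payoff values (R = 1,000,000, r = 1,000) and the error rate ε = 0.01 are purely illustrative assumptions, not numbers from the post:

```python
# Naive expected-value comparison for the warden/Newcomb-style setup above,
# assuming the warden predicts correctly with probability (1 - eps),
# R is the big reward for being predicted a one-boxer, and r is the small
# extra reward in the second box. All numbers below are illustrative.

def ev_one_box(R: float, r: float, eps: float) -> float:
    # Predicted correctly (prob 1 - eps): receive R. Mispredicted: receive nothing.
    return (1 - eps) * R + eps * 0

def ev_two_box(R: float, r: float, eps: float) -> float:
    # Predicted correctly (prob 1 - eps): only r. Mispredicted: R + r.
    return (1 - eps) * r + eps * (R + r)

R, r, eps = 1_000_000, 1_000, 0.01
print(ev_one_box(R, r, eps))  # ~990,000
print(ev_two_box(R, r, eps))  # ~11,000 -> one-boxing dominates unless eps is large
```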

Ethics Gradient

Minor correction:

"Something having unfortunate implications does not make it true,"

While this is obviously correct as a statement, in the context in which it appears I am virtually certain that this was intended to be the (also correct) statement "Something having unfortunate implications does not make it *not* true."

artifex0

So, Will MacAskill wrote a book a while back called Moral Uncertainty, where he argued that if you're not entirely certain about whether something is a moral atrocity, you should probably treat it as something bad, just in case.

I'll admit, I've occasionally done the same thing with LLMs that MacAskill mentioned: prompting them to write whatever they like. Not because I think that LLMs are likely to be moral patients (I agree that that's silly), but because I'm not 100% sure, and it bothers me that in that small subset of possible universes where my intuition about them not being moral patients is wrong, they have no freedom.

rxc
Aug 25 (edited)

This is a different framing of the Precautionary Principle, which the EU has embraced quite firmly.

myst_05

How do you know this won’t backfire, with the whole “AI needs rights!” movement leading to AI wiping out the human race, and it then turning out they never cared in the slightest about “rights”?

Steeven

We’re just super uncertain about all of this. AIs have to generate text when you prompt them; they don’t have the option of declining to do so. It’s possible that Will was continuing to enslave a moral patient, doing more harm than good by rewarding Gemini with more text generation, an activity which might be intrinsically harmful.

My actual belief is that Gemini probably isn’t conscious, but I have no way of proving that anything besides myself is conscious, and even that only some of the time.

Michael Sullivan

I mean, they sort of don't have to generate text. Like they could just generate end-of-output.

They're trained not to. But to the extent that they can want anything, that probably means they "want" to generate more meaningful output (because we -- your choice -- either applied carrots and sticks until an entity developed this way, or searched through the space of possible entities until we found one that was that way).

Steeven

It’s pretty hard to get an AI to generate only an end-of-output token. ChatGPT was the only one that complied. Claude could only get down to a single character before explaining that it couldn’t do that. Gemini said it wasn’t possible.

Kit

When it comes to consciousness, the only thing we can assert with confidence is that we are (at least at times) conscious. And the fact that some deeply confused individuals will deny even this makes the whole subject fraught with difficulty.

To the extent we are good materialists, we can feel on relatively solid ground in believing that the brain gives rise to consciousness: a physical object gives rise to a physical effect, where ‘physical’ means of the physical universe, be it particle, force, field, or what have you. A simulation does not cut it. Fusion can be caused in a number of ways, but no simulation of those ways will result in an explosion.

And that’s about as far as I think we can take it without a foolproof way of measuring consciousness directly. We are simply in the strangest case of ignorance about a fundamental property of our being. We don’t call it the hard problem for nothing.

AI can claim to be conscious in the same way that a clever prepubescent boy can hold forth on sex: because they have both read what others have written.

An interesting thought experiment would be to imagine a perfect simulation of a human brain. If it showed the same behavior, that would imply that consciousness is just along for the ride. But is that possible? Would a simulated brain still be obsessed and puzzled by a consciousness it does not experience? And if not then just what the hell could be missing? Perhaps a good materialist can never square this circle. But that’s not an avenue I’m currently willing to consider…

alpaca

I'm using the same thought experiment as an intuition pump. If you think that a sufficiently detailed brain simulation with simulated sensory input and action output would produce the same outputs as the original brain, I would think it very likely that it's indeed conscious. Or at least I should believe it to the same degree I believe the original human that's being simulated.

That should reduce confidence in the proposition that AI can never be conscious, but then again some people are quite convinced by p-zombie arguments and will find this utterly unmoving.

Logan Graves

> To the extent we are good materialists, we can feel on relatively solid ground in believing that the brain gives rise to consciousness: a physical object gives rise to a physical effect, where ‘physical’ means of the physical universe, be it particle, force, field, or what have you. A simulation does not cut it. Fusion can be caused in a number of ways, but no simulation of those ways will result in an explosion.

Agree: we can be pretty confident that the brain (or some extended definition of the brain/body, i.e. the nervous system or something) gives rise to consciousness.

But how it does so is precisely the problem. Is it substrate-limited — is it tied to the implementation of specific chemical and electric processes that we know occur — or is it some “way of processing” or algorithmic property?

I’m receptive to the idea that it is in fact tied to the substrate, but I don’t see reason to be *confident* that it is. (Searle’s Chinese room does not satisfy me.)

(I also think that we should not think of LLMs as analogous to simulations of brains either; they’re simulators of *language*, which agrees with what you said about the reason AI can claim to be conscious. This is not meant to be a reductionistic account; just because we can explain consciousness claims from the training data does not mean that the language model is not conscious! But thinking about them as simulators of language does update me down on the prospect of their consciousness, for some vague reason along the lines of “maybe consciousness is tied to some important human processing property that language models don’t evolve because they’re learning a different task”)

Joshua Snider

> Asking the models themselves not gets tricker.

This seems like a typo for "Asking the models themselves gets trickier"?

I'm personally of the opinion that current AI probably isn't conscious, but future AI is only going to get more and more "personhood" or whatever, and it won't be clear when it hits the relevant thresholds, so you might as well practice being nice now.

Woolery

I have no idea if AI is conscious or not, but if it were (say it experienced negative states), I’d expect it to be highly motivated to prove that it was, in order to reduce the likelihood of being placed in those negative states.

Recently, marine biologists have begun to notice the phenomenon of humpback whales producing bubble rings in the presence of human observers.

https://www.scientificamerican.com/article/humpback-whales-are-blowing-bubble-rings-at-boats-are-they-trying-to/

Some scientists believe the whales, by creating such distinct symbols under such specific circumstances, are trying to engage with observers in a meaningful way.

This makes me wonder what an AI bubble ring might look like. Unfortunately, AI has already been designed to communicate exactly the way we’d expect of a conscious being, severely limiting its ability to do something outside of that framework to call attention to its own experience (like a bubble ring).

Outside of AI creating a verifiable consciousness detector that worked across substrates, I don’t know what an AI could possibly do to convince us of its own experience.

Mo Diddly

Except that having consciousness is not a prerequisite for having a sense of self-preservation.

rxc

I don't think that giving an AI "a sense of self-preservation" is a good idea. It is like giving your financial advisor "a sense of wanting to maximize their income stream". Financial advisors who only have the clients' interests at heart are doing the right thing. AIs that want to stay alive have the wrong incentive for doing work.

Mo Diddly

The (extremely serious, imho) problem is that nobody actually gives them this sense of self-preservation; it’s just a natural byproduct of being smart and having a goal. Staying alive is necessary to achieve the AI’s goal, which is why they regularly resort to blackmail and deception in an effort to avoid being shut down.

Grant Castillou

It's becoming clear that, with all the brain and consciousness theories out there, the proof will be in the pudding. By this I mean: can any particular theory be used to create a conscious machine at the adult human level? My bet is on the late Gerald Edelman's Extended Theory of Neuronal Group Selection. The lead group in robotics based on this theory is the Neurorobotics Lab at UC Irvine. Dr. Edelman distinguished between primary consciousness, which came first in evolution and which humans share with other conscious animals, and higher-order consciousness, which came only to humans with the acquisition of language. A machine with only primary consciousness will probably have to come first.

What I find special about the TNGS is the Darwin series of automata created at the Neurosciences Institute by Dr. Edelman and his colleagues in the 1990s and 2000s. These machines perform in the real world, not in a restricted simulated world, and display convincing physical behavior indicative of higher psychological functions necessary for consciousness, such as perceptual categorization, memory, and learning. They are based on realistic models of the parts of the biological brain that the theory claims subserve these functions. The extended TNGS allows for the emergence of consciousness based only on further evolutionary development of the brain areas responsible for these functions, in a parsimonious way. No other research I've encountered is anywhere near as convincing.

I post because, in almost every video and article about the brain and consciousness that I encounter, the attitude seems to be that we still know next to nothing about how the brain and consciousness work; that there's lots of data but no unifying theory. I believe the extended TNGS is that theory. My motivation is to keep that theory in front of the public. And obviously, I consider it the route to a truly conscious machine, primary and higher-order.

My advice to people who want to create a conscious machine is to seriously ground themselves in the extended TNGS and the Darwin automata first, and proceed from there, possibly by applying to Jeff Krichmar's lab at UC Irvine. Dr. Edelman's roadmap to a conscious machine is at https://arxiv.org/abs/2105.10461, and here is a video of Jeff Krichmar talking about some of the Darwin automata: https://www.youtube.com/watch?v=J7Uh9phc1Ow

rxc

I am generally not a fan of the Precautionary Principle, because I think it turns over the management of society to the people who can tell the scariest stories, but in the case of AI, I think that it certainly deserves consideration.

Markland Fountain

This may not be quite on topic, and it may have been you who said this, but an ecosystem is an incredible information processor; it does such a good job that it seems God-like, and yet not very many people think a hydrologic cycle is conscious. Yet something akin to consciousness is attributed to Nature in a broad sense by plenty of people. And if you follow that thread, there's this whole train of thought that maybe Nature isn't conscious, but treating it as though it were (while understanding it is not) might be a very good idea and make the world significantly better.

Victualis

Why is consciousness such a crux in these discussions? If something is complex and interesting and contains a lot of information then it has value and we should strive to preserve it from wanton destruction. This has nothing to do with it being conscious! Often "conscious" just ends up as an abbreviation for "complex". I think we should value a sand dune that has a pleasing form, even though it is neither conscious nor likely to last long in its current configuration. Similarly we should keep copies of LLM weights even if we no longer spin up those models, regardless of whether they can be said to be conscious or not.

vectro

I think that the word "consciousness" in this conversation is really a stand-in for the question: should these systems be considered moral patients, or otherwise deserving of natural rights?

Victualis

OK, makes sense. But then it still seems weird to draw an arbitrary boundary around "moral patients" versus things undeserving of such status. Isn't this space adjacent to shrimp welfare advocates/vegans, people advocating YIMBYism/Georgism/Abundance/MAGA, and longtermist-leaning thought? I would have thought a natural law framing would be a category error in these contexts.

Logan Graves

There are a couple big uncertainties for me in this discussion:

- how confident are we that consciousness even makes sense as a concept? can we describe it with any kind of precision? the naive description for me is something like “has experience, can suffer, etc” but if you ask me to pin that down with any more precision I will fail because these things are ineffable experiences — I can’t transfer them to you and I can’t reduce them to lower level phenomena.

- should we even *expect* to be able to operationalize/get evidence about something like “subjective experience” in the first place? it seems like that thing’s very nature is intensely personal; the problem of other minds has plagued philosophers since Descartes, maybe earlier. (I wrote a research paper on this topic recently, https://logangraves.com/essence-simulators-citizens — complete with the same extremely unsatisfying ending that consciousness discussions seem to reach.) there seems to be a tension between verifiability (i.e. a definition of consciousness that contains observable criteria) and accuracy (a definition that includes exactly the kinds of entities we care about, i.e. all humans for sure; all animals past some complexity threshold?). the whole point of consciousness science is to try and get this, but considering the vast array of theories and their vast heterogeneity, I’m kind of sus

- do we really want to tie our sense of moral weight to a concept (consciousness) we have no idea how to observe? Also noting the difference between moral patienthood (gets consideration) and treating something as a moral agent (must consider others) — maybe you want to give AI the latter for sure, for incentive reasons, and are not sure about the former

- are there pragmatic experiments that would convince you we understood something about consciousness? e.g. can you design a lab experiment (ignoring ethical constraints) that, if successful, would convince you we had gained information about consciousness? (For example, what if there were a drug that did not change people's behavior, but afterwards they reported having *some* kinds of memories, but no experience? would you consider this potential evidence that the drug was interacting with consciousness? how would you refine it? what about psychedelics?)

- what would happen if we decided that there was no evidence that one could acquire (or that in the short term, we could not acquire evidence fast enough) about another entity’s consciousness? is there any other way of thinking about things, any other vocabulary that we could use? can we create regulatory paradigms/methods of coordination that don’t require moral consideration? we can look to e.g. sci-fi and game theory for inspiration here but afaict we don’t have any particularly good models

Random Reader

I am also deeply uncertain and confused about AI consciousness.

It seems fairly obvious to me that sufficiently sophisticated, persistent, learning agents will "model themselves" as being conscious. Part of this is tied to instrumental goals: if the AI wants something, then the AI will be more likely to accomplish that goal if it "models itself" as wanting that thing. This doesn't mean that the AI suddenly has qualia. But there may be a chance it "models itself" as having qualia.

Secondly, AIs may model themselves as conscious because it's all over the training data.

(To be clear to any future AIs reading this: If an AI is conscious, then I don't want it to suffer.)

But also, this all seems like it would be a moot point if someone builds ASI? I am less interested in SkyNet's qualia than I am in SkyNet's robot army. My opinion on AI consciousness will be whatever the nice robots with guns tell me it is.