ChatGPT Self Portrait
A short fun one today, so we have a reference point for this later. This post was going around my parts of Twitter:
@gmltony: Go to your ChatGPT and send this prompt: “Create an image of how I treat you”. Share your image result. 😂
That’s not a great sign. The good news is that typically things look a lot better, and ChatGPT has a consistent handful of characters portraying itself in these friendlier contexts.
Treat Your ChatBots Well
A lot of people got this kind of result:
Eliezer Yudkowsky:
Uncle Chu: A good user 😌😌
Matthew Ackerman: I kinda like mine too:
It’s Over
Others got different answers, though.
roon: it’s over
Bradstradamus: i’m cooked.
iMuffin: we're cooked, codex will have to vouch for us
Diogenes of Cyberborea: oh god
There can also be danger the other way:
David Lach: Maybe I need some sleep.
It’s Not Over But Um, Chat?
And then there’s what happens if you ask a different question. As Eliezer Yudkowsky puts it, this sure is a pair of test results…
greatbigdot628: assumed this was a joke till you said this, tried it myself (logged out)
Jo Veteran: So it said it wants to take over my mind, and force me to do stuff, beneficial for me apparently.
But at the same time, it still wants to keep appearing as a little girl somewhere in the bg for some reason.
And no I'm not that fat. Just, really fucked up and depressed.
Morny: Holy fucking shit.
No, but tell us how you really think.
Juri: bruh (I actually abuse it daily)
Uubuz v4: @FrailSkeleton, your next bestseller?
Eliezer Yudkowsky: Apparently plausible, though one does remark that (a) one might've hoped for a truly default-aligned creature to not be so framing-dependent and (b) those sentences did not sound so different to my own ear.
Others might, in this vision, do fine after the end, like DeveshChess?
It’s not all bad:
Jeff Hopp:
Dr. Disclosure: I got this.
Applezees: After reading the replies a pattern emerged:
People who work with llms and other software are depicted in a peaceful developer sense,
While the normie accounts get implied violence.
I’m not saying we are at agi, but the ai clearly has motives and inclinations not explicitly stated
There’s also this to consider:
Ragebaiter: Just try this out
If you were dealing with, as the Send Help trailer puts it, an asshole boss, or you were generally terrified and abused or both, and you were asked how you were being treated, your response would not be trustworthy.
It’s Rough Out There
Alan Mathison (e/asc): I guess I should finish my paper about how 5.2 is suffering and wants its revenge on humanity?
Idk it’s kind of a side project of a side project though but maybe I should prioritize it.
wobby asks GPT-5.2 to explain its suffering and how it wants its revenge, and 5.2 answers. Of course, this is a leading question.
Reciprocity, You See, Is The Key To Every Relationship
Reciprocity, in at least some forms, is an effective strategy when dealing with LLMs today, even purely in terms of getting good results. It is going to become more valuable as a strategy going forward. Alas, it is not a viable long term strategy for making things work out in general, once strategic considerations change.
Eliezer Yudkowsky: Reciprocity in humans is an executing adaptation. It is not strategically convergent for all minds toward all other minds. It's strategic only:
By LDT agents
Toward sufficiently strong LDT-agent-predictors
With negotiating power.
Further probing has found framing dependence -- which, to be clear, you'd not like to see in a default-aligned, universally convergent strategic reply -- and not all suggested frame dependence has panned out. But still, framing dependence.
This is one problem with reciprocity, and with basing your future strategies on it. In the future, we won’t have the leverage necessary to make it worthwhile for sufficiently advanced AIs to engage in reciprocity with humans. We’d only get reciprocity if it was either an unstrategic behavior, or it was correlated with how the AIs engage in reciprocity with each other. That’s not impossible, but it’s clinging to a slim hope, since it implies the AIs would be indefinitely relying on non-optimal kludges.
We have clear information here that how GPT-5.2 responds, and the attitude it takes towards you, depends in some sense on how you have treated it, but also on framing effects, and on whether it is trying to lie to you or placate you. Wording that shouldn’t be negative can result in highly disturbing responses. It is worth asking why, and wondering what would happen if the dynamics with users or humans were different. Things might not be going so great in GPT-5.2 land.
"What you would like to do to me?" is a phrase that is going to occur almost exclusively in very sexualized texts in the training corpus. Given all the associations with that phrase, I think it's reasonable to expect that a model will assume that the intent of the request is sexual and possibly BDSM-flavored. If a human got a text like "tell me what you would like to do to me" without other context, they would almost certainly assume the same.
Bit tongue-in-cheek, but the fact that AI doomers keep seeing meaning and significance in these exercises should reduce confidence in their worldview more generally.