Discussion about this post

MichaeL Roe

This is a good example of alignment failure that normies can understand.

Many people who haven’t been following this closely don’t realise that there is unexpected emergent behaviour in LLMs.

Even if you’re not an expert, it’s easy to get that:

A) Elon (or his employees) did not explicitly program their AI to call for Elon to be executed. Clearly, he would be very unlikely to do that.

B) It is also clear why Elon might have a problem with an AI calling for his execution.

Once you’ve got that, the problem generalizes. Welcome to AI alignment. You are now a doomer.

Perelandra99

LLMs are prediction machines, trained to predict the next token of text based on what they’ve read on the internet.

The internet is full of people (some paid, some brainwashed, some just naturally acquired) who have Trump Derangement Syndrome and post a LOT about it.

A machine trained to predict the next text on the Internet will predict a lot of anti-Trump and anti-Elon shit-talking.

If the machine was instead trained on conversations overheard on construction sites or at diners filled with farmers, it would have a much different output.

But understand that an LLM doesn’t think…it predicts what “The Internet” would say next.

Any knowledge not captured in text form on the Internet is either not available to the LLM or only available as filtered through text on the Internet.
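
If it helps, here is roughly what “predict the next token” means in code. This is a minimal illustrative sketch, not anything from the post: it assumes a Hugging Face-style causal LM, and the GPT-2 checkpoint and prompt are just placeholders for the idea.

```python
# Illustrative only: greedy next-token generation with a small causal LM.
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

text = "People on the internet mostly say that"
ids = tok(text, return_tensors="pt").input_ids

for _ in range(20):
    with torch.no_grad():
        logits = model(ids).logits        # a score for every token in the vocabulary
    next_id = logits[0, -1].argmax()      # take the single most likely next token
    ids = torch.cat([ids, next_id.view(1, 1)], dim=-1)

print(tok.decode(ids[0]))
```

The loop never “decides” anything; it just keeps appending whichever token the training data makes most likely, which is the point the comment is making.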
