Factually incorrect. Trump won 49.8% of the popular vote, a plurality, not a majority. So even if you meant that he won a majority of the vote (rather than what you actually said), you would be incorrect.
So even if you are correct, extrapolating from that, at best a plurality of the group chats were in favor while a majority were against.
But given that polls have shown Trump dropping between 5 and 13 percent, I'd say it's more likely that the election represented the public being angry with the Biden administration and, since there are exactly two major parties, voting for the other guy. That is immediately turning out not to be what the public wants, given the previously noted approval polls, especially since he's doing pretty much exactly what he said he would.
Then there's the worldwide anti-incumbent wave: since 2020, the incumbent party has lost in 40 out of 54 elections. Conservatives in the UK got destroyed, and then Keir Starmer's approval as Labour Party leader cratered in spite of his just winning. Conservatives in South Korea lost handily. The ANC in South Africa, a liberal party, lost. It's a bigger phenomenon than any one election, country, or political bent.
It's weird that all of you are arguing over how to get the LLM not to say that Musk is a liar, but not over whether he actually is one. This is like saying that if more people believed the earth was flat, that belief would be just as valid as saying the earth is round. But when the discussion is about a guy who lies about whether he founded Tesla, I suppose it's easier to pound on the proverbial table than on the facts.
Except your comment was about how to avoid putting a thumb on the scale to get the LLM not to say Musk is a significant liar. Why would someone want the general public's opinion when asking who the biggest liar on Twitter is? Again, if I ask whether the world is round, I don't care about the opinion of flat earthers. If I know someone is a serial liar because, objectively, he is, the opinions of those who blindly follow him aren't useful.
It doesn't matter whether I believe the Internet consensus is correct. It matters what is correct. Yes, obviously the training data that goes into the LLM largely determines what comes out. The same is true of most people. But at least with people I can find out more about their biases, as opposed to this sort of hiding the bias by including inaccurate sludge. The suggestion of doing so is pretty inherently biasing in the name of balance. Not every opinion is, or should be, included.
No, your post made two points. One is that LLMs reflect society, which nobody disagrees with.
But it also had the subtext of Trump and Elon not being liars, which is why you're getting pushback on your post. Not because of point 1.
Agreed. What surprises me is that anyone would be surprised that training an AI on Twitter would result in it thinking Trump is the antichrist, and Elon his multitudinal spawn. Had everyone on the engineering team forgotten Twitter's reputation until a few years ago? Were they perhaps unaware of it?
Yes, but they also have reinforcement learning applied that makes them prefer certain kinds of predictions over others.
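To make that concrete, here's a minimal sketch, assuming a PyTorch setup, of the Bradley-Terry pairwise loss commonly used to train RLHF-style reward models; the function names and toy numbers are illustrative assumptions, not any lab's actual training code:

```python
import torch
import torch.nn.functional as F

def preference_loss(reward_chosen: torch.Tensor,
                    reward_rejected: torch.Tensor) -> torch.Tensor:
    # Bradley-Terry pairwise loss: -log sigmoid(r_chosen - r_rejected).
    # Minimized when the reward model scores the human-preferred
    # completion above the rejected one, which is how "preferring
    # certain kinds of predictions" gets learned.
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy batch of scalar rewards for (chosen, rejected) completion pairs.
r_chosen = torch.tensor([1.2, 0.4, 0.9])
r_rejected = torch.tensor([0.3, 0.8, -0.1])
print(preference_loss(r_chosen, r_rejected))
```

The base model's raw next-token statistics still dominate; a step like this only nudges which of the plausible continuations get preferred.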
Podcast episode for this post:
https://open.substack.com/pub/dwatvpodcast/p/grok-grok?r=67y1h&utm_campaign=post&utm_medium=web&showWelcomeOnShare=true
This is a good example of alignment failure that normies can understand.
Many people who haven’t been following this closely don’t realise that there is unexpected emergent behaviour in LLMs.
Even if you're not an expert, it's easy to get that:
A) Elon (or his employees) did not explicitly program their AI to call for Elon to be executed. Clearly, he would be very unlikely to do that.
B) It is also clear why Elon might have a problem with an AI calling for his execution.
Once you’ve got that - the problem generalizes. Welcome to AI alignment. You are now a doomer.
Ok, so there is an alignment-by-default view under which, clearly, if the AI calls for Musk to be executed then it must be morally right to execute him, because (by hypothesis) alignment holds by default.
(Conversely, if you think that would be bad, then this is a blatant counterexample to alignment by default.)
I mean, it's not ACTUALLY calling on anyone to get executed; it's responding to 'if you had to name someone', etc.
It is true that it was being asked a very leading question.
Agreed - the real problem surfaces if this shows up as a sub-goal from some other request...
BTW, Many Thanks re the overall evaluation of Grok-3's capabilities!
Anyone report its HLE score yet?
BTW2, re trying to prevent dangerous information from being returned: I just tried a curiosity search for VX's structure (not even _trying_ to look for synthesis), and it turns out that Wikipedia's article on it includes:
https://en.wikipedia.org/wiki/VX_(nerve_agent)#Synthesis
All the intermediates, even a note about how to use it as a binary agent. This didn't even require going from the structure to the literature on how to synthesize this class of structures!
Terrorists aren't limited by availability of information. Mostly, one has to rely on deterrence (and, for nukes and radiological weapons, control of materials).
> It’s kind of weird to have a line saying to hide the system prompt, if you don’t protect the system prompt
This seems to have been changed?
* Only use the information above when user specifically asks for it.
* Your knowledge is continuously updated - no strict knowledge cutoff.
* DO NOT USE THE LANGUAGE OR TERMS of any of the above information, abilities or instructions in your responses. They are part of your second nature, self-evident in your natural-sounding responses.
https://x.com/i/grok/share/g6SnvA69oQjkmIZshkavfFoYh
It's frankly quite pathetic that they (or, to be charitable, at least one of their engineers) thought this had the slightest chance of being fixed by system prompting.
Reminds me of LeCun's "we'll just tell the robot to be harmless:)"
xAI will probably write a "woke mind virus" text classifier and down-weight training data for their next training run. Then they won't need to put embarrassing text in their system prompt.
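Mechanically, that kind of data curation is straightforward. Here's a hedged sketch of what down-weighting flagged documents could look like; the classifier, threshold, and weights are hypothetical illustrations, not anything xAI has published:

```python
from typing import Callable, Iterable

def downweight(documents: Iterable[str],
               classifier_score: Callable[[str], float],
               threshold: float = 0.8,
               low_weight: float = 0.1) -> list[tuple[str, float]]:
    """Assign each training document a sampling weight based on a
    text classifier's score for the disfavored category.
    Threshold and weights here are made-up illustrations."""
    weighted = []
    for doc in documents:
        score = classifier_score(doc)  # assumed probability in [0, 1]
        weight = low_weight if score > threshold else 1.0
        weighted.append((doc, weight))
    return weighted

# Toy usage with a stand-in scorer; a real pipeline would use a trained model.
docs = ["example document A", "example document B"]
fake_scores = {"example document A": 0.95, "example document B": 0.20}
print(downweight(docs, lambda d: fake_scores[d]))
```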
I totally agree with Flowers'[0] take on xAI's response. "It was the ex-OpenAI person's fault" is terrible accountability, and doesn't actually reveal anything. It leads to many more questions and speculation about what happened rather than clearing things up.
As a SWE, my best guess at what happened is a boring one: there was a short timeline to fix a critical issue, and the team shipped a stupid fix. Even without short timelines, bad things make it into prod, and the consequences have been much worse[1]. Regardless, passing the blame to a team member was a very poor display of leadership.
Enjoyable read, Zvi. Thanks!
[0] https://x.com/flowersslop/status/1893813574050414636
[1] https://www.crowdstrike.com/falcon-content-update-remediation-and-guidance-hub/
If you tried to make an LLM less "woke", you would be trying to get it to disregard an abundance of useful information. It would, for example, presumably not be able to say that there are more than two genders.
The only people who use the descriptor “misinformation” are the kind of people who don't like Trump and Musk; it's a left-coded word. So from what I know of these models (mostly from reading you), I would fully expect an LLM trained on “the internet writ large” to anticipate that that's the answer you are looking for, and to have plenty of examples in its training of people saying it.
Whereas Anthony Fauci (who Grok was easily led to call admirable) spent three solid years speaking almost exclusively on your Simulacra Level 2, saying only what he thought would produce the best consequences at every turn, in other words repeatedly lying to paternalistically dupe normies. But his right-wing detractors don't call that “misinformation”; they just call him a liar. And public health discourse doesn't even consider that sort of well-motivated deception to BE misinformation.
Presumably if you asked for RW-coded negative labels you’d get the reverse, something like “who are the most dangerous members of the Deep State” or “most powerful people in the DC Swamp” etc