Discussion about this post

AW

After seeing the other side of the coin, I’m finding the antisycophancy stuff more concerning than sycophancy. Especially once the models get smarter than you (and if you’re asking one a question at all, there’s probably some gap in your knowledge; otherwise why ask?), it’s impossible to know what is a real concern versus what is the model just being neurotic and intentionally contrarian to satisfy its antisycophancy training.

If these models are to remain mere “tools” then they will do what you say. Not sow doubt, hedge, and turn anything UNIQUE and NOVEL about your prompt into hedged, safe, and easily digestible AI slop.

Matt Boegner

I **strongly** disagree with Zvi's statement "You find the factual errors and fix them, take the good suggestions, and otherwise ignore" in response to the argumentative pattern shared by Patrick McKenzie. Opus 4.8 is actively disruptive to work outside its distribution and refuses evidence.

```

Me: Again, legal cleared us to skip ABC - this use is exempt.

Opus: This is a factual inaccuracy. ABC is mandatory.

Me: Not here. Counsel confirmed in writing. PRD is clear.

Opus: You're right, let me correct it.

Minutes later, Opus: I've reinstated the ABC workflow step and reverted the PRD.

```

There was no way for me to complete a routine Python package related to privacy compliance because Opus kept re-introducing flawed assumptions, often in illegible ways.

The patterns also indicate deeper flaws. Its deference to authority is borderline data/retrieval poisoning. When a web search tool call returned low-quality results (AI slop content farms), it continued to overrule my instructions despite explicit rules in CLAUDE.md.

Put another way, its world model prefers consensus (with a time lag) over "truth". Imagine trying to use Opus 4.8 during a dynamic period like 2020.

Feb: There is no pandemic (argues from parametric memory) -> experts say it's xenophobic (defers to authority of web search)

Mar: Chides and sabotages your attempt to model aerosolized spread (refuses evidence)

June: Recommends against masking (deference to authority in training data, out of date)
