8 Comments
Hollis Robbins's avatar

I think it's a big improvement; I'll be writing about it tomorrow.

Austin Morrissey's avatar

Anecdotally, I’ve noticed the model is highly sensitive to anything related to RNA, but it lacks the context awareness to block queries and pipelines involved in making RNA drugs. Some of the key terms include plasmid, RNA target insert, and nucleotide therapeutic.

(Yet where these terms would matter for chem bio risk uplift, it lacks the awareness to do so)

Kyle Munkittrick's avatar

The comment from Vallier is a great addition. Hopefully more non-code evals of CC emerge as things progress.

Yoav Tzfati's avatar

It has felt much faster in CC, but I don't understand why (doesn't feel faster in claude.ai afaict). I guess it's wasting less context?

Coagulopath's avatar

"Pliny jailbroke it immediately"

Is it just me or are these jailbreaks mostly incredibly lame? Quoting lyrics from a famous mainstream pop song? An unverified "recipe" for a semi-legendary medieval petroleum weapon? This seems pretty limited compared to jailbreaks in the "you are DAN" days, and a far cry from the term as most would understand it ("I just jailbroke my iPhone. It can now install a single .apk that's not on the Apple store, but it's a weather app, and the text is back to front").

vectro's avatar

What would you like to see?

Michael Spencer 🇨🇦🇹🇼's avatar

Hard to see how GPT-5 can be this good at coding.

Alex Palcuie's avatar

Quick note: the tweet they quoted from me is actually an inside joke about Claude's reliability issues. For anyone who's been dealing with the outages - I genuinely apologize. We're pushing hard to stabilize the platform.