I think it's a big improvement; I'll be writing about it tomorrow.
Anecdotally, I’ve noticed the model is highly sensitive to anything related to RNA, but it lacks the context awareness to block queries and pipelines involved in making RNA drugs. Some of the key terms include plasmid, RNA target insert, and nucleotide therapeutic.
(Yet where these terms would actually matter for chem-bio risk uplift, it lacks the awareness to block them.)
Comment from Vallier is a great addition. Hopefully more non-code evals of CC emerge as things progress.
It has felt much faster in CC, but I don't understand why (doesn't feel faster in claude.ai afaict). I guess it's wasting less context?
"Pliny jailbroke it immediately"
Is it just me or are these jailbreaks mostly incredibly lame? Quoting lyrics from a famous mainstream pop song? An unverified "recipe" for a semi-legendary medieval petroleum weapon? This seems pretty limited compared to jailbreaks in the "you are DAN" days, and a far cry from the term as most would understand it ("I just jailbroke my iPhone. It can now install a single .apk that's not on the Apple store, but it's a weather app, and the text is back to front").
What would you like to see?
Hard to see how GPT-5 can be this good at coding.
Quick note: the tweet they quoted from me is actually an inside joke about Claude's reliability issues. For anyone who's been dealing with the outages - I genuinely apologize. We're pushing hard to stabilize the platform.