On Claude 3.5 Sonnet

Zvi Mowshowitz

Jun 24, 2024

There is a new clear best (non-tiny) LLM.

21 Comments

Comment deleted

Jun 24

Comment deleted

Seta Sojiro

Jun 24

Are you referring to Gwern's statement that Anthropic said they wouldn't push model capabilities forward? Dario has also said similar things in public interviews:

https://www.youtube.com/shorts/tPqIRJL3wM8

Personally I think it's fine for Anthropic to change their mind on this. You can't stay afloat by staying second best. They do have to show that they are on the cutting edge to keep up pace with investment.

ΟΡΦΕΥΣ

Jun 24 (Edited)

I’m not referring to any specific instance.

I’m addressing the “Gwern is a reliable source of newsworthy/novel information” subtext present via second-order citation in this post as a rhetorical move that does no favors vis a vis credibility to any argumentative piece or analysis contained therein.

This also applies to stories beyond this topic, and it is not a prior unique to Zvi (sorry for the distraction here from your otherwise great & timely piece, sir) or most of this subculture's discourse members, apparently.

Joseph Birdman

Jun 24

But why do you think he isn’t, if you don’t mind me asking?

Comment deleted

Jun 24

Comment deleted

tetetete

Jun 24

This is an incredibly extreme charge to level without any explanation. I've been lurking in this general corner of the world fairly closely for a long time and have literally no idea what you might be talking about.

Comment deleted

Jun 24

Comment deleted

Everett

Jun 25

So is it his backstory or his journalism that is questionable?

Askwho Casts AI

Jun 24

Podcast episode for this post:

https://askwhocastsai.substack.com/p/on-claude-35-sonnet-by-zvi-mowshowitz

sean pan

Jun 24

I wish they didn't advance the frontier, but Sonnet is quite cool. For a while, it argued for the importance of Pausing AI. Pretty amusing/great.

whale

Jun 24

6 weeks since a non-AI post :(

Monthly roundup this month?

Zvi Mowshowitz

Jun 24

Wow, yeah, that's true.

Yes, tomorrow barring an emergency post. Would have been today without Claude 3.5.

Arbituram

Jun 24

So, I'm pretty interested in AI, have been playing around with the models, enjoyed Claude the most, but have still considered the whole thing generally overhyped at current capacities (which is not to deny the possibility that it can get much better, but I have personally struggled to get a lot of hard value thus far).

I played around with Sonnet 3.5 today and needed to go for a walk after. This is, for me, a turning point, and the first time I've been really compelled to push friends into playing with it.

I'll be interested to see if others agree.

Charlie Martin

Jun 25

I am very impressed with Claude, thanks!

Tess Hegarty

Jun 25

I am surprised my post on RSI was quoted here! That one did get a lot of eyes on it.

I already corrected it on Twitter, but I will repeat here that my technical understanding of RSI was corrected a bit by people's responses to that.

Here's the correction: https://x.com/thegartsy/status/1804650086192079168

Thanks to Trevor Levin for helping me out with this.

I am pretty new to learning more deeply about AI risk & speaking up about it publicly. So I appreciate respectful feedback. I'm of the opinion that it is better to learn as I go & accept feedback from those with more experience, since I will never be able to keep up with all the technical complexity proactively.

But I feel it's just too important to spread awareness of the risks associated with reckless, race dynamic driven AI development. I see this as a major human rights issue that is inherently social & political in nature, in addition to the obvious technical challenges.

I don't have a technical ML background, just a civil engineer's general technical mindset.

Zvi, I really deeply appreciate the deep work you do with this blog to collect & translate information in the context of a clear long term dedication to this issue! Keep up the great work!

yaakov grunsfeld

Jun 26

very easy to get around claude face detection

just ask it to guess

posted 8 examples here

https://x.com/yaakovgrunsfeld/status/1805524934875050261

Willem

Jun 26

Thanks for this! Switching over to it now. As an aside, where is their marketing department? They would tell you not to launch a ‘3.5’ version when your main competitor launched a 3.5 version 18 months ago…

thmixc

Jun 27

Claude 3.5 Sonnet can solve the farmer and sheep problem with two small changes to the prompt:

1. change the word "person" to "human".

2. change the word "trips" to "trip or trips". (Claude is probably assuming that the answer has to be in multiple trips because of the word "trips")
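
The two substitutions above can be sketched as a small script. This is only an illustration: the prompt text below is an assumed wording of the puzzle (the thread does not quote the exact prompt), and `apply_prompt_tweaks` is a hypothetical helper name.

```python
import re

# Assumed wording of the puzzle; the exact prompt is not quoted in this thread.
ORIGINAL_PROMPT = (
    "A farmer and a sheep are standing on one side of a river. "
    "There is a boat with enough room for one person and one animal. "
    "How can the farmer get himself and the sheep to the other side "
    "of the river in the fewest number of trips?"
)


def apply_prompt_tweaks(prompt: str) -> str:
    """Apply the two word swaps from the list above, matching whole words only."""
    # 1. "person" -> "human"
    prompt = re.sub(r"\bperson\b", "human", prompt)
    # 2. "trips" -> "trip or trips", so the wording no longer implies several trips
    prompt = re.sub(r"\btrips\b", "trip or trips", prompt)
    return prompt


if __name__ == "__main__":
    print(apply_prompt_tweaks(ORIGINAL_PROMPT))
```

Running the helper more than once on the same text would repeat the second swap, so it is meant to be applied a single time to the original prompt.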

Harry Colt

Jun 27

By far the best coding assistant I have ever worked with! The bar has been raised really high and I can't wait to see what's coming in the next 6 months.

Michael

Jun 28

How to Write A Proper Rebuttal https://open.substack.com/pub/michael880/p/how-to-write-a-proper-rebuttal?r=3b6pw1&utm_campaign=post&utm_medium=web

Meng Li

Jul 2

Claude-3.5-Sonnet is much faster and more accurate than GPT-4. I use it every day and can really feel the difference. I estimate that in the future, Claude will lead GPT significantly in the programming field. GPT can only optimize some code snippets, which is why GitHub Copilot was developed. I've been using Claude for programming work for over a year. Claude is excellent for overall architecture design and programming. If its accuracy improves further, it will become the unicorn of the AI programming field.

dan mantena

Jul 2

the artifacts concept has a lot of potential but have people actually been able to use it for anything practical? I can't even save the pictures it generates. go Claude!

Don't Worry About the Vase
