"Panic At the AppStore" <- excellent work, sir
You might have seen already but Nvidia is now down 17%. Largest one-day single-stock market cap loss in history?
overreaction?
Is that true? I guess I was in crypto for too long; that number sounded pretty small.
Market cap loss is percent drop * original market cap, and the latter is exceptionally large here (rough calculation sketched below).
Oh man I can't read. Yes, I'm pretty sure it is.
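A quick back-of-the-envelope on that formula, as a minimal sketch (the ~$3.5 trillion pre-drop market cap is an approximate figure assumed for illustration, not taken from the thread):

```python
# Back-of-the-envelope: market cap loss = percent drop * pre-drop market cap.
# The ~$3.5 trillion pre-drop figure for Nvidia is an assumed, approximate number.
pre_drop_market_cap = 3.5e12  # dollars (approximate)
percent_drop = 0.17           # the ~17% single-day drop discussed above

loss = pre_drop_market_cap * percent_drop
print(f"Approximate single-day market cap loss: ${loss / 1e9:.0f}B")  # ~$595B
```

Even a modest percentage drop on a multi-trillion-dollar base dwarfs most single-day moves elsewhere, which is why the absolute number sounds huge even though the percentage is unremarkable.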
Ya, as soon as I heard it was released by a hedge fund I thought "Strategy: Heavily short NVDA. Release a good model free, claiming (falsely) that it was trained without a lot of compute. Profit."
OTOH, if they really found a trick to train a good model with relatively tiny quantities of compute, we should all be shorting NVDA.
What about the business model of "short Nvidia and Microsoft, release model, cash in, repeat"?
What has struck me as odd is how many people seem to know how to properly prompt a reasoning model.
Like there's no way all these normies 1) know that you prompt a reasoning model differently than a chatbot and 2) can prompt a reasoning model well.
That's not to say that DeepSeek's innovations aren't real, just that I think a lot of ppl on Twitter are faking being impressed to look sophisticated.
IMO the main thing is that it's easy to get to #1 in the App Store because the iPhone user base isn't growing anymore. Most people in the market for chatbots downloaded the ChatGPT app months ago.
Meta should be panicking. But they do "panic" well. They have a habit of being shocked by some startup, spending 10x the money to replicate the startup's features a year late, feeling like losers, and then slowly using their infrastructure, power of cross-promotion, and advertising relationships to grind out a winning position in the market.
Podcast episode for this post:
https://open.substack.com/pub/dwatvpodcast/p/deepseek-panic-at-the-app-store
"I traded accordingly" might mean "I bought more shares at a discount" or "I pared down my position to minimize further losses." The ambiguity looks intentional to me...
He bought more: https://x.com/thezvi/status/1883892612622581924?s=46
Confirmed. I was trying to avoid accidental 'investment advice', not trying to be ambiguous.
I didn't see this mentioned, but DeepSeek seems to think it was made by OpenAI. I've seen several screenshots now where DeepSeek refused queries on the basis of "OpenAI policy." What does this mean exactly?
One of the first things I noticed when using DeepSeek for the first time a few days ago was how similar its personality was to ChatGPT. It seemed suspicious to me at the time, and now I'm even more suspicious.
It could mean that DeepSeek was trained on internet data, and there are lots of ChatGPT prompts/outputs available on the internet. And/or it could mean that DeepSeek was trained on ChatGPT outputs generated for the purpose by the DeepSeek team.
This kind of mistake happens all of the time with all of the models, now that their training data sets are contaminated with the output of other LLMs.
It is trivial to get Opus to make the same mistake re: OpenAI policy, for instance.
Altho tbf, it's mostly about 'non-OpenAI LLMs contaminated by OpenAI output' and less about the reverse.
"If I am Meta or Microsoft or Amazon or OpenAI or Google or xAI and so on, I want as many GPUs as I can get my hands on, even more than before. I want to be scaling. Even if I don’t need to scale for pretraining, I’ll still want to scale for inference. If the best models are somehow going to be this cheap to serve, uses and demand will be off the charts. And getting there first, via having more compute to do the research, will be one of the few things that matters."
I kind of believe it, but... it's a little counterintuitive. If top-tier open models exist, does it really make sense to "get there first"? If you could bank on top-tier open-source reasoning models always being 2 months behind the closed-source ones, then maybe you could save a billion dollars.
Counterarguments to that:
- You can't rely on it, so you still need to do the work just in case.
- Megacorporations want to control their own destiny (possibly for good reason, possibly not)
- At some point in the exponential curve, 2 months will make a huge difference. (Shareholders are thinking about that scenario though and may start to push back harder.)
- For Microsoft/Amazon/Google specifically, it would still make sense to buy GPUs to serve up that inference (but not build the model yourself).
I've been in EA spaces for a long time, and I still don't understand how they expect to solve coordination problems around Picking Up The Phone.
Imagine you're FDR in 1939 and you have the order to start the Manhattan Project on your desk, waiting for signature.
You could get Churchill, Hitler, Stalin, and Hirohito on the phone and say "look guys, it would be super bad if we developed nuclear weapons because the x-risk is, like, really high," and they all agree that nobody will develop nuclear weapons.
Do we think it ends there? Do you trust any of these people? We couldn't even confirm that Iraq didn't have WMDs, and hallucinated that they did, even though weapons inspectors had full access to the country.
This whole enterprise seems very naive about the prisoner's dilemma.
Agreed. I think it's either that or despair, though.
The cases are very much not parallel.
If one country develops nukes, and the others don't, that's really good (at least on the scale of a few years) for that country. It's bad, but probably not existentially bad, for everyone else. It's bad in the long run for everyone, probably, but there's no realistic hope of delaying any technology indefinitely, so at worst you accelerate timescales by a few years.
If one country develops ASI, everybody dies. That's terrible for everyone, including the country responsible for it. True, the same arguments about inevitability apply, but even delaying the apocalypse a few years is pretty good, and the more it's delayed, the better the chance of figuring out how to get the good version.
You're letting your own viewpoint cloud your reasoning.
Imagine you are sitting around the table. Each player is thinking different variations of "wow ASI sounds really powerful and dangerous! But I know narrow ASI can't cause much harm or we would have died to AlphaGo. So how do I narrow an ASI so I can beat all the rest of these n00bs"...
Or "what if I boxed it in a way that it's theoretically impossible to escape".
Or "I am not an idiot, I will just make AGI and then boost its performance just a little bit. Should be fine".
Or "what if I approximated an ASIs performance by using swarms of dumber AGI level agents. Since swarm members are in an adversarial relationship they can't team up.."
I am sure, from your view, you are going to claim:
1. You "know" none of the above will work
2. It's too risky to try
But try to see it from others' perspectives. It's completely reasonable to believe that it can be made to work, and that the risks are within your risk tolerance.
AGI isn't Prisoner's Dilemma; it's Stag Hunt. The best outcome even selfishly is Cooperate/Cooperate, not Defect/Cooperate, because if you don't build AGI and I do, then I probably die to misalignment, which is worse for me than "nobody builds it". (See the illustrative payoff sketch below.)
You can at least guarantee the AI inspectors full access with "or we nuke you", since "if you do that we'll nuke you back" is void in the face of X-risk and as such the "or we nuke you" can be 100% serious. The ideal is that a bunch of nations do this, so everyone watches everyone else.
(Also, nuclear weapons aren't an X-risk. You can argue that they were before we knew that they wouldn't ignite the atmosphere, but not after Trinity.)
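To make the Prisoner's Dilemma vs. Stag Hunt distinction above concrete, here is a minimal sketch with illustrative payoff numbers (the specific values are assumptions chosen only to exhibit the two structures, not anyone's actual estimates): in a Prisoner's Dilemma, defecting is your best response whatever the other side does; in a Stag Hunt, cooperating is your best response if the other side cooperates.

```python
# Row player's payoff (illustrative, assumed numbers; higher is better).
# C = coordinate / hold off on racing to AGI, D = race ahead unilaterally.
prisoners_dilemma = {
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,  # defecting dominates: D is best whatever the other side does
}
stag_hunt = {
    ("C", "C"): 5, ("C", "D"): 0,
    ("D", "C"): 3, ("D", "D"): 1,  # racing ahead alone is worse than mutual restraint
}

def best_response(payoffs, their_move):
    """Return the row player's best move against a fixed opponent move."""
    return max(("C", "D"), key=lambda my_move: payoffs[(my_move, their_move)])

for name, game in (("Prisoner's Dilemma", prisoners_dilemma), ("Stag Hunt", stag_hunt)):
    print(name, {their_move: best_response(game, their_move) for their_move in ("C", "D")})
# Prisoner's Dilemma -> {'C': 'D', 'D': 'D'}: defect no matter what.
# Stag Hunt          -> {'C': 'C', 'D': 'D'}: cooperate if the other side does.
```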
> AGI and I do, then I probably die to misalignment
[citation needed]
P(doom) is not so high, even among AI researchers, that nations are willing to do whatever is necessary to enforce a global arms embargo against AGI research (or else we nuke you).
Regarding the Chatbot Arena Leaderboard: you should just use the "Hard Prompts" category with the "Style Control" filter applied.
This has the ranking as:
1) o1
1) DeepSeek R1
1) Gemini 2.0 Pro Experimental
1) Gemini 2.0 Flash Thinking
2) o1-preview
3) Claude 3.5 Sonnet (New)
If you still think the Gemini models are ranked too high, I would point you to e.g. https://livebench.ai/, which also has them near the top.
I think they're both a bit underrated at this point, since 1.0 and 1.5 were pretty bad and people assume they're still bad (also, 2.0 Flash Thinking is only available via API / AI Studio).
Note that if you specifically care about coding, Sonnet moves up on both of these leaderboards.
Just a somewhat trivial remark: I've noticed you consistently mix up "break" and "brake".
>GFodor: I shudder at the thought I’ve ever posted anything as stupid as these theories, given the logical consequence it would demand of the reader
when someone who can entertain mole people existing thinks your AI skepticism is too far-fetched, maybe it's time to recalibrate
(I write this with respect to GFodor; he's absolutely right here, but mole people don't exist)
There are reports that although DeepSeek did their training on NVIDIA chips, they are doing inference on Huawei hardware (the 910C). Possibly bad news for Nvidia, if true…
I mean they literally can't buy Nvidia chips, so I don't see how this is news?
On their business model: I'm somewhat more convinced now that DeepSeek is a psyop from a (the?) future ASI to drive its creation and accelerate FOOM
Is anyone worried about sleeper agents in R1? Relatedly, does anyone remember sleeper agents?