16 Comments

I tend to respect EY for being directionally correct, and for not pretending to know the unknowns (both known & unknown) to the extent that he'll ass-pull a date and stand by it... in short I think we're truly fucked; whether it's "sentient and aware of its own awareness" or just faking it doesn't really matter. :-(


You can’t factor out alpha, because the floor operation doesn’t commute with multiplication: floor(2.5 * 2.5) = 6, but floor(2.5) * floor(2.5) = 4.

All the models that get the right answer on this IMO problem are just guessing. It’s a reasonable guess, because if you assume alpha must be an integer, the subsequent proof is not too hard. The hard part is proving it’s impossible for alpha to be non-integral.

In practice I think these answers would probably get 0/7 points. Guessing the correct answer really isn’t solving the problem; on the IMO you don’t get points for that.
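A two-line check (my own, in Python) makes the non-commutativity concrete:

```python
import math

# Floor does not commute with multiplication, which is why alpha
# cannot simply be factored out of a floor expression.
lhs = math.floor(2.5 * 2.5)              # floor(6.25) = 6
rhs = math.floor(2.5) * math.floor(2.5)  # 2 * 2 = 4
print(lhs, rhs)  # 6 4
```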

author

Ah, quite right, I flat out didn't see that this was a floor symbol, so I've edited. Yeah, proving even integers work and odd ones don't doesn't get you much credit in these sports nor should it. I thought it was suspiciously trivial!


More on Rogan - I am more confused why Tucker Carlson, who is justifiably concerned with AI risk, appears to align with a Republican position that promotes more risk via open-source models. The strangest one here is Elon, who is completely well informed but still supports OSS.


Finally got around to signing up for Claude yesterday, and then had my first use case for an LLM live on the job in the same shift. Surgery phoned in a toxicology question, and I spent an awkward minute fumbling around my go-to professional reference (first time fielding this kind of question in the new job, and I wasn't sure where to look directly). I typed it into Claude, got a reasonable-sounding answer that I relayed with "I'm fairly sure this is it, but I'll keep poking around and let you know if I find otherwise," and a few short minutes later confirmed that Claude had it right.

Only thought to do this because I'm already aware LLMs are handy for filtering down an excess of information and so now I'm trying to think of other broad, pharmacy-applicable potential uses. Maybe need to fiddle around with dumping spreadsheets and pdfs into it when I have more time.


Is Claude a good model to use for medical problems and diagnosis?


I don't understand why one would pay someone else to run <10B models, when these have run just fine on a low-end laptop with Q8 quantization and essentially perfect quality ever since Llama 1 was released. No need for a fancy GPU. ~30B models run slower and need 64GB of RAM for reasonable performance at high quality. Llama 3.1 is not quite supported by llama.cpp yet, but soon: https://github.com/ggerganov/llama.cpp/pull/8676 and then the 128k context window awaits us all.
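As a rough illustration of why this works (my own back-of-envelope sketch, not a benchmark; the 20% overhead factor for KV cache and buffers is an assumption), weight memory scales with parameter count times bits per parameter, which is what puts Q8-quantized <10B models within laptop range:

```python
def approx_ram_gb(n_params_billion: float, bits_per_param: float = 8.0,
                  overhead: float = 1.2) -> float:
    """Back-of-envelope RAM estimate in GB for running an LLM locally.

    Assumes weights dominate memory; Q8 is ~8 bits per parameter,
    with ~20% overhead for KV cache and runtime buffers (assumed).
    """
    weight_bytes = n_params_billion * 1e9 * (bits_per_param / 8)
    return weight_bytes * overhead / 1e9

print(approx_ram_gb(8))   # ~9.6 GB: fits in a laptop's RAM
print(approx_ram_gb(30))  # ~36 GB: why ~30B models want a 64GB machine
```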


Regarding IMO problems being too hard, funny that this just dropped today: https://deepmind.google/discover/blog/ai-solves-imo-problems-at-silver-medal-level/

That is way faster progress than I expected. The kind of IMO problems you need to solve to get a silver medal require real creativity and novel problem solving. And it's not just geometry, it's all kinds of IMO problems. Of course I'd like to see it tested on a previously unreleased IMO shortlist (the 2024 one should be out soon), but if this holds up, that's very impressive.


Below "This is not complicated. Voters do not like AI. They do not like innovation in AI. Republicans like it even less than Democrats. They do not want us to fund AI." the image of the poll titled "Voters Overwhelmingly Believe [...]" appears twice I think.


Regarding RBRs notice that they say "We have used RBRs as part of our safety stack since our GPT-4 launch". My uncharitable suspicion is that this was published now to say "see, we're doing safety things!". Obviously speculation on my part.

I see that you have an adversarial relationship with the Gemini naming scheme. So: Gemini Advanced is a paid tier of Gemini. The largest model is Gemini Ultra. And of course, the free Gemini plan (called Gemini) now uses Gemini Flash. When you want to switch models, you switch between Gemini and Gemini Advanced, meaning Gemini 1.5 Flash and Gemini 1.5 Pro (although the UI won't tell you that) - it is all very clear.

I guess they wanted to make it "normie friendly" - but I don't know why, considering most people are familiar with ChatGPT where you just select models directly.

Also some typos:

"saying tos top exporting"

"threatening to get touch"

"adherence to the that"

"It can also can blind you"


I actually saw that Palo Alto ad on network TV a few weeks ago (I don't remember which of the big networks it was). I had to stop and stare at how bad the ad was.


> Roon: No it’s single digit years. 90% less than 5, 60% less than 3.

This is also approximately my prediction. Funny that Zvi thinks that's overconfident. To me, it's Zvi's view - that a 10% chance of it taking more than 5 years is too low - that seems overconfident.


"Compared to how much carbon a human coder would have used? Huge improvement."

I don't think you really mean this the way it comes off, because it comes off as 'humans emit CO2 measurably, therefore they are equivalent to all other CO2 sources.' Which is weird if you do mean it that way, since a human consumes energy whether they are coding or not. Presumably it is okay for a human to exist and not be coding? I have to think you believe even coding is okay (I wouldn't know, I've never coded, but good on you if you have)?

Much better, morally, to compare the carbon/energy/electricity usage to that of other equivalent electricity consumers.

I'd hate to give the ASI another reason to be rid of us all.


It would be really incredibly awesome if the ToC links linked correctly. Is this just an issue on mobile (hopefully it's not just MY mobile)?
