Re o1 pro and fiction: I have experimented with this. Some good output, some slop. https://open.substack.com/pub/aiscifi
Which of these stories do you think are the strongest? I read "Echoes of Genesis" and "In the Wake of Old Waters".
To be honest, I find LLM fiction extremely poor, and o1 pro no better than regular GPT-4o (which seems to be what OpenAI's internal benchmarks found, too). If that's all you're using o1 for, save your money.
Most AI-generated stories fundamentally don't *feel* like stories; they're a sequence of events, described one after the next, with the emotional engagement of a police report. "This happens. Then this happens. Then something else happens." Characters are drawn in the most generic and banal cliches. The "theme" is invariably some heavy-handed moral lesson, loudly preached at the reader ("Nahiro people would carry forward this lesson: that knowledge must coexist with compassion, and that the echo of old waters could guide them toward a future both bold and humane.")
Maybe I'm being harsher because I know they're AI-generated, but to be honest, I don't think so. There's a storytelling spark that AI fiction just doesn't appear to have. I used to read fanfic written by literal children (with spelling mistakes everywhere) that was nevertheless gripping and absorbing and made me care. This just...doesn't do those things. It's like trudging through a dry desert made of words.
You'd expect o1's "overthinking" to offer *some* advantages over just blurting out text, like a better structure. But it's still messy and contradictory on a technical plot level. (At the start of "In the Wake" Rhea is "only about five cycles pregnant", but later she's described as being in her third cycle. Dr Virgil's methods are described as non-invasive yet everyone in the story acts like she's trying to kill the baby.)
Thanks for the comment. Good questions. I'm not sure which story is strongest! I like Bartered Reflections. As for the stories being very plot-driven: in some sense I think this is inherent to the format. Short stories are necessarily plot-driven, relative to novels, which have more room for narrative exposition. That said, maybe it's possible to generate less plot-driven output from o1 pro? I haven't tried, so I'm not sure whether being plot-driven is a limitation inherent to the model. As far as I'm aware, o1 pro can't generate novel-length fiction, nor can any other model, at least not yet. As the saying goes, today's AI is the worst you'll ever use: perhaps future models (o3?) will generate novel-length fiction.
Just never ask what happened to Lighthaven 4.
Typo: "Correctly realize that no, there is no Encarto 2." - Encanto
Podcast episode for this post:
https://open.substack.com/pub/dwatvpodcast/p/ai-97-4?r=67y1h&utm_campaign=post&utm_medium=web&showWelcomeOnShare=true
Loved the intro reference. Thank you.
Re the death of the ex-OpenAI employee, I'm not clear on what one does if one thinks someone was assassinated. Try to get the police to spend resources on double-checking the case? How did they come to different conclusions?
Escalate to the next layer: the state attorney general, or the FBI.
In this case, his mother hired a PI to investigate independently. That information could then be either publicly released or shared with certain news agencies. Although this is mostly an indirect way to escalate to the next layer, i.e. to get the attorney general or FBI to do an investigation.
"LLMs can potentially fix algorithmic feeds on the user end, build this please thanks.
Otherwise I’ll have to, and that might take a whole week to MVP. Maybe two." Yes, but there is augmentation to it called Communication Currency that would make it work better, a tool for your personal AI to use with other Personal AIs https://www.dropbox.com/scl/fi/mvranm2d7ece6yp08rmth/Contact-Credits.pdf?rlkey=7twdum8cecqz5m4r1l70bk78k&dl=0
Oh, and Zvi, I just re-sent you what I hope is your favorite email ever!
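Back on the quoted feed idea: here is a minimal sketch of what user-end filtering with an LLM might look like. The scoring function below is just a placeholder keyword matcher standing in for an actual model call; none of the names here are a real API.

```python
# Minimal sketch: re-rank a feed on the user end according to stated preferences.
# score_item is a placeholder keyword matcher; a real version would prompt an LLM
# with the preferences and the item text and ask for a 0-10 score.

from dataclasses import dataclass

@dataclass
class FeedItem:
    title: str
    text: str

def score_item(item: FeedItem, preferences: str) -> float:
    """Placeholder for an LLM call: crude keyword overlap between the item and
    the user's stated preferences, scaled to 0-10."""
    wanted = set(preferences.lower().split())
    words = set((item.title + " " + item.text).lower().split())
    return 10 * len(wanted & words) / max(len(wanted), 1)

def filter_feed(feed, preferences, threshold=3.0):
    """Keep only items scoring above the threshold, best first."""
    scored = sorted(feed, key=lambda item: score_item(item, preferences), reverse=True)
    return [item for item in scored if score_item(item, preferences) >= threshold]

feed = [
    FeedItem("Rage bait of the day", "outrage outrage outrage"),
    FeedItem("New compiler optimization survey", "compilers performance research"),
]
print([item.title for item in filter_feed(feed, "research compilers performance")])
```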
> Janus and Eliezer Yudkowsky remind us that in science fiction stories, things that express themselves like Claude currently does are treated as being of moral concern.
Not in Blindsight they aren't.
Regardless, science fiction writers are simply among everyone else who couldn't properly imagine a system that could output human-appearing text via non-human internal processes.
Or, no matter how many times Janet tells you she doesn't have feelings, people will still trick themselves with the anthropomorphized exterior: https://www.youtube.com/watch?v=etJ6RmMPGko.
In a practical sense, though, what are we doing with current LLMs that qualifies as immoral? In what ways should we treat them differently to treat them as being of moral concern?
Eh, Blindsight misunderstood some of the science involved. https://hopefullyintersting.blogspot.com/2019/11/the-limitations-of-blindsight.html
However, it's pretty clear that Claude isn't conscious in this sense. It has nothing like a working memory, just a context window, which seems more analogous to an animal's sense impressions. For o3, who knows.
Interesting, thanks, but I think it's flawed. This conclusion specifically from the article seems reductive:
"It seems that everything that goes into your memory gets there by going through conscious experience. And that's the reason you can't have a creature without consciousness and expect it to interact productively with the world."
It seems to hold true for the biological intelligences we know about, but that doesn't mean it must hold true for different types of intelligence architectures. Consciousness may be required to produce some result X in some architectures, but there may be many ways to achieve X without it in different setups.
This is making the same mistake as taking the output of an LLM, noticing that it sounds similar to outputs produced by humans, and inferring that there must be an internal process to the LLM that's meaningfully similar to the human process. This sometimes results in people going a step further and asserting that we should treat LLMs with some unspecified form of moral concern.
> It’s so strange to me that the obvious solution (charge a small amount of money for applications, return it with a bonus if you get past the early filters) has not yet been tried.
This seems like the kind of thing that might work if normalized, and with the money held in escrow, but it's hard to get from the current world over to that one. Today an employer that asks for money to process an application is going to be perceived as a scammer by most applicants, and probably also featured in the NYT as "taking advantage" of poor workers who can't afford to apply for a job. Many companies are reluctant even to require coding tests, because the best experienced candidates don't want to waste their time with that stuff.
Instead I predict we'll see less legible, more unfair behaviors like greater reliance on referrals, social networks, and recruiters reaching out directly to candidates.
If a company wants to try something like this, I would suggest having an option to apply in person or by mail instead of making a deposit (or some other way which is inconvenient but free), which at least offers the appearance of equitability. This is actually what the post office does with change-of-address verification.
I had other thoughts about that proposal:
It would be easier for a hiring manager to trial this in a year when the company doesn't need many new hires.
Using money from filter-failers to pay the bonus to filter-passers could be bad PR even if it doesn't seem scam-like. To mitigate this impression, I would donate the filter-failers' money to charity (and give them a receipt for 'their' donation), then pay the filter-passers their bonuses separately.
With all due respect to the bettors:
> Can AI do 8 of these 10 by the end of 2027?
9 is the only one that matters:
> With little or no human involvement, come up with paradigm-shifting, Nobel-caliber scientific discoveries.
The rest is wordcel stuff that LLMs are already well suited for and that just needs more scaling/training.
So, 8 out of 10 without 9 in the mix is meaningless.
On AI and capital/resources, I genuinely just don't understand the position Zvi appears to be endorsing here. Perhaps there is a standard argument for this somewhere that I just haven't seen?
The position seems to be that super-intelligence somehow magically leads to super-abundance; that if only AI is "smart" enough then physical constraints like scarce resources stop binding?
I can imagine how you hand-wave an argument about how this might happen *eventually* -- that is, ASI will figure out a way to mine asteroids (or other planets) for rare elements, build Dyson spheres for energy, etc. But even if that's the long-run plan, there are going to be resources that are scarce on earth in the meantime. And the resources you would need to set up an asteroid mining operation or build a Dyson sphere are very much the same ones you need to build things to provide for humanity. How, then, within the medium term (which appears to be the target given the reference to people who are alive right now saving money) do you get to a point of functionally unlimited abundance?
And, of course, that sets aside the fact that human desires seem to scale pretty directly in proportion to our productive capacity. An upper middle class American today has super-abundant resources in comparison to the vast majority of humans ever to live, but they don't feel that way or live that way.
The way many people are talking about the economic consequences of AI in the early stages of the coming information revolution really reminds me of the early Marxists -- this vague sensibility that technological change will *somehow* produce a big and positive political economic shift without any real attention to how that will happen or what the intervening steps might look like. In fact, it's even the *same* prediction (the rough orthodox Marxist position is that someday we'd end up in a post-scarcity society as the result of technology and then we'd live in a utopia where everyone's needs are met). Then the Bolsheviks and the Maoists showed up and proved that it matters a great deal *how* those changes happen (and not in a good way).
This strikes me as staggeringly ill-informed. Perhaps I'm not imaginative enough, but I do think that your prediction has to find some way to draw a line between the present and the future. And the present is that AGI is being built under a capitalist system by companies that are aiming to make money building it (and the only exception is -- as noted above -- working hard to adopt that model). And what "alignment" really means in any kind of practical sense is that the AI gets the values given to it by its creators -- which means that any AGI built on the current pathway is going to have "capitalist" values in its DNA and that will guide all that comes later.
Caveat: I don't know if this is Zvi's view at all. This is mostly my view.
I tend to believe that despite quite a few eyeballs on various problems, when a smart-enough AI is able to read all research, it will be able to:
* Identify low-hanging fruit that we haven't pieced together.
* Create additional cohesive theories in many domains of knowledge.
* More rapidly solve existing technical challenges.
Together, these will bring about rapid technological change and reduce costs across a broad range of activities.
For example, we have lots of medical research and knowledge, but I'd guess there is a much deeper understanding an AI could reach, one that would lead to cures for many diseases, or simply to better general health. This in itself could free up large amounts of capital for other purposes. What would I do if I could get away with spending 1/10 the dollars on medical insurance? (And that would be in addition to just spending less on Medicare, and thus taxes.)
A belated reply here, but I think the crux is that the vast majority of your consumption as an individual in a developed nation, measured in the economic sense, is not the raw materials but the processing. A way better computer won't need more metals; a perfect piece of media will still only take a small amount of electricity to consume; medicine is made of abundant elements arranged in biologically interesting ways; etc.
Even if we assume a restriction to current physical resource availability, no space mining or similar, if you turn the labour costs of every step down to basically free, then almost every good will become orders of magnitude cheaper (consider also that labour is a substantial fraction of mining costs, too). The only real exception is land, and even then it's "land in the right locations", because unused land in the middle of nowhere has always been pretty cheap.
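To put a toy number on the "orders of magnitude" claim: if the non-labour cost at each stage of a supply chain is mostly the purchased output of the previous stage, then driving labour cost to near zero at every stage compounds multiplicatively. The stage count and labour shares below are illustrative assumptions, not data.

```python
# Toy model: at each production stage, cost = purchased inputs (previous stage's
# output) + labour, so labour_share_k = labour_k / cost_k. If labour goes to ~zero
# at every stage, the final price shrinks by the product of (1 - share) terms.
# Labour shares below are made-up illustrative numbers, not measured data.

def remaining_cost_fraction(labour_shares):
    fraction = 1.0
    for share in labour_shares:
        fraction *= (1.0 - share)
    return fraction

# e.g. mining -> refining -> components -> assembly -> logistics -> retail
shares = [0.4, 0.5, 0.5, 0.6, 0.5, 0.5]
print(f"remaining cost: {remaining_cost_fraction(shares):.1%} of today's price")
# prints "remaining cost: 1.5% of today's price", i.e. roughly two orders of magnitude
```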
> Right now, yes, humans are addicted to TikTok and related offerings, but they are fully aware of this, and could take a step back and decide not to be.
This seems to undersell the existing concerns around algorithmic content and media addiction.
I think a better model is: most people are not fully self-aware, and most people don’t have the willpower to self-modify addictive habits. You just need to look at the obesity or opiate epidemics for clear evidence here.
It seems more likely to me that content addiction will become an increasingly important issue.
Agreed. If they could decide not to be, why haven't they done it already? Nobody ever thinks "I'll spend the next half hour scrolling slop instead of doing something meaningful", they open the app out of habit/compulsion and it inevitably spirals from there. The advances (hate to use the word) in algorithms and app design have made the compulsion worse, and I don't see a compelling reason why AI-ifying them will lead to anything other than the default conclusion of more slop consumption.
Why is economic growth not changed by current AI models?
Because integrating AI models into current business processes is way harder than making new ones. Businesses are slow, and you can't really just take an AI model and make it do useful work; a lot of human tinkering is still required, and there aren't many people qualified to do that at most companies. Meanwhile, AI labs already have everything they need to train stronger and stronger models.
Also, current AI models don't really offer a big qualitative change like what the printing press or the Internet did. They don't let you do anything you weren't able to do before; they just let you do it more cheaply by taking humans out of the equation. And in the current economic landscape, all that does is increase unemployment and the company's profits, not its useful economic output.
That's not the reason.
I find it weird that 7 and 8 were included in the bet at all. I don't think they're necessary or sufficient for AGI or even ASI. Literary work is not something frontier labs seem to be working on or particularly care about, aside from "write a poem about..." prompts, where it's more like another kind of language task you can benchmark. The economic incentives are to make serviceable prose that complies with the content policy. That is not going to be a recipe for top quality prose writing, pretty much by definition. It seems perfectly consistent that you could have a system which is superhuman in science, engineering, medicine, law, business and many other fields which writes serviceable prose that complies with the content policy. Maybe brilliant writing will come about as some emergent property, but it seems orthogonal to the important questions.
Separately, I think 10 (automatically formalizing math from human inputs) is both almost certainly feasible in the next three years and a huge deal for automated takeoff of capabilities. It's probably 90+% of the way to 9 (Nobel-level work), assuming Nobel-level includes the Turing Award or Abel Prize. If you can get it to the point where it can fill in and correct gaps in human proofs (formalization is hard because these are always slightly incorrect or incomplete), then it will be able to do that on LLM-generated proofs. Even if those are by default much worse than human mathematician proofs, quantity will have a quality of its own. Do tree search/mega-CoT/annealing/whatever to make creative output, convert to formally-checked outputs, and feed successful outputs back into RL. Completely automated formal proofs will allow for much faster software (proving compiler optimizations, finding new classes of compiler optimizations, removing redundant checks, finding faster algorithms), more optimally laid-out computer chips with higher fabrication yields, etc. It's also plausible that infinite math could lead to breakthroughs in physics, most relevantly condensed matter physics and quantum computing, that would accelerate things even more.
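To make that loop concrete, here is a sketch of the generate/verify/feed-back cycle described above. `propose_proofs` and `formally_check` are hypothetical stand-ins for an LLM sampler and a formal proof checker (nothing here is a real API); the point is that a hard verifier turns sheer quantity of attempts into clean training signal.

```python
# Sketch of the generate -> formally verify -> feed back loop described above.
# propose_proofs and formally_check are stand-ins for an LLM sampler and a
# Lean-style proof checker; neither is a real API.

import random

def propose_proofs(statement, n):
    """Stand-in for sampling n candidate proofs (tree search, long CoT, etc.)."""
    return [f"candidate proof {i} of: {statement}" for i in range(n)]

def formally_check(proof):
    """Stand-in for a formal verifier; a real checker gives a hard yes/no,
    which is what makes the loop self-supervising."""
    return random.random() < 0.05  # pretend ~5% of candidates check out

def harvest_verified_proofs(statements, samples_per_statement=200):
    """Keep only machine-verified proofs; quantity of attempts substitutes for
    per-attempt quality because the checker filters out the garbage."""
    verified = []
    for statement in statements:
        for proof in propose_proofs(statement, samples_per_statement):
            if formally_check(proof):
                verified.append(proof)
    return verified  # feed these back into RL / fine-tuning

print(len(harvest_verified_proofs(["example statement"])), "verified proofs harvested")
```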
Charging for job applications would eliminate 99%+ of real applicants. Even a $1 charge would be prohibitive for the majority of people given the number of applications they have to send to find a job.
Hard disagree. If someone is applying to hundreds of jobs and mostly not getting interviews, they are applying to jobs they are not qualified for and should be more targeted anyhow.
Keep in mind the proposal is not that you only get refunded if you get the job; it's that you get your money back (with a bonus) if you "make it past the early filters" (which I would interpret as making it past the initial phone screen).
How do you get from "If someone is applying to hundreds of jobs, and mostly not getting interviews" to "they are applying to jobs they are not qualified for"?
When I was in my teens and early 20s, I applied for hundreds of jobs that had virtually no qualification requirements. Fast food, call center, etc. 100 applications might yield one response, and there were still steps between there and an interview.
Right now I could send out 100 applications to software development jobs, which I am extremely qualified for, and I doubt I'd get one response, let alone one interview.
In the last 15 years, I have never gotten a job that I sent in an application for. Every job I've had was either a recruiter contacting me, or an inside referral from a friend or peer.
Survey says that somewhere between 1 in 6 and 1 in 60 job applications lands an interview. https://www.zippia.com/advice/job-interview-statistics/
Not explicitly stated in the data, but I infer that there must be a very long tail, where the median candidate applies to relatively few companies (less than 100, say), while a minority of candidates are spamming companies widely.
> When I was in my teens and early 20s, I applied for hundreds of jobs that had virtually no qualification requirements. Fast food, call center, etc. 100 applications might yield one response, and there were still steps between there and an interview.
Do you think these companies were actually hiring? How do you account for this lack of response?
I will say that for Zvi's proposal to work, companies would have to guarantee that some percentage of candidates will actually get the bonus payment. This might have the happy side effect of reducing the all-too-common practice of merely pretending to hire.
> Right now I could send out 100 applications to software development jobs, which I am extremely qualified for, and I doubt I'd get one response, let alone one interview.
I would again argue that if that happens, then you were in fact not extremely qualified. In my last job search (less than 5 years ago) I applied to well under 50 companies and had several offers.
> In the last 15 years, I have never gotten a job that I sent in an application for. Every job I've had was either a recruiter contacting me, or an inside referral from a friend or peer.
Perhaps companies would be more likely to carefully review the unsolicited applications coming in, if they were not mostly spam from unqualified applicants? However, I predict the opposite — companies will respond to the deluge of AI applications by moving even more to recruiters, referrals, and other illegible filters.
application->interview stats don't justify your position. The much bigger hurdle is the connection from "no interview" to "unqualified applicant".
> How do you account for this lack of response?
They got 10k applications for 20 openings, called people in for interviews until they hired 20 people, and never called the rest of the people on the list.
> I would again argue that if that happens, then you were in fact not extremely qualified.
Again, why? You keep making this claim but haven't said anything to support it. I have 20 years of experience in software development, system administration, DevOps, etc. I've put in 3 applications per week for the last 6+ weeks, as required by my unemployment benefits since a recent job loss, and gotten zero non-automated responses. In the same time period, Meta and a Bay Area startup have both reached out to me, and I've made it partway through their interview processes.
> You keep making this claim, but haven't said anything to support it.
If the average applicant gets a response to 1 in 6 applications, and some other applicant has sent hundreds of applications and gotten nowhere, what can we conclude? (See the rough numbers sketched after the list below.) It seems to me that at least one of the following must be true:
* The second applicant is not qualified
* The second applicant is undesirable to employers for reasons unrelated to qualifications (for example, because of racism)
* The second applicant is applying to a different sort of employer, perhaps employers that have given up on processing unsolicited applications altogether.
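For a rough sense of how unlikely "hundreds of applications, zero responses" would be if the 1-in-6 figure applied, here is the back-of-the-envelope arithmetic (treating applications as independent, which they are not in reality):

```python
# If each application independently had a 1-in-6 chance of a response, the chance
# of zero responses across N applications is (5/6)^N.
# (Independence is a simplifying assumption; real applications are correlated.)

p_response = 1 / 6
for n in (30, 100, 300):
    p_zero = (1 - p_response) ** n
    print(f"{n:>3} applications, zero responses: {p_zero:.1e}")
# 30 -> ~4.2e-03, 100 -> ~1.2e-08, 300 -> ~1.8e-24
```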
A recent audit study sent 80,000 resumes to 10,000 employers, and 25% of the applications got a response within 30 days. https://www.nber.org/papers/w29053 Why do you think your experience is so different?
Your list of possibilities is extremely deficient.
There could be a wide gap based on some demographic. Maybe it takes men many more applications than women, or vice versa. Maybe it takes 20 year olds many more applications than 40 year olds, or vice versa.
There could be a gap based on location. Maybe Chicago has too many applicants and Denver has too many jobs.
There could be a gap based on the industry (which your third possibility maybe slightly includes). Maybe all licensed professions (nurse, plumber, etc) get an interview almost every time they put in an application, but some other professions/industries have a glut of applicants.
I think part of what you're missing here is that you made an absolute 100% claim (which we might generously interpret as meaning more like 95%), but then you're providing evidence that *might* support a 60-80% claim. If you'd said that *most* people in this situation are applying to jobs they aren't qualified for, that would be a much more defensible claim.
Zvi your hiring idea is good, not obvious, and not obviously good (the last from reading a few of the messages about it above).