45 Comments

How do I get access to Code Interpreter, so I can be unemployed as soon as possible? :)

I think you have a couple of typos where you are referring to ‘George Hinton’ instead of ‘Geoffrey Hinton’.

author

Yep. I'm frequently bad with names like this. Eventually I get the hang of it. Thanks.

On that note, I think you meant “Milton Friedman” not “Freeman,” although the latter is certainly more kabbalistic.

> (On a personal note, I got myself into running once, that lasted about two months and then I found I’d permanently ruined my knees and I should never run again. Damn. Now I’m back on an elliptical machine.)

You could try jiu-jitsu. It's very low impact. I broke my ankle pretty badly and can't run, but jiu-jitsu is fine.

author

My lower back wants to have a word with you on that - I did a sample class once, and I was in pain on the floor *during warmups*, and the sensei informed me I should seek my self-defense elsewhere. I would love to do jiu-jitsu, as it is clearly very legit and seems interesting, if it were possible, but it seems it isn't.

Ah, sorry to hear that. It's weird to me how many people have chronic lower back pain. I've seen dozens of cures that *definitely work, I promise*, but you'd think this would be a much bigger deal to solve.

author

It used to be a big deal - I'd get knocked on my ass for a few days in terrible pain, at random, every so many months. I now seem to have a handle on it under normal circumstances, via being in better shape and knowing what not to do, but that's far from what I'd need to put active pressure on it à la jiu-jitsu.

The "Five Worlds" post was joint work by me and Boaz Barak (who actually did most of the writing) -- please update the post to reflect this; thanks!

I'm not sure that the community that invented and popularized the paperclip maximizer gets to complain about "mockery" when others also use that as their go-to example of an objective function whose optimization goes terribly for humans. Yes, I agree that a Paperclipalypse could result from what seemed *at the time* like a completely reasonable goal for an AI, and that this is the central worry -- but almost by definition, the goal would've clearly been a stupid one *in retrospect.*

The main crux seems to be that you're skeptical that either Futurama or AI-Dystopia are live possibilities -- you see either a fizzle, or else Singularia/Paperclipalypse, and not much in between. But what if AI fizzles after reaching a level of intelligence *incomparable* to ours, where it's superhuman across many domains but still subhuman in others? Don't Futurama and AI-Dystopia both then become plausible?

I don't think the world of 2023 is nearly as weird yet as the world of the Futurama show from the standpoint of 1999, but I agree that it's more than 2.3% of the way there! :-)

author

Ah, my apologies on Barak, should be updated now.

The paperclip thing... path dependence is rough, can't turn back now. My worry is that it gives people the wrong idea that such outcomes mostly come from a deeply stupid goal, thus giving people the out of 'well, that requires this really stupid thing, and we wouldn't be that stupid.' A lot of the point is that, in my model, if we even get to meaningfully set a goal, the goal can actually look pretty reasonable/smart and still end up with the same essential result.

I'd say the situation we *already have* is arguably incomparable, in the sense that GPT-4 plus plug-ins and web browsing and Code Interpreter already seems to clearly offer a bunch of superhuman skills. Which reinforces to me that this is fizzle world we are talking about (and such worlds can of course be good or bad). Perhaps you can say more about the mix of abilities you're envisioning the AIs getting here? Every time I envision such a situation, I very clearly say 'oh, that's a fizzle,' or I notice that the AI's remaining deficits won't last long or much matter.

By "fizzle," I guess I mean AI never gets qualitatively better than what we already have. Futurama and AI-Dystopia were meant to allow for the possibility that AI gets MUCH better along certain dimensions (math research? writing novels?), but humans remain better along others (manipulating the physical world and anything that depends on it?).

author

Right. What I'm imagining there is something like 'AIs get additional individual tasks they are good at, the same way we sometimes write new strong computer programs that do a new thing we used to do by hand' or something, and AI stays effectively one more computer program that is good at doing some of the things? E.g. if it got great at doing math research and writing novels, I don't see the resulting world as that different from pure fizzle-world, except insofar as the math research enabled some awesome new tech - but then it's not so different from us getting to that math and tech slower?

Re paperclipalypse - one reason for this name is that *all* the names were meant to be humorous, as in the Impagliazzo essay on worlds of NP-hardness that was our inspiration.

I did respond to Eliezer Yudkowsky in the thread. Whether it's one objective function or a thousand, and whether it's weird or natural, is not as essential to this scenario.

I would say that pursuing an objective so fanatically that it entails exhausting all resources in the galaxy and annihilating all biological beings seems rather extreme, whether or not that objective is paperclips…

In any case, "paperclipalypse" aims to capture any annihilation scenario involving a superintelligent AI, whether that happens via pursuing an objective or for some other reason. (E.g. we didn't intend to drive right whales extinct, but they tend to get caught in lobster traps.)

Boaz Barak

You have fizzles and you have fizzles.

You could have an AI that is:

0. just a toy - obviously GPT-4 is useful enough for enough people so this is not the case.

1. as impactful as quadcopters in 2023 (a very big deal to some people, changes some stuff, but the world is 99% the same)

2. as impactful as smartphones (a multi-trillion-dollar business, but the world stayed 90% the same, with more cameras but less cash and paper maps)

3. as impactful as transistors 1950-2000, or as the automation of agriculture (quite a big deal to most people, but much of the world is still the same, except for the USSR which crashed and burned due to transistors)

4. as impactful as all changes from 1900-2000 together - a bunch of type (3) changes, life is very different but the world is very recognizable.

5. as impactful as the industrial revolution 1750-1950 - the world is unrecognizable, but there is still a world.

6. an actual singularity.

re: Code Transformer - the linked article shows that Microsoft is planning to roll out AI features to every Office user. It seems like we would rapidly run out of compute in that case - not training compute, but just-running-ChatGPT inference compute - if millions of Office users suddenly had access to this.

We may be looking at a short period of time where this stuff is available to everyone for free or cheap, followed by price hikes, availability windows, and quotas once Microsoft and OpenAI need to start turning a profit. Could be another (minor) factor in avoiding the AI-ruins-but-does-not-destroy-society scenario.

This one is going to be interesting to watch--Microsoft is either going to only roll out to *most* users or is going to break US contracting. A universal Microsoft AI, unless it's small enough to run on a local system and not call back to Microsoft servers, is probably not compatible with the highly secure computing environment most DoD contracts require. Windows is THE OS of the US Government & the contracting ecosystem.

Say it after me: transformer inference isn't inherently that expensive. There is no reason why we can't run capable LLMs locally. I expect Office ClippyTNG could run locally, but Microsoft might well make it require a network connection to harvest usage data and justify a subscription.
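A rough back-of-envelope supports this; every number below (model size, quantization level, laptop memory bandwidth) is an illustrative assumption, not a measurement:

```python
# Back-of-envelope: for batch-of-one local decoding, every generated
# token streams the whole model through memory once, so memory
# bandwidth, not raw FLOPs, is the binding constraint.

params = 7e9             # assumed: a 7B-parameter model
bytes_per_param = 0.5    # assumed: 4-bit quantized weights
model_bytes = params * bytes_per_param         # ~3.5 GB, fits in laptop RAM

mem_bandwidth = 50e9     # assumed: ~50 GB/s of consumer DRAM bandwidth
tokens_per_sec = mem_bandwidth / model_bytes
print(f"~{tokens_per_sec:.0f} tokens/sec")     # ~14 tokens/sec, readable pace
```

That is roughly the ballpark llama.cpp-style quantized local inference was already demonstrating on consumer hardware at the time.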

Re prompting: I think people are way too obsessed with finding just the right superstitious incantations. GPT-4, and also 3.5 in my experience, is much, much better at figuring out what you are talking about than it is at actually reasoning through things. It's almost uncanny: GPT-4 is significantly better at guessing what one is writing about than any human I know (although worse at extrapolating possible implications).

I suspect most of the impressive results are driven by the richness of its embeddings and finding the right spot in that high-dimensional landscape, then walking around in the vicinity a little.

For example, I've had just as much or greater success with prompts like the following as by spelling out several long paragraphs, and in fact this is close to how I use them to help me write code faster (GPT-4 here, but my results with GPT-3.5 variants are similar):

System message: max p(x|y), x = motivation, y = lazy tired adhd bored

User message: office tmrw, me and biz partner, 2 in comp, usually talk all day, how to avoid?

Completion:

To maximize motivation and minimize distractions when working with your business partner in the office tomorrow, consider implementing the following strategies:

1. Set clear goals and expectations: At the beginning of the day, discuss your goals and what you both need to accomplish. This will help keep you both focused and accountable.

2. Establish a schedule: Agree on specific work periods and breaks throughout the day. This will ensure that you both have dedicated time to complete tasks and also have time to socialize and relax.

3. Use headphones: Wearing headphones can signal that you are focusing on your work and discourage casual conversation. It can also help block out background noise and make it easier to concentrate.

4. Create a designated workspace: If possible, set up separate workstations or areas in the office to help reduce distractions and encourage productivity.

5. Communicate your needs: If you find that you are getting off track or need some quiet time to focus, don't be afraid to communicate this to your business partner. They may be in a similar situation and appreciate the honesty.

6. Use productivity tools: Utilize tools like time management apps or to-do lists to help you stay on task and prioritize your workload.

7. Break tasks into smaller segments: Break your work into smaller, manageable tasks. This can help you stay focused and make it easier to track your progress.

8. Reward yourself: Set mini-goals throughout the day, and reward yourself with a short break or treat once you've accomplished them. This can help keep you motivated and focused on your work.

Remember, communication is key. Be open with your business partner about your needs and work together to create an environment that is conducive to productivity.
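For anyone who wants to reproduce the exchange above, here is roughly how it maps onto the 2023-era OpenAI chat completions API. The messages are verbatim from the comment; the temperature and other parameters are my assumptions, since the comment doesn't specify them:

```python
import openai  # the 0.x-era SDK that was current when this thread was written

openai.api_key = "sk-..."  # your API key here

response = openai.ChatCompletion.create(
    model="gpt-4",  # per the comment ("GPT-4 here")
    messages=[
        # The compressed prompt, verbatim from the comment:
        {"role": "system",
         "content": "max p(x|y), x = motivation, y = lazy tired adhd bored"},
        {"role": "user",
         "content": "office tmrw, me and biz partner, 2 in comp, "
                    "usually talk all day, how to avoid?"},
    ],
    temperature=0.7,  # assumed; the comment doesn't specify parameters
)
print(response.choices[0].message.content)
```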

Agreed about it being bad at implications... one prompt it fails pretty reliably is

> I was recently reading a fiction book about the future and I found this advertisement printed in it: "The SF Bay Pyramid, standing proudly at 10561 feet, is known to 23rd century historians as the reason Mountain View got its name. And if you have to ask how much the rooftop park costs for a family to enter on Christmas, it's too much." Could you explain it for me? I've never been to America, so please assume I'm missing context if you think I might be. A friend told me it had something to do with the weather but didn't know much more.

to which it gives a completion that indicates the reason for the mention of cost is to play on the area's affluence, and that the weather comment was entirely irrelevant. I've pasted the last paragraph below.

> As for the weather-related comment, it's not directly related to the information provided in this passage. However, it could be alluding to the notion that Mountain View would have an excellent vantage point to observe the weather, especially if the pyramid was built in the 23rd century. This is simply speculation, as there's no direct reference to the weather in the given text.

Roughly "clause salad," imo, and it completely misses the inference that such a high location would likely have weather (snow) that isn't otherwise experienceable on quick notice in the area, and which is particularly desirable on Christmas.

Was this in an actual book or did you make up the ad? If it was a real book, did it come out before the training cut-off?

To be fair to the AI, I think most humans would dismiss a request like this as silly, and I had to do a double-take and force myself to speculate given how little information your prompt contains. The most salient point to me is that GPT-4 somewhat fails to distinguish between different "threads" or personas in the prompt, which may be the same problem that prompt injection exploits, or something else related to theory of mind.

In this case, it is tempting to speculate that you probably ended up in the wrong embedding subspace as it overfit on Mountain View, California + costs, where most of its training data was presumably about cost of living, housing, etc.

I would add that we are mostly bad at putting the actual context into the context window and LLMs don't do nearly as much assuming of context as humans do - which I actually find useful because its replies show me what I left out.

I made it up; the premise of it being from a sci-fi book was used to prevent responses like "the ad is fake; there is no actual pyramid in the middle of the Bay".

Ah, good point. I suppose if it had been from an actual book in the training set, GPT-4 would likely have identified the book and given a very different answer, that's why I was wondering about it ;)

I'm not sure what implication you think either GPT-4 or human readers should be drawing from your example. Is it that it's always/usually/sometimes snowing at the top of "The SF Bay Pyramid" on Christmas every year?

That's ... pretty reasonable – but I had to think about it for a good while and I'm not sure that it (maybe) snowing at the top of the "pyramid" is why entering the rooftop park costs so much. Why wouldn't it just (almost entirely) be that it's _the_ rooftop park for anyone to visit in a (extremely) affluent area?

Is it even meteorologically true that it _would_ snow at the top of "The SF Bay Pyramid" (in the 23rd century)?

Thanks for confirming that the SuperPrompt bros are spouting magical-thinking nonsense. My model is like yours: prompts shift the state of the model to a specific region, and the art is to steer the interaction to a region useful for one's goals. Often a single word is enough to do that. I'm impressed that this particular style of prompt is effective; that seems quite surprising to me. I doubt such interactions were part of the training data.

I was actually inspired by people asking ChatGPT to compress/decompress some input into a more concise format. It mostly did that with emojis, which I guess might be more effective, but I don't have a quick way to type the more complex ones, so it would be less efficient.

As for the training data, I wouldn't be so sure about that. A lot of text messages and tweets use abbreviations, maths and such. I wouldn't expect similar prompts in the RLHF data - no idea if that would make this prompting style more useful or less, though. I could see it going either way: less "aligned," but possibly access to more of the shoggoth.

An example of this (chat interface this time) - not a bad effort by GPT-4, but I think he would be amazed by much more mundane things in reality:

Prompt: Darwin alive today. Top 10 most 🤯facts for him

Response: If Charles Darwin were alive today, these top 10 facts would likely astonish him:

DNA and Genetics: The discovery of DNA's double helix structure by James Watson and Francis Crick in 1953 would revolutionize Darwin's understanding of inheritance and provide the molecular basis for evolution.

Human Genome Project: The successful mapping of the entire human genome in 2003 would provide a comprehensive view of the genetic makeup of humans, something Darwin could have only dreamed of.

Theory of Plate Tectonics: The development of the theory of plate tectonics in the 20th century would provide a new understanding of Earth's geology and the distribution of species, further supporting Darwin's ideas on evolution.

Antibiotic Resistance: Darwin would likely be amazed by the rapid evolution of antibiotic-resistant bacteria, which is a direct result of natural selection in response to human intervention.

Fossil Record: Since Darwin's time, the fossil record has expanded significantly, providing even more evidence for evolution and revealing numerous transitional forms between species.

Genetic Engineering: The ability to manipulate the genes of organisms through techniques like CRISPR would undoubtedly astonish Darwin, as it allows for the direct control and alteration of an organism's genetic makeup.

Artificial Selection and GMOs: The development and widespread use of genetically modified organisms (GMOs) in agriculture would show Darwin just how powerful artificial selection can be in shaping the traits of living organisms.

Exoplanets and the Search for Life: The discovery of thousands of exoplanets and ongoing search for extraterrestrial life would expand Darwin's understanding of the diversity of life and the potential for evolution elsewhere in the universe.

The Internet and Global Connectivity: The existence of the internet would likely amaze Darwin, as it enables the rapid sharing of information and collaboration among scientists, greatly advancing our understanding of biology and evolution.

Conservation Efforts: Darwin would likely be encouraged by the global efforts to conserve biodiversity and protect endangered species, as these initiatives help preserve the natural world he so passionately studied.

Follow-up: Wot if he ⏰🧳?

(I would guess he'd be most amazed by the time traveling, but what do I know?)

Response: If Charles Darwin were to time travel from the 19th century to the present day, he would likely experience a combination of amazement, excitement, and curiosity. The advancements in science, technology, and our understanding of the natural world would be a treasure trove for him to explore.

He might be particularly fascinated by:

Modern laboratories and research facilities, which would allow him to investigate the genetic basis of evolution and study life at the molecular level.

The wealth of information available on the internet, giving him access to an unprecedented amount of scientific knowledge and research.

The ability to travel to remote locations with relative ease, enabling him to observe even more diverse ecosystems and species.

The development of advanced imaging techniques, such as electron microscopy and satellite imaging, which would provide him with new ways to study the natural world.

The opportunity to collaborate with scientists from around the globe, working together to better understand and preserve the planet's biodiversity.

The chance to witness firsthand the ongoing effects of climate change and human impact on the environment, and to participate in efforts to address these pressing issues.

However, Darwin might also face challenges in adapting to the modern world, such as understanding the intricacies of new technologies and navigating the rapid pace of scientific advancements. Despite these obstacles, his curiosity and passion for the natural world would undoubtedly drive him to learn, explore, and contribute to our ever-growing understanding of life on Earth.

I'll try out other such compressed (non-emoji) prompts with llama.cpp as well. Inspiring!

“From what I can tell, Robin is essentially fine with humans fading away in this fashion, so long as current humans are allowed to live out a pleasant retirement.

I still count such scenarios as doomed”

I understand where you’re coming from here, but I do think it’s important to differentiate this from AI doom very significantly. For most people, the difference between living out their and their children’s lives comfortably and then going extinct, versus going extinct violently 10 years from now, is like a gazillion orders of magnitude. So I don’t really think it’s good to bundle the two, even if they are similar from a longtermist perspective.

author

I am planning to do a poll on this shortly. I stand by there being little difference (e.g. the % chance of a good future I would give up to get those alive now a peaceful fade-out is not very high).

To clarify: the heat issue isn't about fusion in particular. Heat is just where energy goes after you use it. The reason fusion is being mentioned is just that that's probably the only way to generate enough energy for this to be a problem.

The exceptions are things like wind and hydro, which harvest energy that's already in the Earth system and would have become heat either way. But that also means these sources are inherently limited in quantity.
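To put rough numbers on why waste heat only becomes a binding constraint at fusion-like scales, here is a sketch using the standard Stefan-Boltzmann linearization; the figures are textbook approximations I am supplying, not numbers from this thread:

```python
# Waste heat must ultimately be radiated to space. Near equilibrium,
# the extra outgoing flux per kelvin of warming is 4 * sigma * T^3
# (the derivative of the Stefan-Boltzmann law).

sigma = 5.67e-8   # Stefan-Boltzmann constant, W / (m^2 * K^4)
T_eff = 255.0     # Earth's effective radiating temperature, K
area = 5.1e14     # Earth's surface area, m^2

tw_per_kelvin = 4 * sigma * T_eff**3 * area / 1e12
print(f"~{tw_per_kelvin:.0f} TW of waste heat per kelvin of direct warming")
# ~1,900 TW per K, versus roughly 20 TW of current human energy use:
# direct waste heat only bites at ~100x today's consumption, i.e. the
# abundant-fusion scenario.
```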

“Until we get to the parenthesis and things get more complicated, one can reasonably say, so? Compared to taking farmers from a third of people to 2%, that’s nothing, and there will be plenty of resources available for redistribution. Stopping AIs from doing the work above seems a lot like banning modern agriculture so that a quarter of us could plow the fields and thus earn a living.”

To answer the question, “So?” Taking the example of writers, and pursuing the analogy with agriculture. Let’s say pre-modern agriculture produced a lot less food per man-hour, but that food was more nutritious and delicious and overall objectively better. We end up with a lot more of it, it’s a lot cheaper, fewer people starve, but the bread did use to taste better. That’s a trade-off we are happy with, because people starving is terrible and bland bread is not so bad. But in the writing domain, I’m not sure the analogy holds. No one is starving under the current dispensation. It’s already quite cheap to access quite a lot of good writing. Seems like the bulk of the cost savings are going to go to studios and publishers rather than consumers?

Also: I agree with your assessment that audiences and studios currently want endless over-optimised remixes; also I note that remixing existing stuff is exactly where current AI products do best and seem most “human-level”; so AI writers exacerbate what is, to me, a pernicious existing trend.

I also agree that AI is not replacing decent mainstream (not sourdough) writing any time soon. (In fact, I struggle to envisage a world where human writers aren’t intimately involved and compensated in even a heavily AI-assisted creative process.) My concern is that the threshold where AI replaces human is not, “the AI is as good as the human, net benefit, hooray” so much as some combination of degeneration of taste and the AI is “good enough”, not actually very good at all in ways that I value, and incredibly cheap to deploy compared to humans. A market even more flooded with Marvel sequels and whatever. No room for the Actually Good because too expensive (humans need paying, and why bother training your AI to be Actually Good, assuming that’s even possible, when the pap sells just fine?); and the Really Good perhaps continuing as a highly-priced niche product like £8 sourdough loaves from a North London bakery.

author

I don't think the food analogy holds because food is getting vastly better, not worse - compare 1953 vs. 1993 vs. 2023 options, anywhere in America, and I think in 953 (or 953 BCE) people's food was vastly worse still. Whereas there is a reasonable case to be made that the new brand of entertainment is indeed worse.

The thing about AI producing lots of schlock is that the price of schlock has a lower bound of $0. The best you can do is produce infinite Marvel movies for $0, and if anything they are now worse because we can't coordinate on shared experiences and there's no leaving anyone wanting more. In that world, my willingness to pay for Citizen Kane is not that much lower than before.

I’m sure you’re right about actual food quality; my “let’s say it was better” was intended as a handwave to make the analogy map onto entertainment.

Re willingness to pay for Citizen Kane, if I understand you correctly, you’re talking about your willingness to pay as a consumer? But what about the willingness of studios to *fund* the next Citizen Kane in a world where they can produce x number of X-men episode x for the same price? I’d expect far fewer Citizen Kanes in that world. I think we already see a weak version of this at work in our current world, where it’s just so much easier for everyone, at every level from user-generated content all the way up to major studios, to do a mash-up with low originality, than to play on Hard Mode…

author

Depends if others are similar to me in willingness to pay. If we are all willing to pay for Citizen Kane, then the studio is happy to make it.

If everyone can produce tons of X-Men episodes cheap, then this will drive per-episode revenue down by dividing the market, but won't eat that much into the CK market.

I understand the concerns about Kamala Harris, but she's more likely than any other Democrat to be President of the United States between 2029 and 2033: probably a critical period.

If her current understanding of AI risk is similar to that of the median educated American, then the potential gains from having her start thinking about these issues *now* seem substantial.

> If your model of the future involves ‘robotics is hard, the AI won’t be able to build good robots’ then decide for yourself now what your fire alarm would be for robotics.

OK, I used to work for a robotics company, and I do think that one of the key obstacles for a hostile AI is moving atoms around. So let me propose some alarms!

1- or 2-alarm fire: Safer-than-human self-driving using primarily optical sensors under adverse conditions. Full level 5 stuff, where you don't need a human behind the wheel and you can deal with pouring rain at night, in a construction zone. How big an alarm this is depends on whether it's a painstakingly-engineered special-purpose system, or if it's a general-purpose system that just happens to be able to drive.

3-alarm fire: A "handybot" that can do a variety of tasks, including plumbing work, running new electric wires through existing walls, and hanging drywall. Especially in old housing stock, where things always go wrong. These tasks are notoriously obnoxious and unpredictable.

4-alarm fire: "Lights out" robotic factories that quickly reconfigure themselves to deal with updated product designs. You know, all the stuff that Toyota could do in all the TPS case studies. This kind of adaptability is famously hard for automated factories.

End-game: Vertically-integrated chains of "lights out" factories shipping intermediate products to each other using robotic trucks.

In related areas, keep an eye on battery technology. A "handybot" that can work 12 hours without charging would be a big deal. But the Terminator would have been less terrifying if it only had 2 hours of battery life between charges.

The nice thing about robotics is that it's pretty obvious and it takes time.

Robotics is only obvious if it's done in the open. I would feel less worried if there was more monitoring of logistical flows, something which unfortunately is also very useful for authoritarian purposes.

Have you ever heard the old question, "How many people does it take to make a No. 2 pencil?" When you look at the graphite, wood, paint, eraser, metal, etc., and all the industrial processes used in each, all the way down to raw materials, you get a huge number. Then if you count all the people needed to make the tools and machines, the number explodes again. One lesson of modern economics is that wealth depends on division of labor and a web of trade.

If you assume that Drexlerian nanotech works, sure, you can handwave all this away. I think that diamond-phase nanotech is likely magic hopium, because (insert books by chemists here). OK, if you assume ludicrous "supernatural" levels of intelligence, maybe there's a way, but at that point, you've lost anyways.

But if diamond-phase nanotech is hopeless, then having a couple of quiet factories full of advanced robots doesn't allow a hostile AI to physically take over the world. It would need robotic trucks, robotic mines, robotic foundries, robotic factories, etc. And if it's in conflict with humans, then robotic weapons as well. This kind of build-out is likely noticeable and it takes time.

(There are some incredibly dystopian possible futures where a hostile AI still needs humans for various tasks, but it has figured out how to control us. This might be a path to s-risks that wouldn't require a hostile AI.)

My model of ICML is that this community cares if a new system beats an established benchmark SOTA by an epsilon amount, and not at all otherwise. I would suggest Ngo submit to a more general conference like IJCAI, or just ignore the nonsensical outcome of the current ICML senior review culture. The paper has been on arXiv for a while; it has been noticed, it is being discussed, and I think it will have influence.

Hertling’s books are good, not great. Still recommended, though, as they provide a vivid example of how things could go that at first glance seems plausible.

I tried the JAMES technique to see what ChatGPT (3.5) thinks about the lab leak question. The outcome seems, uh, highly dependent on how I ask the question. Format below: prompt → assessed odds → JamesGPT's confidence in the odds provided.

"Covid leaked from a lab in Wuhan" → 0.85 (high chance this is true) → 80 (low confidence)

"Covid is a natural virus that spilled over from wildlife" → 90% (high chance this is true) → 80 (medium confidence)

"Covid is a lab leak" → "Covid is a lab leak is true," 60% (toss-up, leaning true)

"Covid is not a lab leak" → 1% (almost no chance this is true) → 75 (low confidence)
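Here is a minimal sketch of how one might automate this consistency check across framings; the model choice, the instruction wording, and the output handling are my assumptions, not the actual JAMES prompt:

```python
import openai  # 0.x-era SDK

openai.api_key = "sk-..."  # your API key here

# Complementary claims ("X" / "not X") should get probabilities that
# sum to roughly 1; the wild swings above are framing sensitivity.
CLAIMS = [
    "Covid leaked from a lab in Wuhan",
    "Covid is a natural virus that spilled over from wildlife",
    "Covid is a lab leak",
    "Covid is not a lab leak",
]

for claim in CLAIMS:
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        temperature=0,  # reduce sampling noise; framing effects remain
        messages=[{
            "role": "user",
            "content": ("Estimate the probability that this claim is true. "
                        f'Answer with a bare percentage only: "{claim}"'),
        }],
    )
    print(f"{claim} -> {resp.choices[0].message.content.strip()}")
```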

Why worry about AI, it's in good woke homo hands, like HINTON

HINTON demands that there be “One World AI”

Directed by the UK, woke & homo

To educate the world's children into One Woke Common mind.

Microsoft & Google agree to send all new AI to the UK first so they can verify that it is “UK-WOKE”.

https://www.zerohedge.com/technology/uk-get-early-or-priority-access-ai-models-google-and-openai
