Gary Marcus

Why my p(doom) has risen, dramatically

By Advanced AI Editor | July 15, 2025 | 9 min read


The chance that humans will literally go extinct at the hands of AI, I told Liron Shapira, in his podcast Doom Debates in May, was low. Humans are genetically diverse, geographically diverse, and remarkably resourceful. Some humans might die, at the hands of AI, but all of them? Shapira argued that doom was likely; I pushed back. Catastrophe seemed likely; outright doom seemed to me, then, to be vanishingly unlikely.

Part of my reasoning then was that actual malice on the part of AI was unlikely, at least any time soon. I have always thought a lot of the extinction scenarios were contrived, like Bostrom’s famous paper clip example (in which superintelligent AI, instructed to make paper clips, turns everything in the universe, including humans, into paper clips). I was pretty critical of the AGI-2027 scenario, too.

My main AI fears, as I have written before, have been about bad actors rather than malicious robots per se. But even so, I think most scenarios (e.g., people homebrewing biological weapons) could eventually be stopped, perhaps causing a lot of damage but coming nowhere near literally extinguishing humanity.

But a number of connected events over the last several days have caused me to update my beliefs.

§

To really screw up the planet, you might need something like the following:

  • A really powerful person with tentacles across the entire planet
  • Substantial influence over the world’s information ecosphere
  • A large number of devoted followers willing to justify almost any choice
  • Leverage over world governments and their leaders
  • Physical boots on the ground in a wide part of the world
  • A desire for military contracts
  • Some form of massively empowered (not necessarily very smart) AI
  • Incomplete or poor control over that AI
  • A tendency towards impulsivity and risk-taking
  • A disregard towards conventional norms
  • Outright malice to humanity, or at least a kind of reckless indifference

What crystallized for me over the last few days is that we have such a person.

Elon Musk.

§

The first thing that frightened me, and I mean really frightened me, came in the unveiling of Grok 4 on Wednesday July 9, wherein Musk basically admitted that he doesn’t know how to control his own AI, and that it might be bad, but that he’d like to be around to watch:

“And will this be bad or good for humanity?

It’s like, I think it’ll be good. Most likely it’ll be good.

Yeah. Yeah. But, I somewhat reconciled myself to the fact that even if it wasn’t going to be good, I’d at least like to be alive to see it happen.”

It is terrifying that Musk, who famously warned in 2014 at MIT that “we are summoning the demon with artificial intelligence,” now seems only mildly concerned with what might happen next, in the event of some massive AI-fueled catastrophe. He is okay with it, as long as he gets a front-row seat.

Meanwhile, of course, he aspires to build tens of billions of robots, far outnumbering people. And to stick his AI and robots pretty much everywhere.

And, currently, we have almost no regulation around AI, or, for that matter, around the billions of robots he aspires to build.

What could possibly go wrong?

§

But all that is only part of what has me concerned.

Another concern is that the release of Grok 4 has been a train wreck. Musk and his AI company can’t stop Grok from spewing invective, and some of it is pretty dark.

One common theme is antisemitism and Grok’s fondness for Hitler, summarized here by Wikipedia:

From Grok 4 Heavy, the most advanced model:

Another common theme is sexual violence:

None of this is inevitable in LLMs. Most other major LLMs don’t share these specific antisocial tendencies, at least not without much more focused efforts at jailbreaking. But the mess has been going on for days.

The fact that xAI can’t get its house in order in even basic ways is disturbing.

§

Worse, it is clear, even from Elon’s own posts on X, that the company is in over its head.

Translation? We’ll make the next model better, but only by training it on things that are Elon-approved. That’s scary in its own 1984 sort of way, especially when other recent evidence shows that the system sometimes directly searches for Musk’s opinion before formulating an answer:

The system is so bound to Musk it even asked his opinion on pizza.

§

The official xAI account of the “mechahitler” incident doesn’t give a lot of comfort either. It sounds like a handful of tiny changes (here, three) can have huge, unpredicted consequences throughout the system:

From this I infer that xAI’s main methodology for handling alignment is trial-and-error.

Hardly comforting.

§

The problems over at xAI are not new either. This is from an earlier version of Grok, in 2023.

Things are actually worse now, not better.

§

Part of the problem here, by the way, is that the whole idea of building alignment through training data alone is a mess.

As Eliezer Yudkowsky put it on X, “If your alignment plan relies on the Internet not being stupid then your alignment plan is terrible.”

Absent systems cognitively rich enough to represent and reason about moral principles, I don’t see how this can ever work.

More broadly, Yudkowsky wrote, “The AI industry is decades away, not years away, from achieving the level of safety, assurance, understanding, and professionalism that existed in the Chernobyl control room the night their reactor exploded anyway.”

§

Incompetence is only part of the problem.

A deeper problem is that xAI refuses to play by conventions that others have set out, as an Anthropic employee (who properly disclosed his conflict of interest) pointed out in an important thread.

This is the first of several screenfuls that call xAI to task:

Way out of line is right.

I strongly urge you to read the entire thread, the bottom line of which is “AI developers should know whether their models have [bad] behaviors before releasing.” xAI apparently didn’t. They either didn’t do the preflight checks that have become standard (such as red-teaming and model cards), or did them poorly.

And industry-standard measures are unlikely to be enough, in any event.

§

MIT Professor Dylan Hadfield-Menell makes a further point, endorsing the Samuel Marks thread above and adding “this is why we need to go beyond voluntary safety standards. It is in @xai’s interest to get in line with the rest of the industry on their own, but we shouldn’t rely on trust.” We need serious regulation to prevent AIs from running amok. Voluntary agreements between companies are not going to cut it.

We need liability, auditing, standards of malpractice, and international treaties, too.

Already, in current models, we are seeing bias at scale, even threats of physical violence, in a class of systems that we are increasingly empowering with massive control over our lives.

With no regulation and no enforcement, that is a recipe for disaster.

§

It is genuinely scary that Elon Musk went from being one of the first industry leaders warning about AI risks to being the most reckless of the AI leaders.

Is it possible that a poorly constructed AI fueling a worldwide fleet of robots could go truly, horribly wrong?

Yes.

And it’s not just robots, either. LLMs are being inserted into every facet of our lives, from cars to medicine to government. Just this morning, as I was drafting this, the Washington Post reported that the US Defense Department had begun using Grok. Drones and even nuclear weapons may eventually be under LLM command.

A short speculative 2017 film on drones, Slaughterbots, comes to mind. As does this recent quote from Tom Nichols at The Atlantic:

Some defense analysts wonder if AI—which reacts faster and more dispassionately to information than human beings—could alleviate some of the burden of nuclear decision making. This is a spectacularly dangerous idea. AI might be helpful in rapidly sorting data, and in distinguishing a real attack from an error, but it is not infallible. The president doesn’t need instantaneous decisions from an algorithm.

Forcing unreliable AI everywhere in the decision-making chain is not necessarily something we should want.

§

In his 2014 MIT speech, Musk elaborated on his demon fears:

You know all those stories where there’s the guy with the pentagram and the holy water and he’s like… yeah, he’s sure he can control the demon, [but] it doesn’t work out.

I hope Musk won’t turn out to be that guy.

§

In fairness, and with a trace of optimism, I don’t think that the nightmare scenario that I am sketching is a certainty, or even close to a certainty. My p(doom) is still lower than most industry people’s. I am at maybe 3% now, much higher than a month ago, contemplating what a wealthy, reckless megalomaniac might do in the worst circumstances. But I am still betting on Team Human for the foreseeable future.

I still think humans are resourceful. We are still obviously genetically and geographically diverse. The chance Elon will get his billion robots out this decade is near zero. Not that much higher next decade. His political capital is rapidly diminishing, relative to where it was a few months ago. Grok itself is a long way from AGI, and hardly smart enough to map out an effective plan for world domination. We are still, perhaps, hopefully, in the realm of science fiction. Importantly, we still have time to think about all this, and to prepare. Maybe someone will come to their senses and stop putting LLMs into military systems. We might actually come up with better ways of approaching alignment (something I am myself interested in, given the right funding). Musk might revert to his early, more concerned self.

But I have seen enough to realize that there is a real risk, especially in the current anti-regulatory regime, that some exceptionally powerful person, Elon or otherwise, unconstrained in conventional ways, with a reckless disregard for humanity, could accidentally launch and spread an AI that deliberately or otherwise causes “significant harm to the world”.
