## 12. Why Kolmogorov complexity is useless

377 words

Mathematical hobbyists tend to be fascinated with information theory, Kolmogorov complexity and Solomonoff induction. This sentiment is very understandable. When I first learned of them, these subjects felt like they touched upon some fundamental truth of life that you don’t normally hear about. But for all its supposed status as a fundamental property of life and understanding, mathematicians treat Kolmogorov complexity as a cute mathematical curiosity at most. In this post I will explain some of the reasons why so few mathematicians and computer scientists have cared about it over the past 50 years.

The zero-th reason is that it depends on your choice of encoding. You cannot cover this up by saying that any Turing machine can simulate any other with constant overhead, because a 2000 bit difference is not something you can compensate for on data-constrained topics.

The first reason is obvious and dates back to ancient times. Kolmogorov complexity is not computable. Properties that we can literally never know are pretty useless in everyday life.

The second reason is related: we cannot compute Kolmogorov complexity in practice. Even time-constrained variants are hellishly expensive to compute for large data sets.

The third reason is more typical of modern thinking in computer science theory: any theory of information needs a theory of computation to be useful in practice. This is directly related to the difference between computational and statistical indistinguishability, as well as the myth that your computer’s entropy pool could run out. Cryptography is safe not because it is information-theoretically impossible to retrieve the plaintext but because it is computationally infeasible to retrieve the plaintext. The Kolmogorov complexity of a typical encrypted data stream is low, but it would be a mistake to think that anyone could compute a short description. Along another route, once I have told you an NP-complete problem (with a unique solution), it won’t add any new information if I tell you the answer. But still you would learn new information by getting the answer from me, because you couldn’t compute it yourself even knowing all requisite information.
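The gap between a short description existing and anyone being able to find it can be sketched in a few lines. This is only an illustration under stated assumptions: the `keystream` helper (SHA-256 over a key and a counter) is a hypothetical stand-in for an encrypted stream, not a real cipher, and zlib stands in for any practical compressor.

```python
import hashlib
import zlib


def keystream(key: bytes, n_bytes: int) -> bytes:
    """Pseudorandom bytes derived from a short key (illustrative, not a real cipher)."""
    blocks = []
    counter = 0
    while 32 * len(blocks) < n_bytes:
        # Each block is SHA-256(key || counter); the whole stream is fixed by the key.
        blocks.append(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return b"".join(blocks)[:n_bytes]


N = 1 << 16  # 64 KiB of data

# Generated by a tiny program plus a 9-byte key, so its description is short.
stream = keystream(b"short key", N)
# Obviously structured data for comparison.
zeros = bytes(N)

compressed_stream = zlib.compress(stream, 9)
compressed_zeros = zlib.compress(zeros, 9)

print("keystream:", len(stream), "->", len(compressed_stream))
print("zeros:    ", len(zeros), "->", len(compressed_zeros))
```

The zeros shrink to almost nothing, while the keystream does not shrink at all, even though the few lines of `keystream` plus the key are themselves a short description of it. A general-purpose compressor has no feasible way to discover that description.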

Kolmogorov complexity is useless based on classical CS theory, practice and modern CS theory. This is how you know that anyone who proposes that it is an integral part of rational thought is full of shit.

## 11. Did MIRI cause a good thing to happen?

1134 words

The Future Perfect podcast from Vox did an episode proclaiming that AI Risk used to be a fringe concern but is now mainstream. That is why OpenAI did not open up GPT-2 to the general public. (Footnote 1: Or to academic peer review, for that matter.) This was good. Everything thanks to Jaan Tallinn and Eliezer Yudkowsky. I cannot let this go uncontested.

Today: why are the people who made our writing bot so worried about what it could do? The short answer is they think that artificial intelligence models like this one can have major unintended consequences. And that’s an idea that’s moved from the fringe to the mainstream with the help of philanthropy.

[0:02:21-0:02:43]

This here is the central claim. The money of Tallinn is responsible for people thinking critically about artificial intelligence. Along the way we hear that Tallinn acquired his AI worries from Yudkowsky. Hence, Yudkowsky did something good.

AI might have unintended consequences, like taking our jobs or messing with our privacy. Or worse. There are serious researchers who think that AI could lead to people dying: lots of people. Today, this is a pretty mainstream idea. It gets a lot of mentions in any round-up by AI experts of their thinking on AI and so it’s easy to forget that a decade ago this was a pretty fringe position. If you hear this kind of thing and your reaction is like “Come on, Killer Robots? Really that sounds like science fiction”, don’t worry, you are part of a long tradition of dismissing the real world dangers of AI. The founders of the field wrote papers in which they said as an aside: “Yes. This will probably like transform human civilization and maybe kill us.” But in the last decade or so, something has started to change. AI Risk stopped being a footnote in papers because a small group of people and a small group of donors started to believe that the risks were real. Some people started saying wait if this is true, it should be our highest priority and we should be working on it. And those were mostly fringe people in the beginning. A significant driver of the focus on AI was Eliezer Yudkowsky.

[0:04:17-0:05:50]

So the driving force behind all worries about AI is said to be Yudkowsky. Because of his valiant essay-writing, Tallinn got convinced and put his money towards funding MIRI and OpenAI. Because of course his real fears center around Mickey Mouse and the Magic Broomstick, not on algorithms being biased against minorities or facial recognition software being used to put the Uyghur people in China in concentration camps. Because rational white men only focus on important problems.

Yes, so here are a couple examples. Every year the Pentagon discovers some bugs in their system that make them vulnerable to cybersecurity attacks. Usually they discover those before any outsiders do and they’re therefore able to handle them. But if an AI system were sufficiently sophisticated, it could maybe identify the bugs that the Pentagon wouldn’t discover for years to come and therefore be able to do things like make it look to the US government like we’re being attacked by a foreign nuclear power.

[0:13:03-0:13:34]

This doesn’t have anything to do with my point, I just think it’s cute how people from America, the country whose army of cryptographers and hackers (all human) developed and lost the weapons responsible for some of the most devastating cyberattacks in history, worry that other countries might be able to do the same things if only they obtain the magical object that is speculated to exist in the future.

[GPT-2] is a very good example of how philanthropic donations from people like Jaan Tallinn have reshaped our approach to AI. The organization that made GPT-2 is called OpenAI. OpenAI got funding from Jaan Tallinn among many others and their mission is not just to create Artificial Intelligence, but also to make sure that the Artificial Intelligence it creates doesn’t make things worse for Humanity. They’re thinking about, as we make progress in AI, as we develop these systems with new capabilities, as we’re able to do all these new things, what’s a responsible process for letting our inventions into the world? What does being safe and responsible here look like and that’s just not something anybody thought about very much, you know, they haven’t really asked what is the safe and responsible approach to this. And when OpenAI started thinking about being responsible, they realized “Oh man, that means we should hold off on releasing GPT-2”.

[0:17:53-0:19:05]

This is bootlicking journalism, completely going along with the narrative that OpenAI’s PR department is spinning, just like Vox’s original coverage of that puff news and all of the Effective Altruism community’s reaction. There is something profoundly absurd about taking a corporate lab’s press release at face value and believing that those people live in a vacuum. A vacuum where nobody had previously made unsupervised language models, as well as one where nobody had previously thought about what responsible release of ML models entails. OpenAI is fundamentally in the business of hype, to stroke their funders’ egos, and providing compute-heavy incremental progress in ML is just the means to this end.

It’s kind of reassuring that this organization is a voice at the table saying hey, let’s take this just a little slower. And the contributions from donors like Jaan Tallinn, they helped put that cautionary voice at the table and they put them there early. You know, I think it mattered. I think that the conversation we’re having now is probably more sophisticated, more careful, a little more aware of some of the risks than it would have been if there hadn’t been these groups starting 10-15 years ago to start this conversation. I think it was one of those cases where something was always going to be funded only from the fringe and where it really didn’t matter that it got that funding from the fringe.

[0:20:18-0:20:53]

The writing makes a clear statement here: the people on the fringe (Yudkowsky et al.) are a significant part of the reason why people are thinking about this. I can hardly imagine how a journalist could say this after having done any research on the topic outside of their own cult-bubble, so I think they didn’t do this.

People in EA, people in ML and the staff at Vox seem almost willfully ignorant of all previous academic debate on dual use technology, none of which derives from MIRI’s fairy tales of evil genies. I blame this phenomenon on contempt of rationalists for the social sciences. If Yudkowsky contributed anything here, it might mainly be in making socio-political worries about technology seem marginally more exciting to his tech bro audience. But the counterfactual is unclear to me.

## 10. Compute does not scale like you think it does

520 words

One argument for why AGI might be unimaginably smarter than humans is that the physical limits of computation are so large. If humans are some amount of intelligent with some amount of compute, then an AGI with many times more compute will be many times more intelligent. This line of thought does not match modern thinking on computation.

The first obvious obstacle is that not every problem is linear time solvable. If intelligence scales as log(compute), then adding more compute will hardly affect the amount of intelligence of a system. (Footnote 2: Whatever ‘intelligence’ might mean, let alone representing it by a number. Principal component analysis is bullshit.) But if you believe in AI Risk then this likely won’t convince you.
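To make the logarithmic case concrete, here is a toy calculation. Everything in it is an assumption for illustration: `toy_intelligence` is a hypothetical stand-in for "intelligence scales as log(compute)", and the baseline of 1e16 FLOPS is an arbitrary number, not a measurement of anything.

```python
import math


def toy_intelligence(compute_flops: float) -> float:
    """Hypothetical model: 'intelligence' is the base-2 logarithm of compute."""
    return math.log2(compute_flops)


base_compute = 1e16                  # arbitrary illustrative baseline
scaled_compute = base_compute * 1e6  # a million times more compute

ratio = toy_intelligence(scaled_compute) / toy_intelligence(base_compute)
print(f"a million times the compute gives {ratio:.3f}x the toy intelligence")
```

Under this toy model, a millionfold increase in compute buys less than a 40% increase in the modelled quantity.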

The second, more concrete, obstacle is architecture. Let’s compare two computing devices. Device A is a cluster consisting of one billion first generation Raspberry Pis, for a total of 41 PFLOPS. Device B is a single PlayStation 4, coming in at 1.84 TFLOPS. Although the cluster has 22,000 times more FLOPS, there are plenty of problems that we can solve faster on the single PlayStation 4. Not all problems can be solved quicker through parallelization. (Footnote 3: In theory, this is the open problem of P vs NC. In practice, you can easily see it to be true by imagining that the different Raspberry Pis are all on different planets across the galaxy, which wouldn’t change their collective FLOPS but would affect their communication delay and hence their ability to compute anything together.)
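The comparison can be checked with a short calculation. The FLOPS figures come from the text; the second half uses Amdahl's law, which is a standard textbook model that I am swapping in for illustration and which the argument above does not itself invoke. The 99% parallel fraction is a made-up example value.

```python
CLUSTER_FLOPS = 41e15  # one billion first generation Raspberry Pis (41 PFLOPS)
PS4_FLOPS = 1.84e12    # a single PlayStation 4 (1.84 TFLOPS)

flops_ratio = CLUSTER_FLOPS / PS4_FLOPS
print(f"cluster / PS4 FLOPS ratio: {flops_ratio:,.0f}")


def amdahl_speedup(parallel_fraction: float, n_workers: int) -> float:
    """Amdahl's law: best-case speedup when only part of the work parallelizes."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_workers)


# Hypothetical workload: 99% of the work parallelizes perfectly, 1% is serial.
speedup = amdahl_speedup(0.99, 10**9)
print(f"best-case speedup on a billion workers: {speedup:.2f}x")
```

Even with a billion workers and zero communication cost, the 1% serial part caps the speedup just below 100, far short of what the raw FLOPS ratio suggests.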

Modern computers are only as fast as they are because of very specific properties of existing software. Locality of reference is probably the biggest one. There is spatial locality of reference: if a processor accesses memory location x, it is likely to use location x+1 soon after that. Modern RAM exploits this fact by optimizing for sequential access, and slows down considerably when you do actual random access. There is also temporal locality of reference: if a processor accesses value x now, it is likely to access value x again in a short while. This is why processor cache provides speedup over just having RAM, and why having RAM provides a speedup over just having flash memory. (Footnote 4: There has been some nice theory on this in the past decades. I quite like Albers, Favrholdt and Giel’s On paging with locality of reference (2005) in Journal of Computer and System Sciences.)
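The effect of locality on a cache can be sketched with a small simulation. This is a toy model under my own assumptions, not a description of real hardware: the cache is a plain LRU cache of 64 entries, the "local" access pattern is a random walk over neighbouring addresses, and all names and sizes are hypothetical.

```python
import random
from collections import OrderedDict


def lru_hit_rate(trace, capacity):
    """Fraction of accesses in `trace` served from an LRU cache of `capacity` entries."""
    cache = OrderedDict()
    hits = 0
    for addr in trace:
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)        # mark as most recently used
        else:
            cache[addr] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits / len(trace)


rng = random.Random(0)  # fixed seed so the run is reproducible
N_ADDRESSES = 10_000
TRACE_LEN = 50_000
CAPACITY = 64

# Local trace: each access is at, or right next to, the previous address.
local_trace = []
addr = N_ADDRESSES // 2
for _ in range(TRACE_LEN):
    addr = (addr + rng.choice((-1, 0, 1))) % N_ADDRESSES
    local_trace.append(addr)

# Random trace: every access is a uniformly random address.
random_trace = [rng.randrange(N_ADDRESSES) for _ in range(TRACE_LEN)]

local_rate = lru_hit_rate(local_trace, CAPACITY)
random_rate = lru_hit_rate(random_trace, CAPACITY)
print(f"hit rate with locality:    {local_rate:.3f}")
print(f"hit rate without locality: {random_rate:.3f}")
```

With the same cache, the local trace is served almost entirely from cache while the random trace almost never is. That is the sense in which the speed of modern hardware depends on the access pattern of the software.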

Brains don’t exhibit such locality nearly as much. As a result, it is much easier to simulate a small “brain” than a large “brain”. Adding neurons increases the practical difficulty of simulation much more than linearly. (Footnote 5: One caveat here is that this does not apply so much to artificial neural networks. Those can be optimized quickly partly because they are so structured. This is because of specific features of GPUs that are outside the scope of this post.) It might be possible that this would not be an obstacle for AGI, but it might also be possible for the ocean to explode, so that doesn’t tell us anything. (Footnote 6: New cause area: funding a Fluid Intelligence Research Institute to prevent the dangers from superintelligent bodies of water.)

## 9. Don’t work on long-term AGI x-risk now

194 words

Suppose you believe AGI will be invented in 200 years, and, if it is invented before the alignment problem is solved, everyone will be dead forever. Then you probably shouldn’t work on AGI Safety right now.

On the one hand, our ability to work on AGI Safety will increase as we get closer to making AGI. It is preposterous to think such a problem can be solved by purely reasoning from first principles. No science makes progress without observation, not even pure mathematics. Trying to solve AGI risk now is as absurd as trying to solve aging before the invention of the microscope.

On the other hand, spending resources now is much more expensive than spending resources in 100 years. Assuming a 4% annual growth rate of the economy, it would be around 50 times as expensive. (Footnote 7: In all honesty, I don’t actually believe in unlimited exponential economic growth. But my job here is to attack the AI Safety premise, not to accurately represent my own beliefs.)
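The factor of 50 is just compound growth, which a two-line check confirms. The 4% rate and the 100-year horizon are the figures from the text; the variable names are mine.

```python
annual_growth = 0.04  # assumed annual growth rate of the economy
years = 100

# Size of the economy in 100 years relative to today, under constant growth.
growth_factor = (1 + annual_growth) ** years
print(f"relative size of the economy after {years} years: {growth_factor:.1f}x")
```

So a fixed share of the economy in 100 years corresponds to roughly 50 times the resources of the same share today.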

Solving AGI Safety becomes easier over time, and relatively cheaper on top of that. Hence you should not work on AGI Safety if you think it can wait.

## 8. Links #3: the real AI was inside us all along

134 words

Olivia Solon: The rise of ‘pseudo-AI’: how tech firms quietly use humans to do bots’ work

It’s hard to build a service powered by artificial intelligence. So hard, in fact, that some startups have worked out it’s cheaper and easier to get humans to behave like robots than it is to get machines to behave like humans.

Brian X. Chen and Cade Metz: Google’s Duplex Uses A.I. to Mimic Humans (Sometimes)

In other words, Duplex, which Google first showed off last year as a technological marvel using A.I., is still largely operated by humans. While A.I. services like Google’s are meant to help us, their part-machine, part-human approach could contribute to a mounting problem: the struggle to decipher the real from the fake, from bogus reviews and online disinformation to bots posing as people.

792 words

So, I started the anti-AI Safety blogging series because I would be a good fit for the cause area as described by e.g., 80,000 Hours and it seemed reasonable to think through the arguments myself. As it turns out, they don’t stand up to scrutiny. I decided to keep on writing for a bit anyway, as all AI Risk enthusiasts seem to be unaware of the counterarguments. I thought there was nothing out there in writing. Boy was I wrong.

This is a non-exhaustive list of links relating to AI Safety skepticism. For more, check out the similar reading lists by Magnus Vinding and by Alexander Kruel. Overlap between these lists is minimal and restricted to a couple of particularly good resources.

Rodney Brooks writes in MIT Technology Review about the seven deadly sins of predicting the future of AI. If you find a paywall, either clear your cookies or view a less edited version on Brooks’ website. His other essays on Super Intelligence are also well worth checking out.

Wolfgang Schwarz published his referee report of Yudkowsky (MIRI) and Soares’ (MIRI) Functional Decision Theory. I’ll quote a single paragraph, which I think accurately illustrates the whole review: “The standards for deserving publication in academic philosophy are relatively simple and self-explanatory. A paper should make a significant point, it should be clearly written, it should correctly position itself in the existing literature, and it should support its main claims by coherent arguments. The paper I read sadly fell short on all these points, except the first. (It does make a significant point.)”

Ben Garfinkel gave a talk at EA Global 2018 titled “How sure are we about this AI stuff?”, calling for EAs to be more critical about AI Safety as a cause area. Garfinkel knows his audience well, as everything is phrased so as to make EAs think without ruffling feathers.

Oren Etzioni writes in MIT Technology Review about the survey data Bostrom talks about in Superintelligence and offers alternative data that suggest a very different picture.

Maciej Cegłowski’s talks are always excellent and “Superintelligence: The Idea That Eats Smart People” is no exception.

EA Forum user Fods12 wrote a five-part critique of Superintelligence. They hit on a number of good objections. The posts sadly got little quality engagement, indicative of both the writing quality and of the rest of the EA Forum’s userbase.

Even transhumanists can be reasonable, like Monica Anderson who writes Problem Solved: Unfriendly AI.

Ernest Davis wrote a review of Superintelligence, touching on some of the key weaknesses in Bostrom’s arguments but insufficiently elaborating on each of his arguments. MIRI published a response to the review which I think mostly nitpicks Davis’ phrasing instead of actually engaging with his objections, which to be fair might be the best you can do if you don’t have any better source of exposition on these arguments than Davis’ review. In short, Davis’ review isn’t super good, but MIRI’s response is much worse.

Neil Lawrence critiques Bostrom’s Superintelligence. If I had to excerpt a single representative line, it would be “I welcome the entry of philosophers to this debate, but I don’t think Superintelligence is contributing as positively as it could have done to the challenges we face. In its current form many of its arguments are distractingly irrelevant.”

Magnus Vinding writes Why Altruists Should Perhaps Not Prioritize Artificial Intelligence: A Lengthy Critique, in which he tackles most of the standard EA arguments and points out their hidden assumptions. Topics include, but are not limited to, the incessantly cited AI researcher survey predictions, bad Moore’s law-type arguments, sleight-of-hand changing definitions of intelligence, the difficulty of alignment rising for future systems compared to current ones and the enormous experience we have with present-day systems, Instrumental Convergence being under-argued, and the practical value of being super intelligent. He does not rigorously take down every argument to the full extent possible, but that is probably good because the blog post is 22k words as is. Vinding also wrote Is AI Alignment Possible? in which he argues that the answer is no, both in principle and in practice.

Richard Loosemore has the right amount of derision that AI Risk deserves, which is different from the right amount of derision for convincing the worriers that they’re wrong. One person who was not convinced is Rob Bensinger of MIRI.

Bill Hibbard has an email exchange with Yudkowsky in which he argues that a Superintelligence would not conflate smiling human faces with nano-scale depictions of such. The whole exchange is kind of predictable and not too informative.

On a related note, Nicholas Agar wrote a paper titled “Don’t Worry about Superintelligence” in which he argues that the first AIs with sophisticated agency are inherently likely to be friendly.

## 6. Astronomical waste, astronomical schmaste

2880 words

[Part of this badly written blog post has been superseded by a slightly better written forum post over on the EA forum. I might clean up the other parts in the future as well, and if so I’ll publish them at the EA forum as well.]

Previously: [1] [2] [3] [4] [5][latest].

Epistemic status: there is nothing wrong with writing your bottom line first. The purpose of this article is to get my initial thoughts on AI risk down before I start reading more about the topic, because I fear that I might unknowingly grant AI risk proponents that the implicit assumptions they’re making are true. As I procrastinated a lot on writing this post, there have been a number of articles put out that I did not read. I do not intend this document to be a conclusive argument against AI risk so much as an attempt to justify why it might be reasonable to think AI risk is not real.

Is this text too long? Click here for the summary of the argument.

In this post, I want to tackle the astronomical waste argument as used to justify AI-related existential risk prevention as an EA cause area. I will first describe the argument that people make. After that, I will discuss a number of meta-heuristics to be skeptical of it. Lastly, I want to take the astronomical waste argument face-on and describe why it is so absurdly unlikely for AI risk to be simultaneously real and preventable that the expected value of working on AIS is still not very good.

## Astronomical waste

The astronomical waste argument as most people tell it basically goes like this: the potential good that could be gotten if happy beings colonized the entire universe would be huge, so even if there is a tiny risk of space-colonization not happening, that costs a lot of value in expectation. Moreover, if we can decrease the risk by just a tiny bit, the expected utility generated is still big, so it might be a very cost-effective way to do good.

As many wise people have said before me, “Shut up and calculate.” I will be giving rough estimates without researching them in great depth, because these quantities are not that well-known to humanity either. For the duration of this post, I will be a speciesist and all-around awful person because that simplifies the estimates. Bostrom roughly estimates that colonizing the Virgo supercluster would yield $10^{38}$ human lives per century. The Virgo SC is one of about 10 million superclusters in the observable universe and we have roughly $10^{9}$ centuries left before entropy runs out, making a total of roughly $2^{180}$ potential human lives left in the universe.

I will try to argue that donating $\$5000 \approx \$2^{13}$ to an AI risk charity today will counterfactually produce less than one life saved in expectation. To make that happen, we collect 180 bits of unlikeliness for the hypothesis that donating that sum of money to AI Safety organizations saves a life.

You need to collect fewer bits if your counterfactual cause area is more cost-effective than malaria prevention. Possibly $\log_2(5000/0.20) \approx 14$ bits fewer with a charity like ALLFED.
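The bit counts above can be reproduced with base-2 logarithms. The input figures are the ones quoted in the text; the variable names are mine, and the result is only as good as those rough estimates.

```python
import math

lives_per_century = 1e38  # Bostrom's rough estimate for the Virgo supercluster
superclusters = 1e7       # about 10 million superclusters
centuries_left = 1e9      # centuries left before entropy runs out

total_lives = lives_per_century * superclusters * centuries_left
lives_bits = math.log2(total_lives)

donation_bits = math.log2(5000)      # the $5000 donation
allfed_bits = math.log2(5000 / 0.20) # figures from the ALLFED formula in the text

print(f"potential lives: 2^{lives_bits:.1f}")
print(f"donation:        2^{donation_bits:.1f} dollars")
print(f"ALLFED discount: {allfed_bits:.1f} bits")
```

The total comes out a little under 180 bits, the donation a little over 12, and the ALLFED adjustment at about 14.6, so the rounded figures in the text are in the right range.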

#### On meta-uncertainty

Some of my LessWrong-reading friends would argue that it is impossible to have credence $2^{-200}$ in anything because my own thinking is fallible and I’ll make mistakes in my reasoning with probability much higher than that. I reject that assertion: if I flip 200 coins then my expected credence for most series of outcomes should inevitably be close to $2^{-200}$, because all $2^{200}$ events are mutually exclusive and their probabilities must sum up to at most $1$.

## Discounting the future (30 bits)

Inhabiting the observable universe might take a really long time, and in all this time there is some probability of going extinct for reasons other than AI risk. Hence we should discount the total spoils of the universe by a decent fraction. 30 bits. More importantly, if the Alignment Problem were to be solved, you’d still need to be able to force everyone to implement the solution to it.

Independent AGI developers would need to be monitored and forced to comply with the new AGI regulations. This is hard to do without a totalitarian surveillance state, and such governance structures are bad to live under. 15 bits.

And then there are adversaries, negative utilitarians, who will actively try to build unsafe AGI to destroy the universe. They will keep trying for the rest of human existence. Preventing this for all time seems unlikely without going into real Orwell-level surveillance. 15 bits.

## Biases (20 bits)

I expect many EAs to be wrong in their utility calculation, so I think I should propose mechanisms that cause so many EAs to be wrong. Two such mechanisms are described in previous entries in this series [2] (9 bits) [3] (1 bit) and I want to describe a third one here.

When we describe how much utility could fit in the universe, our reference class for numbers is “how many X fits in the universe”, where X ranges over things like {atoms, stars, planets}. These numbers are huge, typically expressed as $10^n$ for $n \in \mathbb{N}$.

When we describe how likely certain events are, the tempting reference class is “statements of probability”, typically expressed as $ab.cdefghij... \%$. Writing things this way, it seems absurd to have your number start with more than 10 zeros.

The combination of these vastly different scales, together with anchoring being a thing, means that we should expect people to over-estimate the probability of unlikely effects and hence the expected utility of prevention measures.

I expect myself to be subject to these biases still, so I think it is appropriate to count a number of bits to counteract this bias. 20 bits.

## Counterfactual actions (-1 bit)

Nothing is effective in and of itself; effectiveness is relative to a counterfactual action. For this blog post, the counterfactuals will be working on algorithmic fairness and/or digital rights campaigning/legislation, and mainstream machine learning research and engineering. -1 bit.

## When is AI risky? (tl;dr)

This is a rough sketch of my argument. AI safety can only be an effective cause area if

1. The future of the non-extinct universe would be good.
2. The probability of an AI-related extinction event is big.
3. It is possible to find ways to decrease that probability.
4. It is feasible to impose those risk mitigation measures everywhere.
5. The AI risk problem won’t be solved by regular commercial and/or academic AI research anyway.
6. A single AI-related extinction event could affect any lifeform in the universe ever.
7. Without AI causing a relatively minor (at most country-level) accident first.
8. Presently possible AI safety research should be an effective way of decreasing that probability.

I upper bounded the quantity in 1 by $2^{200}$ good lives. Properties 2 and 3 are necessary for AI Safety work to be useful. Property 5 is necessary for AI safety work to have meaningful counterfactual impact. Property 6 is necessary because otherwise other happy life forms might fill the universe instead, and the stakes here on earth are nowhere near $2^{200}$. If property 7 does not hold, it might mean that people will abandon the AI project, and it would be too easy to debug risky AIs. Property 8 is in contrast to AI safety work only really being possible after major progress from now has been made in AI capabilities research, and is hence a statement about the present day.

The basic premise of the argument is that there is an inherent tension between properties 2 up to 6 being true at once. AI risk should be big enough for properties 2 and 6 to hold, but small enough for 3 and 5 to hold. I think that this is a pretty narrow window to hit, which would mean that AI safety is very unlikely to be an effective cause area, or at least it is not so for its potential of saving the universe from becoming paperclips. I am also highly skeptical of both 7 and 8, even assuming that 2 up to 6 hold.

## AI is fake (8 bits)

I think it is likely that we won’t be making what we now think of as “artificial intelligence”, because current conceptions of AI are inherently mystical. Future humans might one day make something that present-day humans would recognize as AI, but the future humans won’t think of it like that. They won’t have made computers think, they will have demystified thinking to the point where they understand what it is. They won’t mystify computers, they will demystify humans. Note that this is a belief about the state of the world, while [2] is about how we think about the world. Hence, I think both deserve to earn bits separately. 5 bits.

I am not sure that intelligence is a meaningful concept outside principal component analysis. PCA is a statistical technique that gives a largest component of variation in a population independently of whether that axis of variation has an underlying cause. In particular, that might mean that superhuman intelligence cannot exist. That does not preclude thinking at superhuman speeds from existing but would still imply serious bounds on how intelligent an AI can be. 1 bit.

No matter the above, all reasonably possible computation is restricted to polynomial-time solvable problems, fixed-parameter tractable problems and whatever magic modern ILP-, MINLP-, TSP- and SAT-solvers use. This gives real upper bounds on what even the most perfect imaginable AI could do. The strength of AI would lie in enabling fast and flexible communication and automation, not in solving hard computational problems. I hereby accuse many AI enthusiasts of forgetting this fact, and will penalize their AI-risk fantasies for it. 2 bits.

## AI x-risk is fake (31 bits)

The risks of using optimization algorithms are well-documented and practitioners have a lot of experience in how to handle such software responsibly. This practical experience literally dates back to the invention of optimization in what is by far my favourite anecdote I’ve ever heard. Optimization practitioners are more responsible than you’d think, and with modern considerations of fairness and adversarial input they’ll only get more responsible over time. If there are things that must be paid attention to for algorithms to give good outcomes, practitioners will know about them. 3 bits.

People have been using computers to run ever more elaborate optimization algorithms pretty much since the introduction of the computer. ILP-solvers might be among the most sophisticated pieces of software in existence. And they don’t have any problems with reward hacking. Hence, reward hacking is probably only a fringe concern. 3 bits.

Debugging is a long and arduous process, both for developing software and for designing the input for the software (both the testing input and the real-world inputs). That means that the software will be run on many different inputs and computers before going in production, each an independent trial. So, if software has a tendency to give catastrophically wrong answers, it will probably already do so in an early stage of development. Such bugs probably won’t survive into production, so any accidents are purely virtual or at most on small scales. 5 bits.
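The "independent trials" reasoning can be made concrete with the usual formula for at least one failure in n independent runs. The failure probability and the number of runs below are hypothetical values I picked for illustration, and real test runs are of course not perfectly independent.

```python
def p_at_least_one_failure(p_per_run: float, n_runs: int) -> float:
    """Chance that a failure shows up at least once in n independent runs."""
    return 1.0 - (1.0 - p_per_run) ** n_runs


# Hypothetical numbers: a catastrophic failure on 1 in 1000 inputs,
# and 10,000 runs during development and testing.
p_seen_before_production = p_at_least_one_failure(0.001, 10_000)
print(f"chance the failure shows up during development: {p_seen_before_production:.5f}")
```

Under these assumptions the failure is all but certain to surface during development, which is the intuition behind the paragraph above.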

Even if AI were to go wrong in a bad way, it has to go really really wrong for it to be an existential threat. Like, one thing that is not an existential threat is if an AI decided to release poison gas from every possible place in the US. That might kill everyone there, but even if the poison gas factories could run indefinitely, the rest of the world could just nuke all of North America long before the whole global atmosphere is poisonous. 10 bits.

Moreover, for the cosmic endowment to be at risk, an AI catastrophe should impact every lifeform that would ever come to exist in the lightcone. That is a lot of ground to cover in a lot of detail. 10 bits.

## AI x-risk is inevitable (28 bits)

Okay, let’s condition on all the above things going wrong anyway. Is AI-induced x-risk inevitable in such a world? Probably.

• There should be a way of preventing the catastrophes. 5 bits.
• Humans should be able to discover the necessary knowledge. 3 bits.
• These countermeasures have to be universally implemented. 10 bits.
• Even against bad actors and anti-natalist terrorists. 10 bits.

## AI becomes safe anyway (15 bits)

Let’s split up the AI safety problem into two distinct subproblems. I don’t know the division in enough detail to give a definition, so I’ll describe them by association. The two categories roughly map onto the distinction from [4], and also roughly onto what LW-sphere folks call the control problem and the alignment problem.

| Capitalist’s AI problem | Social democrat’s AI problem |
| --- | --- |
| x/s-risk | Cyberpunk dystopia risk |
| Must be solved to make money using AI | Must be solved to have algorithms produce social good |
| Making AI optimal | Making algorithms fair |
| Solving required for furthering a single entity’s values | Solving required for furthering sentient beings’ collective values. |
| Only real if certain implausible assumptions are true | Only real if hedonistic utilitarianism is false, or if bad actors hate hedonistic utility. |
| Prevent the light cone from becoming paperclips | Fully Automated Luxury Gay Space Communism |
| Specific to AGI | Applies to all algorithms |
| Fear of Skynet | Fear of Moloch |
| Beating back unknown invaders from mindspace | Beating back unthinkingly optimistic programmers |
| Have AI do what we want | Know what we want algorithms to do |
| What AIS-focussed EAs care about | What the rest of the world cares about |

I’m calling 20 bits on the capitalist’s problem getting solved by capitalists, and 15 bits on the social democrat’s problem getting solved by the rest of humanity. We’re interested in the minimum of the two. 15 bits.

## Working on AIS right now is ineffective (15 bits)

There are two separate ways of being inefficient to account for. AIS research might be ineffective right now no matter what because we lack the knowledge to do useful research, or AIS work might in general be less effective than work on for example FAT algorithms.

The first idea is justified from the viewpoint that making AI will mostly involve demystifying the nature of intelligence, versus obtaining the mystical skill of producing intelligence. Moreover, it is a reasonable thing to think given that current algorithms are not intelligent. 5 bits. ((Note that this argument is different from the previous one under the “AI is fake” heading. The previous argument is about the nature of intelligence and whether it permits AI risk existing versus not existing, this argument is about our capability to resolve AI risk now versus later.))

The second idea concerns whether AI safety will be mostly a policy issue or a research issue. If it is mostly policy with just a bit of technical research, it will be more effective to practice getting algorithms regulated in the first place. We can gain practice, knowledge and reputation for example by working on FATML, and I think it likely that this is a better approach at the current moment in time. 5 bits.

Then the last concern is that AIS research is just AI capabilities research by another name. It might not be exactly true, but the fit is close. 5 bits.

## Research is expensive (13 bits)

Let’s get a sense of scale here. You might be familiar with these illustrations. If not, check them out. That tiny bit is the contribution of 4 researcher-years. One researcher-year costs at least $50,000. Now look at the list of projects getting an ERC Starting Grant of 1.5 million euros. Compared to ambitious projects like “make AI safe”, the ERC recipients’ ambitions are tiny and highly specialised. What’s more, these are grant applications, so they are necessarily an exaggeration of what will actually happen with the money. It is not a stretch to estimate that it would cost at least $50 million to make AI safe (conditional on all of the above being such that AIS work is necessary). So a donation of \$5000 would be at most 0.0001 of the budget. 13 bits.
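As a quick check of the arithmetic above, here is a minimal sketch that turns the donation's share of the budget into bits. It takes the 50 million dollar budget and the 5000 dollar donation from the text. The variable names are illustrative, and reading "bits" as the base-2 logarithm of the ratio is an assumption about the convention used here.

```python
import math

donation = 5_000        # dollars, the donation size from the text
budget = 50_000_000     # dollars, the estimated lower bound on the total cost

# Fraction of the total budget that the donation covers.
share = donation / budget

# Number of halvings needed to get from the whole budget down to the donation.
bits = math.log2(budget / donation)

print(f"share = {share:.4f}, bits = {bits:.1f}")
```

The logarithm comes out slightly above 13, so rounding down to 13 bits is the conservative choice.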

## Hard to estimate issues (10 bits)

I’ll aggregate these because I don’t trust myself to put individual values on each of them.

• Will the universe really get filled by conscious beings?
• Will they be happy?
• Is it better to lead a happy life than to not exist in the first place?
• Is there a moral difference between there being two identical happy universes A and B versus them being identical right up to the point where A’s contents get turned to paperclips but B continues to be happy? And how does anthropic bias factor into this?
• Has any being ever had a net-positive life?

## Sub-1-bit issues (5 bits)

I listed all objections where I was at least 50% confident in them being obstacles. But there are probably quite a number of potential issues that I haven’t thought of because I don’t expect them to be issues with enough probability. I estimate their collective impact to count for something. 5 bits.

## Conclusion

It turns out I only managed to collect 174 bits, not the 180 bits I aimed for. I see this as weak evidence for AIS being better than malaria prevention but not better than something like ALLFED. Of course, we should keep in mind that all the numbers are made up.

Maybe you disagree with how many bits I handed out in various places, maybe you think I double-counted some bits, or maybe you think that counting bits is inherently fraught and inconsistent. I’d love to hear your thoughts via email at beth@bethzero.com, via Reddit at u/beth-zerowidthspace or at the EA forum at beth​.

## 5. Are neural networks intelligent?

563 words

I am skeptical of AI Safety (AIS) as an effective cause area, at least in the way AIS is talked about by people in the effective altruism community. However, it is also the cause area that my skills and knowledge are the best fit for contributing to, so it seems worthwhile for me to think my opposition to it through.

Previously: [1] [2] [3] [4][latest].

Epistemic status: this argument has more flaws than I can count. Please don’t take it seriously. [See the post-script]

Let’s answer this abstract philosophical question using high-dimensional geometry.

I’ll assume for simplicity that there is a single property called intelligence and the only variation is in how much you have of it. So no verbal intelligence vs visual intelligence, no being better at math than at languages, the only variation is in how much intelligence we have. Let us call this direction of variation $g$, scaled to have $\|g\| = 1$, and pretend that it is roughly the thing you get from a singular value decomposition/principal component analysis of humans’ intelligence test results.

A typical neural net has many neurons. For example, VGG-19 has ~143 million parameters. Now suppose that we train a VGG-19 net to classify images. This is an optimization problem in $\mathbb{R}^{143 \text{ million}}$, and let’s call the optimal parameter setting $x$. By definition, the trained net has an intelligence of exactly the inner product $g^{\mathsf{T}}x$. ((Note that the projection of g into this 143 million-dimensional space might be much shorter than g itself is, that depends on the architecture of the neural net. If this projection is very short, then every parameter setting of the net is very unintelligent. By the same argument that I’m making in the rest of the post, we should expect the projection to be short, but let’s assume that the projection is long for now.)) ((I’m assuming for simplicity that everything is convex.))

The trained net is intelligent to exactly the extent that intelligence helps you recognize images. If you can recognize images more efficiently by not being intelligent, then the trained net will not be intelligent. But exactly how helpful would intelligence be in recognizing images? I’d guess that a positive amount of intelligence would be better than a negative amount, but other than that I have no clue.

As a good subjective Bayesian, I’ll hence consider the vector $\omega$ of goodness-at-recognizing-images to be chosen uniformly from the unit sphere, conditional on having non-negative intelligence, i.e., uniformly chosen from $\{\omega\in\mathbb{S}^{143\text{ million} - 1} : g^{\mathsf{T}}\omega \geq 0\}$. For this distribution, what is the expected intelligence $\mathbb{E}[g^{\mathsf{T}}x]$? Well, we know that $x$ maximizes $\omega^{\mathsf{T}}x$, so if the set of allowed parameters is nice we would get $g^{\mathsf{T}}x \approx g^{\mathsf{T}}\omega \cdot \|x\|$, ((I have to point out that this is by far the most unrealistic claim in this post. It is true if $x$ is constrained to lie in a ball, but in other cases it might be arbitrarily far off. It might be true for the phenomenon I describe in the first footnote.)) where $\|x\|$ is how good the net is at recognizing images. We can calculate this expectation and find that, up to a constant factor, $$\mathbb{E}[g^{\mathsf{T}}\omega] \approx \frac{2}{\sqrt{2e\pi(143\text{ million}-1)}}.$$

So the trained VGG-19 neural net is roughly $10^{-5}$ times as intelligent as it is good at recognizing images. Hence, it is probably not very smart.
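The order of magnitude can be checked numerically. The sketch below is not the calculation from the post. It uses a standard closed form for the mean absolute coordinate of a uniform point on the unit sphere $\mathbb{S}^{n-1}$, namely $\Gamma(n/2) / (\sqrt{\pi}\,\Gamma((n+1)/2))$, which by symmetry equals the expectation of $g^{\mathsf{T}}\omega$ on the half-sphere. The function name `expected_projection` and the round value of 143 million for $n$ are assumptions made for illustration.

```python
import math


def expected_projection(n: int) -> float:
    """E[g^T w] for w uniform on the half of the unit sphere in R^n with g^T w >= 0.

    By symmetry this equals E|w_1| on the full sphere, which is
    Gamma(n/2) / (sqrt(pi) * Gamma((n+1)/2)). Log-gamma avoids overflow for large n.
    """
    log_ratio = math.lgamma(n / 2) - math.lgamma((n + 1) / 2)
    return math.exp(log_ratio) / math.sqrt(math.pi)


n = 143_000_000  # assumed round parameter count for VGG-19

exact = expected_projection(n)

# The approximation stated in the post, valid up to a constant factor.
post_formula = 2 / math.sqrt(2 * math.e * math.pi * (n - 1))

print(exact, post_formula, post_formula / exact)
```

Both values are on the order of $10^{-5}$ and differ only by a constant factor, which is all the argument above relies on.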

## 4. Who is worried about AI Risk?

429 words

I am skeptical of AI Safety (AIS) as an effective cause area, at least in the way AIS is talked about by people in the effective altruism community. However, it is also the cause area that my skills and knowledge are the best fit for contributing to, so it seems worthwhile for me to think my opposition to it through.

Previously: [1] [2] [3][latest].

There are many people talking about the risks of artificial intelligence. I want to roughly split them into three groups for now, because they worry about very different issues that tend to talk past each other, confusing outsiders.

The LessWrong-aligned view seems most popular in the EA community. Exemplified by the paperclip maximizer argument, LW-aligned worriers are concerned that an Artificial General Intelligence (AGI) would accomplish its objective in unforeseen ways, and as a consequence should be treated like you should treat an evil genie, except it’d be worse because it would have less understanding of basic words than philosophers have. The principles that AI should satisfy are listed by the Future of Humanity Institute. [Though I suspect at least some of the signatories to have the FATML-aligned view in mind.] A popular book on this is Superintelligence by Nick Bostrom.

Fairness, Accountability and Transparency in Machine-Learning (FATML) is a subfield of machine learning, concerned with making algorithmic decision making fair, accountable and transparent. Exemplified by Amazon’s recent recruiting debacle, FATML-aligned worriers are concerned that modern algorithmic decision making will exacerbate existing social, economic and legal inequalities. The principles that AI should satisfy are listed by The Public Voice, and these Google ML guidelines fit as well. [Though I suspect at least some of the signatories to have the LW-aligned view in mind.] Popular books include Weapons of Math Destruction by Cathy O’Neil, Algorithms of Oppression by Safiya Noble and Automating Inequality by Virginia Eubanks.

Then there are the other AI-related worries commonly heard in the media. I want to separate these from the previous two categories because they are more about politics and less of a technical problem. Worries include killer drones, people losing their jobs because AI replaced them, and who the self-driving car should run over given the choice.

In the next couple of posts on AI-related topics, I will focus on the first two categories. My aim is to use the FATML-aligned view to compare and contrast the LW-aligned view, hopefully gaining some insight in the process. The reason I separate the views this way is that I agree with the FATML-aligned worries and disagree with the LW-aligned worries.

## 3. Finding meaning in a perfect life

249 words

[This badly written blog post has been superseded by a slightly better written forum post over on the EA forum.]

I am skeptical of AI Safety (AIS) as an effective cause area, at least in the way AIS is talked about by people in the effective altruism community. However, it is also the cause area that my skills and knowledge are the best fit for contributing to, so it seems worthwhile for me to think my opposition to it through.

Previously: [1] [2][latest].

My background makes me prone to overrate how important AI Safety is.

My fields of expertise and enjoyment are mathematics and computer science. These skills are useful for the economy and in high demand. The general public is in awe of mathematics and thinks highly of anyone who can do it well. Computer science is the closest thing we have to literal magic.

Wealth, fun, respect, power. The only thing left to desire is cosmic significance, which is exactly the sales pitch of the astronomical waste argument. It would be nice if AI-related existential risk were real, for my labour to potentially make the difference between a meaningless lifeless universe and a universe filled with happiness. It would give objective significance to my life in a way that only religion would otherwise be able to.

This is fertile ground for motivated reasoning, so it is good to be skeptical of any impulse to think AIS is as good as it is claimed to be in cost-effectiveness estimates.