12. Why Kolmogorov complexity is useless

377 words

Mathematical hobbyists tend to be fascinated with information theory, Kolmogorov complexity and Solomonoff induction. This sentiment is very understandable. When I first learned of them, these subjects felt like they touch upon some fundamental truth of life that you don’t normally hear about. But for all that it is supposed to be a fundamental property of life and understanding, mathematicians treat Kolmogorov complexity as a cute mathematical curiosity at most. In this post I will explain some of the reasons why so few mathematicians and computer scientists have cared about it over the past 50 years.

The zeroth reason is that it depends on your choice of encoding. You cannot cover this up by saying that any Turing machine can simulate any other with constant overhead, because a 2000-bit difference is not something you can compensate for in data-constrained settings.
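
As a loose analogy (off-the-shelf compressors are not universal Turing machines, so this only illustrates the flavor of the problem), the “description length” of a short string already depends heavily on which reference scheme you pick:

```python
# Compressed size under different compressors as a stand-in for description
# length under different reference machines. The per-scheme overhead is "just
# a constant", but for short inputs that constant is the whole story.
import bz2
import lzma
import zlib

data = b"the quick brown fox jumps over the lazy dog"  # 43 bytes

for name, compress in [("zlib", zlib.compress),
                       ("bz2", bz2.compress),
                       ("lzma", lzma.compress)]:
    print(f"{name:>4}: {len(compress(data))} bytes")

# On a typical setup the three results differ by tens of bytes, i.e. by more
# than the length of the input itself.
```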

The first reason is obvious and dates back to ancient times. Kolmogorov complexity is not computable. Properties that we can literally never know are pretty useless in everyday life.
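
For intuition, here is the standard Berry-paradox argument, sketched as code. The function `K` below is hypothetical; the whole point is that no such computable function can exist.

```python
# Hypothetical premise: K(s) is a computable function returning the Kolmogorov
# complexity of the bit string s. This program then derives a contradiction.
from itertools import count, product

def first_string_with_complexity_above(n: int) -> str:
    """Return the first bit string s (shortest first) with K(s) > n."""
    for length in count(1):
        for bits in product("01", repeat=length):
            s = "".join(bits)
            if K(s) > n:  # K is the assumed oracle; it does not really exist
                return s

# This program together with the literal value of n is a description of its
# output of length about log2(n) + constant, which is far below n for large n.
# But the output was chosen to have complexity greater than n. Contradiction,
# so K cannot be computable.
```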

The second reason is related: we cannot compute Kolmogorov complexity in practice. Even time-constrained variants are hellishly expensive to compute for large data sets.
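
To see why, here is what the naive resource-bounded estimator looks like; `run_with_limit` is a hypothetical interpreter for whatever reference machine you chose, and the search space is 2^L candidate programs per length L.

```python
from itertools import product

def bounded_K_estimate(target: bytes, max_len: int, step_limit: int):
    """Length of the shortest program (as a bit string) that prints `target`
    within `step_limit` steps, found by exhaustive search. Purely illustrative."""
    for length in range(1, max_len + 1):
        # 2**length candidate programs at this length alone.
        for bits in product((0, 1), repeat=length):
            program = bytes(bits)
            # run_with_limit is a hypothetical step-bounded interpreter.
            if run_with_limit(program, step_limit) == target:
                return length
    return None

# Even a modest max_len of 64 already means on the order of 2**65 candidate
# runs, each up to step_limit steps. For real data sets, forget it.
```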

The third reason is more typical of modern thinking in computer science theory: any theory of information needs a theory of computation to be useful in practice. This is directly related to the difference between computational and statistical indistinguishability, as well as to the myth that your computer’s entropy pool could run out. Cryptography is safe not because it is information-theoretically impossible to retrieve the plaintext, but because it is computationally infeasible to retrieve the plaintext. The Kolmogorov complexity of a typical encrypted data stream is low, but it would be a mistake to think that anyone could compute a short description. Along another route: once I have told you an instance of an NP-complete problem (with a unique solution), telling you the answer adds no information in the Kolmogorov sense. Yet you would still learn something by getting the answer from me, because you could not have computed it yourself even knowing all the requisite information.
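
A minimal sketch of the cryptography point, using SHA-256 in counter mode as a stand-in for a real stream cipher (the seed value and block count are arbitrary): the entire output stream is determined by a 16-byte seed plus a few lines of code, so its Kolmogorov complexity is tiny, yet no known efficient algorithm can recover that short description from the bytes alone.

```python
import hashlib

SEED = b"0123456789abcdef"  # 16-byte secret; any value works for the demo

def pseudorandom_stream(seed: bytes, n_blocks: int):
    """Yield n_blocks blocks of 32 bytes, generated from a short seed by
    hashing seed||counter (SHA-256 in counter mode)."""
    for counter in range(n_blocks):
        yield hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()

# A gigabyte of this output has Kolmogorov complexity of roughly
# |seed| + |this program|, i.e. a couple hundred bytes, but computing any
# short description from the stream itself is computationally infeasible.
demo = b"".join(pseudorandom_stream(SEED, 4))  # small demo: 128 bytes
print(demo.hex())
```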

Kolmogorov complexity is useless by the standards of classical CS theory, of practice, and of modern CS theory. This is how you know that anyone who proposes that it is an integral part of rational thought is full of shit.

11. Did MIRI cause a good thing to happen?

1134 words

The Future Perfect podcast from Vox did an episode proclaiming that AI Risk used to be a fringe concern but is now mainstream. That is why OpenAI did not open up GPT-2 to the general public (or to academic peer review, for that matter). This was good. Everything thanks to Jaan Tallinn and Eliezer Yudkowsky. I cannot let this go uncontested.

Today: why are the people who made our writing bot so worried about what it could do? The short answer is they think that artificial intelligence models like this one can have major unintended consequences. And that’s an idea that’s moved from the fringe to the mainstream with the help of philanthropy.

[0:02:21-0:02:43]

This is the central claim: Tallinn’s money is responsible for people thinking critically about artificial intelligence. Along the way we hear that Tallinn acquired his AI worries from Yudkowsky. Hence, Yudkowsky did something good.

AI might have unintended consequences, like taking our jobs or messing with our privacy. Or worse. There are serious researchers who think the AI could lead to people dying: lots of people. Today, this is a pretty mainstream idea. It gets a lot of mentions in any round-up by AI experts of their thinking on AI, and so it’s easy to forget that a decade ago this was a pretty fringe position. If you hear this kind of thing and your reaction is like “Come on, killer robots? Really? That sounds like science fiction”, don’t worry, you are part of a long tradition of dismissing the real-world dangers of AI. The founders of the field wrote papers in which they said, as an aside, “Yes, this will probably like transform human civilization and maybe kill us.” But in the last decade or so, something has started to change. AI Risk stopped being a footnote in papers because a small group of people and a small group of donors started to believe that the risks were real. Some people started saying: wait, if this is true, it should be our highest priority and we should be working on it. And those were mostly fringe people in the beginning. A significant driver of the focus on AI was Eliezer Yudkowsky.

[0:04:17-0:05:50]

So the driving force behind all worries about AI is said to be Yudkowsky. Because of his valiant essay-writing, Tallinn got convinced and put his money towards funding MIRI and OpenAI. Because of course his real fears center around Mickey Mouse and the Magic Broomstick, not on algorithms being biased against minorities or facial recognition software being used to put the Uyghur people in China in concentration camps. Because rational white men only focus on important problems.

Yes, so here are a couple of examples: every year the Pentagon discovers some bugs in their systems that make them vulnerable to cybersecurity attacks. Usually they discover those before any outsiders do and are therefore able to handle them. But if an AI system were sufficiently sophisticated, it could maybe identify the bugs that the Pentagon wouldn’t discover for years to come, and therefore be able to do things like make it look to the US government like we’re being attacked by a foreign nuclear power.

[0:13:03-0:13:34]

This doesn’t have anything to do with my point, I just think it’s cute how people from America, the country whose army of cryptographers and hackers (all human) developed and lost the weapons responsible for some of the most devastating cyberattacks in history, worry that other countries might be able to do the same things if only they obtain the magical object that is speculated to exist in the future.

[GPT-2] is a very good example of how philanthropic donations from people like Jaan Tallinn have reshaped our approach to AI. The organization that made GPT-2 is called OpenAI. OpenAI got funding from Jaan Tallinn among many others, and their mission is not just to create Artificial Intelligence, but also to make sure that the Artificial Intelligence it creates doesn’t make things worse for Humanity. They’re thinking about: as we make progress in AI, as we develop these systems with new capabilities, as we’re able to do all these new things, what’s a responsible process for letting our inventions into the world? What does being safe and responsible here look like? And that’s just not something anybody thought about very much, you know, they haven’t really asked what is the safe and responsible approach to this. And when OpenAI started thinking about being responsible, they realized “Oh man, that means we should hold off on releasing GPT-2”.

[0:17:53-0:19:05]

This is bootlicking journalism, completely going along with the narrative that OpenAI’s PR department is spinning, just like Vox’s original puff-piece coverage of the news and all of the Effective Altruism community’s reaction to it. There is something profoundly absurd about taking a corporate lab’s press release at face value and believing that those people live in a vacuum: a vacuum where nobody had previously made unsupervised language models, and one where nobody had previously thought about what responsible release of ML models entails. OpenAI is fundamentally in the business of hype, meant to stroke their funders’ egos, and providing compute-heavy incremental progress in ML is just the means to this end.

It’s kind of reassuring that this organization is a voice at the table saying hey, let’s take this just a little slower. And the contributions from donors like Jaan Tallinn, they helped to put that cautionary voice at the table and they put it there early. You know, I think it mattered. I think that the conversation we’re having now is probably more sophisticated, more careful, a little more aware of some of the risks than it would have been if there hadn’t been these groups starting 10-15 years ago to start this conversation. I think it was one of those cases where something was only ever going to be funded from the fringe, and where it really did matter that it got that funding from the fringe.

[0:20:18-0:20:53]

The writing makes a clear statement here: the people on the fringe (Yudkowsky et al.) are a significant part of the reason why people are thinking about this. I can hardly imagine how a journalist could say this after having done any research on the topic outside of their own cult bubble, so I think they didn’t do any.

People in EA, people in ML and the staff at Vox seem almost willfully ignorant of all previous academic debate on dual-use technology, none of which derives from MIRI’s fairy tales of evil genies. I blame this phenomenon on the rationalists’ contempt for the social sciences. If Yudkowsky contributed anything here, it might mainly be in making socio-political worries about technology seem marginally more exciting to his tech-bro audience. But the counterfactual is unclear to me.

10. Compute does not scale like you think it does

520 words

One argument for why AGI might be unimaginably smarter than humans is that the physical limits of computation are so large. If humans achieve some amount of intelligence with some amount of compute, then an AGI with many times more compute would be many times more intelligent. This line of thought does not match modern thinking on computation.

The first obvious obstacle is that not every problem is solvable in linear time. If intelligence scales as log(compute), then adding more compute will hardly affect the intelligence of a system (whatever ‘intelligence’ might mean, let alone representing it by a single number; principal component analysis is bullshit). But if you believe in AI Risk then this likely won’t convince you.
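
Back-of-the-envelope, assuming for the sake of argument that ‘intelligence’ really were a number scaling as log of compute:

```python
import math

# If intelligence ~ log2(compute), even enormous compute multipliers buy
# only a handful of extra "doublings" of whatever the log is measuring.
for factor in (10, 1_000, 1_000_000, 10**12):
    print(f"{factor:>16,}x compute -> +{math.log2(factor):.1f} doublings")
```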

The second, more concrete, obstacle is architecture. Let’s compare two computing devices. Device A is a cluster consisting of one billion first-generation Raspberry Pis, for a total of 41 PFLOPS. Device B is a single PlayStation 4, coming in at 1.84 TFLOPS. Although the cluster has 22,000 times more FLOPS, there are plenty of problems that we can solve faster on the single PlayStation 4. Not all problems can be solved more quickly through parallelization. (In theory, this is the open problem of P vs. NC. In practice, you can easily see it to be true by imagining that the different Raspberry Pis are all on different planets across the galaxy, which wouldn’t change their collective FLOPS but would affect their communication delay and hence their ability to compute anything together.)
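
Here is a toy example of the kind of computation the cluster cannot speed up (assuming, as hash-chain constructions do, that there is no shortcut for iterated hashing): every step needs the previous step’s output, so the work is inherently sequential and only single-device speed matters.

```python
import hashlib

def hash_chain(seed: bytes, steps: int) -> bytes:
    """Iterate SHA-256 `steps` times. Step i cannot start before step i-1 has
    finished, so a billion extra machines do not help at all."""
    value = seed
    for _ in range(steps):
        value = hashlib.sha256(value).digest()
    return value

print(hash_chain(b"hello", 1_000_000).hex())
```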

Modern computers are only as fast as they are because of very specific properties of existing software. Locality of reference is probably the biggest one. There is spatial locality of reference: if a processor accesses memory location x, it is likely to use location x+1 soon after that. Modern RAM exploits this fact by optimizing for sequential access, and slows down considerably when you do actual random access. There is also temporal locality of reference: if a processor accesses value x now, it is likely to access value x again in a short while. This is why processor cache provides a speedup over just having RAM, and why having RAM provides a speedup over just having flash memory. (There has been some nice theory on this in the past decades; I quite like Albers, Favrholdt and Giel’s “On paging with locality of reference”, Journal of Computer and System Sciences, 2005.)
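
You can see the effect for yourself with a crude benchmark (exact numbers depend on the machine, and Python’s interpreter overhead mutes the gap compared to what you would measure in C): both loops below touch exactly the same data, but the shuffled order defeats caching and prefetching.

```python
import random
import time

N = 10_000_000
data = list(range(N))
sequential = list(range(N))
shuffled = sequential[:]
random.shuffle(shuffled)

def total(indices):
    # Sum data[i] over the given index order; only the access pattern differs.
    s = 0
    for i in indices:
        s += data[i]
    return s

for name, order in (("sequential", sequential), ("shuffled", shuffled)):
    start = time.perf_counter()
    total(order)
    print(f"{name:>10}: {time.perf_counter() - start:.2f} s")
```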

Brains don’t exhibit such locality nearly as much. As a result, it is much easier to simulate a small “brain” than a large “brain”: adding neurons increases the practical difficulty of simulation much more than linearly. (One caveat here is that this does not apply so much to artificial neural networks, which can be optimized quickly partly because they are so structured; this is due to specific features of GPUs that are outside the scope of this post.) It might be possible that this would not be an obstacle for AGI, but it might also be possible for the ocean to explode, so that doesn’t tell us anything. (New cause area: funding a Fluid Intelligence Research Institute to prevent the dangers from superintelligent bodies of water.)

9. Ovens have secret built-in automatic timers

269 words

Every oven I’ve ever used has had a secret function, a mechanism that automatically tells you when the food is ready. It is wonderful and I want to tell you about it.

So most ovens control their temperature using a bimetallic strip. When the temperature inside is below the target temperature, the strip closes a circuit that activates the heating. As soon as the temperature is high enough, the strip will have deformed enough to open the circuit and stop the heating. In many ovens, especially older ones, you can hear this as a soft *click*. If you are lucky, the mechanism is sensitive enough to rapidly switch on and off to stay at temperature, at least for a couple of seconds.

If you eat frozen pizza, it often only has to be heated to a sufficient temperature. When it reaches this temperature, the pizza stops cooling down the air around it, thereby allowing the oven to reach its target temperature and start saying *click*. So the sound tells you when the food is ready; no need to read the packaging to find the correct baking time.

The same happens for dishes that are ready when enough water has evaporated, or when a certain endothermic chemical reaction has stopped happening. All are done the moment the oven says *click*. There might be some exceptions to this phenomenon, but I have yet to run into one. Which is great, because I always forget to read the oven instructions on packaging or in recipes before throwing them out. Try it out with your own electrically powered food heating units.