Typed languages, units.

I recently picked up programming again. I used to do it a lot before I went to university, but the constant mind-numbing programming assignments quickly put me off of programming. Apart from the occasional quick bug fix for software I use myself, I haven’t done any serious coding for years.

Until recently, when I needed something coded up during my research. I decided to learn Python, and I like it. It is easy to use, the libraries are extensive and user-friendly, and IPython is a useful tool. There is just one thing that draws my ire: the weak type system. Studying math has given me an appreciation for type checking that is even stricter than what most languages offer.

An example: my length in centimeters plus the outside temperature in °C right now equals 180. This calculation makes no sense, because the units don’t match: you can’t add centimeters to degrees Celsius. But then there’s Python, which just lets you do that.

In [1]: length = 170

In [2]: temperature = 10

In [3]: length + temperature
Out[3]: 180

Most bugs that stem from typos are of this sort. Those bugs are possible because the type system is too weak. If you have two loops, one iterating over i and one over j, basic unit-based type checking would probably flag any instance of i in a place where you should have typed j instead. If you intend to query A[i][j] then it should be possible to let i have row-index type and j have column-index type, making A[j][i] raise a type error.
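Here is a minimal sketch of what I mean, done at run time because Python won't do it for me. The names Quantity, RowIndex, ColIndex and entry are made up for this post and are not part of any library; a real units package would do far more than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    """A number tagged with a unit (hypothetical helper for illustration)."""
    value: float
    unit: str

    def __add__(self, other):
        # Only quantities that carry the same unit may be added.
        if not isinstance(other, Quantity):
            raise TypeError(f"cannot add a bare {type(other).__name__} to {self.unit}")
        if other.unit != self.unit:
            raise TypeError(f"cannot add {other.unit} to {self.unit}")
        return Quantity(self.value + other.value, self.unit)


class RowIndex(int):
    """Marker type for row indices (hypothetical)."""


class ColIndex(int):
    """Marker type for column indices (hypothetical)."""


def entry(A, i, j):
    """Return A[i][j], insisting that i is a RowIndex and j is a ColIndex."""
    if not isinstance(i, RowIndex) or not isinstance(j, ColIndex):
        raise TypeError("expected entry(A, RowIndex, ColIndex)")
    return A[i][j]


length = Quantity(170, "cm")
temperature = Quantity(10, "°C")

try:
    length + temperature
except TypeError as err:
    print("type error:", err)

A = [[3, 4], [5, 6]]
i, j = RowIndex(0), ColIndex(1)
print(entry(A, i, j))

try:
    entry(A, j, i)  # indices swapped by accident
except TypeError as err:
    print("type error:", err)
```

The checks only fire when the offending line actually runs, which is exactly the limitation I am complaining about; a static type system would catch it before that.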

Another example: Let A \in \mathbb{R}^{n \times n}, x \in \mathbb{R}^n, and we’re interested in the quantity Ax \in \mathbb{R}^n. If you’re like me and you can’t remember what rows and what columns are, then that doesn’t have to impact your ability to symbolically do linear algebra: the quantities xA = A^{\mathsf{T}}x, Ax^{\mathsf{T}} and A^{-1} x don’t “compile”, so any mathematician that reads it will know you typo-ed if you wrote one of those latter expressions. All operations might be matrices acting on vectors, but the matrices A^{-1} and A^{\mathsf{T}} fundamentally take input from different copies of \mathbb{R}^n than the ones that x and x^{\mathsf{T}} live in. That is why matrix operations make sense even if the matrices aren’t square or symmetric: there is only one way to make sense of any operation. Even if you write it wrong in a proof, most people can see what the typo is. But then there’s Python.

In [4]: import numpy as np 

In [5]: x = np.array([1,2])

In [6]: A = np.array([[3,4],[5,6]])

In [7]: np.dot(A,x)
Out[7]: array([11, 17])

In [8]: np.dot(A,np.transpose(x))
Out[8]: array([11, 17])

In [9]: np.dot(x,A)
Out[9]: array([13, 16])

I am like me and I can’t remember what rows and what columns are. I would like the interpreter to tell me the correct way of doing my linear algebra. At least one of the above matrix-vector-products should throw a type error. Considering the history of type systems, it is not surprising that the first languages didn’t introduce unit-based types. Nonetheless, it is a complete mystery to me why modern languages don’t type this way.
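To make the wish concrete, here is a rough pure-Python sketch of vectors that know whether they are rows or columns. ColumnVector, RowVector and Matrix are names I invented for this example; this is not how numpy works, and it only supports the two products discussed above.

```python
def _dot(u, v):
    """Inner product of two equally long lists of numbers."""
    if len(u) != len(v):
        raise ValueError(f"dimension mismatch: {len(u)} vs {len(v)}")
    return sum(a * b for a, b in zip(u, v))


class ColumnVector:
    """Column vector; deliberately has no @ of its own (hypothetical type)."""

    def __init__(self, entries):
        self.entries = list(entries)


class RowVector:
    """Row vector; may only multiply a matrix from the left (hypothetical type)."""

    def __init__(self, entries):
        self.entries = list(entries)

    def __matmul__(self, other):
        if isinstance(other, Matrix):
            columns = zip(*other.rows)
            return RowVector(_dot(self.entries, list(col)) for col in columns)
        return NotImplemented


class Matrix:
    """Matrix stored as a list of rows; may only act on column vectors."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    def __matmul__(self, other):
        if isinstance(other, ColumnVector):
            return ColumnVector(_dot(row, other.entries) for row in self.rows)
        return NotImplemented


A = Matrix([[3, 4], [5, 6]])
x = ColumnVector([1, 2])

print((A @ x).entries)                  # [11, 17]
print((RowVector([1, 2]) @ A).entries)  # [13, 16]

try:
    x @ A  # a column vector on the left of a matrix makes no sense
except TypeError as err:
    print("type error:", err)
```

Returning NotImplemented for every unsupported pairing lets Python itself raise the TypeError, so the nonsensical products fail instead of silently producing numbers.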

Tumblr RSS under GDPR

When the GDPR became effective, Tumblr decided to break its RSS feeds for all EU residents. When you try to fetch https://username.tumblr.com/rss, they’ll serve their GDPR wall instead of your requested file. You can only grab the feed if you possess the correct cookies.

Anyway, here is a hacky fix for your RSS feeds. I’m assuming you possess an http server yourself. I use a Raspberry Pi with lighttpd and selfoss. I’m assuming your user on the server is called beth, you want to follow the user called username on Tumblr and your document root is /home/beth/public.

Create the folder /home/beth/public/rss. Create the file /home/beth/.bin/fetchfeeds.sh with the following contents. Duplicate the last line once for every user you want to follow, and adjust all three occurrences of username to fit.


#!/bin/bash
curl --header 'Host: username.tumblr.com' --header 'User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/71.0.3578.80 Chrome/71.0.3578.80 Safari/537.36' --header 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8' --header 'Accept-Language: nl,en-GB;q=0.9,en;q=0.8' --header 'Cookie: rxx=1jo2mpfhsia.1c2ecvc5&v=1; pfg=1bc46aba34ffeb83e2ef0859447d282cf8a8a2a9f95200a2a705f3afebfe9bef%23%7B%22eu_resident%22%3A1%2C%22gdpr_is_acceptable_age%22%3A1%2C%22gdpr_consent_core%22%3A1%2C%22gdpr_consent_first_party_ads%22%3A1%2C%22gdpr_consent_third_party_ads%22%3A1%2C%22gdpr_consent_search_history%22%3A1%2C%22exp%22%3A1576795974%2C%22vc%22%3A%22%22%7D%237343812268; tmgioct=5c1acbc67c0f570418402840' --header 'Connection: keep-alive' 'https://username.tumblr.com/rss' -o '/home/beth/public/rss/username-tumblr.rss' -L

[The curl command was produced using the CurlWget browser extension.]
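If you would rather not duplicate a long line per blog, the same request can probably be made from Python's standard library. This is an untested-against-Tumblr sketch: build_feed_request and fetch_feed are names I made up, the shortened headers are an assumption about what the server needs, and the cookie string has to be copied from your own browser.

```python
import os
import urllib.request


def build_feed_request(username, cookie):
    """Build (but do not send) a request for one Tumblr RSS feed."""
    url = f"https://{username}.tumblr.com/rss"
    headers = {
        # Assumption: a browser-like User-Agent plus the consent cookie suffices.
        "User-Agent": "Mozilla/5.0 (X11; Linux x86_64)",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Cookie": cookie,
    }
    return urllib.request.Request(url, headers=headers)


def fetch_feed(username, cookie, out_dir="/home/beth/public/rss"):
    """Download one feed and save it as <username>-tumblr.rss in out_dir."""
    request = build_feed_request(username, cookie)
    with urllib.request.urlopen(request) as response:
        data = response.read()
    path = os.path.join(out_dir, f"{username}-tumblr.rss")
    with open(path, "wb") as feed_file:
        feed_file.write(data)
    return path


# Nothing is sent over the network here; this only shows the target URL.
example = build_feed_request("username", "paste-your-cookie-header-here")
print(example.full_url)

# To fetch several blogs, loop over the names (not run here):
#     for name in ["username", "otheruser"]:
#         fetch_feed(name, cookie)
```

urlopen follows redirects by default, which plays the role of curl's -L flag.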

Don’t forget to fix the permissions: chmod +x ~/.bin/fetchfeeds.sh. Now put this in your cron table by using the command crontab -e: 0 * * * * ~/.bin/fetchfeeds.sh. This will execute the bash file in the first minute of every hour.

Lastly, put http://localhost/rss/username-tumblr.rss in your RSS reader.

Does blockchain make any sense?

Context: Buterin on non-financial applications of blockchain, 15 tweets. I’ll assume you have read it.

I am deeply convinced that blockchains have no use cases. Trust is provided in the real world by the option to sue people. Trust in most blockchains is misplaced because miner pools are too big and the group of dictators of any coin is small, not accountable and wrongly incentivised. Timestamping data works just as well without using a distributed blockchain. Immutability is provided by standard crypto. The oracle problem is a real fuckin’ serious problem that is only ever resolved through the force of law.

Every function of distributed blockchains can be provided by combining digital signatures, cryptographic commitment schemes, public key cryptography, hashing all data and either periodic or on-demand polling.
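As an illustration of the "hashing all data" ingredient, here is a minimal sketch of a tamper-evident log built from nothing but SHA-256. The function names and the record layout are my own invention. It covers only the hash chain: signing the latest hash and letting others poll it are assumed to happen elsewhere, since Python's standard library has no signature scheme.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, payload):
    """Hash a payload together with the hash of the entry before it."""
    record = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(record.encode("utf-8")).hexdigest()


def append(log, payload):
    """Append a payload to the log, chaining it to the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({
        "payload": payload,
        "prev": prev_hash,
        "hash": entry_hash(prev_hash, payload),
    })


def verify(log):
    """Return True if no entry has been altered, removed or reordered."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append(log, "2018-12-19: contract signed")
append(log, "2018-12-20: payment received")
append(log, "2018-12-21: goods shipped")
print(verify(log))  # True

log[1]["payload"] = "2018-12-20: payment never received"
print(verify(log))  # False
```

Anyone who remembers the most recent hash can detect later edits to the history. Note that truncating the end of the log is only detectable by someone who saw a later hash, which is where the polling comes in.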

So every time Tyler Cowen of Marginal Revolution posts a link to Buterin’s Twitter feed, my confidence in other people’s assertions dies a little bit more.

I’ll take the tweets linked up top in order.

2-5. I think he means “Cryptography allows you to encrypt data, prove data was signed by someone, etc etc… blockchains OTOH allow you to prove that a piece of data was *not* published [according to protocol]”, because the original is obviously false (my daily newspaper published things but is not on any blockchain). The adapted statement is also false, because SSL is based on the very notion that certificates can be revoked and that you can check that a certificate hasn’t been revoked.

6. Not just SSL, GnuPG does this very same thing as well.

7-8. Blockchains aren’t credibly neutral: every miner on a blockchain has money in it, skewing their incentives in certain directions. Trust only goes as far as you can pay. Cryptographic hashes and signatures can make every database trusted if you can sue its proprietor.

9-11. This might be true, but it is pretty much impossible that performance will ever match that of standard crypto, and Buterin conveniently manages to forget the additional cost in programmer-hours. Technological cost is human cost, of the kind which is in most limited supply.

12. This tweet illuminates a lot. Buterin lives in countries without proper online banking and moreover hates privacy of any kind.

13-14. What does this even mean. Anything can and will get hacked. Your hard drive will stop working one day, and your backup won’t work.

15. Smaller stakes \implies smaller mining rewards \implies fewer miners \implies easier to attack. Also, applications breaking incurs a huge social cost; some people are still distrustful of others from when Google Reader (RIP) was cancelled.

I know a lot of computer scientists at my institute, including people who do research into applications of blockchains. Nobody I know believes blockchains are useful apart from black markets, tax evasion, as a novel pyramid scheme, or as a hype for getting grant money.

I get why Cowen falls for Buterin’s sweet talk: he is an economist with no knowledge of cryptography. I don’t get why more people who know anything about cryptography aren’t speaking out about the insanity of blockchain. My best guess is that everyone who knows stuff about it is either using it to make money themselves, pretending to be enthusiastic to further the pyramid scheme, or profiling themselves as “blockchain expert” and charging high consultancy fees.

Section of doubt

I hope I’m completely wrong and I just misunderstood every statement every blockchain fanatic has ever made. My constant rejection of all their statements does make me question myself, because it seems strange that so many people can be so consistently wrong about a thing. I’d love to show epistemic humility here, but I don’t consider it to be justified.

Optimal diffusion rate in social media

In some social media it is too easy to share content. That makes them a breeding ground for fake news and encourages a culture of shallow or vile discussion. For other media, like traditional print and online publishing, it might be too hard to share content. Those need recommendation or search engines, or at least book clubs and advertising.

Is there an optimal rate of knowledge diffusion? Is it different from the optimal rate of publishing new content? Imagine Twitter, except you’re only allowed one tweet every 10 minutes, one retweet every 30 minutes, and one reply every 5 minutes.
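The imagined rule is simple enough to write down. Below is a toy sketch of such a limiter; PostingLimiter and its allow method are hypothetical names, times are plain seconds passed in by the caller, and nothing here resembles how any real platform enforces limits.

```python
# Minimum number of seconds between two actions of the same kind.
MIN_INTERVAL_SECONDS = {
    "tweet": 10 * 60,
    "retweet": 30 * 60,
    "reply": 5 * 60,
}


class PostingLimiter:
    """Tracks, per user and per action, when the last action was allowed."""

    def __init__(self):
        self._last_allowed = {}  # (user, action) -> time in seconds

    def allow(self, user, action, now):
        """Return True and record the action if enough time has passed."""
        interval = MIN_INTERVAL_SECONDS[action]
        last = self._last_allowed.get((user, action))
        if last is not None and now - last < interval:
            return False
        self._last_allowed[(user, action)] = now
        return True


limiter = PostingLimiter()
print(limiter.allow("beth", "tweet", now=0))    # True: first tweet
print(limiter.allow("beth", "tweet", now=300))  # False: only 5 minutes later
print(limiter.allow("beth", "reply", now=300))  # True: replies have their own clock
print(limiter.allow("beth", "tweet", now=600))  # True: 10 minutes have passed
```

The three intervals are the knobs in the question above: tuning them changes how fast content can spread without changing what may be said.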

It is frustrating how all media suck. I hope humans will develop something, anything, less shitty during my lifetime.