# Typed languages, units.

511 words

I recently picked up programming again. I used to do it a lot before I went to university, but the constant mind-numbing programming assignments quickly put me off of programming. Apart from the occasional quick bug fix for software I use myself, I haven’t done any serious coding for years.

Until recently, when I needed something coded up during my research. I decided to learn Python, and I like it. It is easy to use, the libraries are extensive and user-friendly, and ipython is a useful tool. There is just one thing that draws my ire: the weak type system. Studying math has given me an appreciation for type checking that is even stricter than most languages.

An example: my length in centimeters plus the outside temperature in °C right now equals 180. This calculation makes no sense, because the units don’t match: you can’t add centimeters to degrees Celcius. But then there’s Python, which just lets you do that.

In [1]: length = 170In [2]: temperature = 10In [3]: length + temperatureOut[3]: 180

Most bugs that stem from typos are of this sort. Those bugs are possible because the type system is too weak. If you have two loops, one iterating over i and one over j, basic unit-based type checking would probably flag any instance of i in a place where you should have typed j instead. If you intend to query A[i][j] then it should be possible to let i have row-index type and j have type-index type, making A[j][i] raise a type error.

Another example: Let $A \in \mathbb{R}^{n \times n}, x \in \mathbb{R}^n$, and we’re interested in the quantity $Ax \in \mathbb{R}^n$. If you’re like me and you can’t remember what rows and what columns are, then that doesn’t have to impact your ability to symbolically do linear algebra: the quantities $xA = A^{\mathsf{T}}x$, $Ax^{\mathsf{T}}$ and $A^{-1} x$ don’t “compile”, so any mathematician that reads it will know you typo-ed if you wrote one of those latter expressions. All operations might be matrices acting on vectors, but the matrices $A^{-1}$ and $A^{\mathsf{T}}$ fundamentally take input from different copies of $\mathbb{R}^n$ than the ones that $x$ and $x^{\mathsf{T}}$ live in. That is why matrix operations make sense even if the matrices aren’t square or symmetric: there is only one way to make sense of any operation. Even if you write it wrong in a proof, most people can see what the typo is. But then there’s Python.

In [4]: import numpy as np  In [5]: x = np.array([1,2])  In [6]: A = np.array([[3,4],[5,6]])  In [7]: np.dot(A,x) Out[7]: array([11, 17])  In [8]: np.dot(A,np.transpose(x)) Out[8]: array([11, 17])  In [9]: np.dot(x,A) Out[9]: array([13, 16])

I am like me and I can’t remember what rows and what columns are. I would like the interpreter to tell me the correct way of doing my linear algebra. At least one of the above matrix-vector-products should throw a type error. Considering the history of type systems, it is not surprising that the first languages didn’t introduce unit-based types. Nonetheless, it is a complete mystery to me why modern languages don’t type this way.