Gaussian tail bounds

One-dimensional tail bounds

The standard normal distribution N(0,1) has probability density function \frac{1}{\sqrt{2\pi}}e^{-x^2/2}. This function has no elementary antiderivative, but we do at times want to (upper) bound expressions of the form \frac{1}{\sqrt{2\pi}}\int_x^\infty e^{-t^2/2} \mathrm{d}t. How can we do this?

Here is one way, valid for x > 0. Since t/x \geq 1 on the whole range of integration, we can upper bound \frac{1}{\sqrt{2\pi}}\int_x^\infty e^{-t^2/2} \mathrm{d}t \leq \frac{1}{\sqrt{2\pi}}\int_x^\infty \frac{t}{x} e^{-t^2/2} \mathrm{d}t = \frac{1}{x\sqrt{2\pi}}e^{-x^2/2}.

There is another tail bound which is a bit weaker for large x, but I like the proof better. We’ll give a tail bound by looking at the moment-generating function \lambda \mapsto \mathbb{E}[e^{\lambda X}], where X \sim N(0,1) is our normally distributed random variable. We can explicitly calculate this expectation by completing the square and find \mathbb{E}[e^{\lambda X}] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^\infty e^{\lambda x - x^2/2}\mathrm{d}x = \frac{1}{\sqrt{2\pi}}e^{\lambda^2/2}\int_{-\infty}^\infty e^{-(x-\lambda)^2/2}\mathrm{d}x. The last integral is just the entire Gaussian integral shifted a bit, so it equals \sqrt{2\pi}, and hence \mathbb{E}[e^{\lambda X}] = e^{\lambda^2/2}. Now we use Chernoff’s bound (an easy corollary of Markov’s inequality) to find, for every \lambda > 0, \mathbb{P}[X \geq t] \leq \mathbb{E}[e^{\lambda X}]e^{-\lambda t} = e^{\lambda^2/2 - \lambda t}, which we can now minimize over the choice of \lambda. Setting \lambda=t for t > 0, we conclude that \mathbb{P}[X \geq t] \leq e^{-t^2/2}.
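To see how the two one-dimensional bounds compare, here is a minimal Python sketch that evaluates them next to the exact tail, computed via the complementary error function. The function names are made up for this illustration and the sample points are arbitrary.

```python
import math

def gaussian_tail(x: float) -> float:
    """Exact P[X >= x] for X ~ N(0,1), via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def integral_bound(x: float) -> float:
    """First bound: e^{-x^2/2} / (x * sqrt(2 pi)), valid for x > 0."""
    return math.exp(-x * x / 2.0) / (x * math.sqrt(2.0 * math.pi))

def chernoff_bound(x: float) -> float:
    """Second bound: e^{-x^2/2}, from the moment-generating function."""
    return math.exp(-x * x / 2.0)

# Arbitrary sample points; both bounds should sit above the exact tail.
for x in (0.5, 1.0, 2.0, 4.0):
    print(f"x={x}: exact={gaussian_tail(x):.3e}, "
          f"integral={integral_bound(x):.3e}, chernoff={chernoff_bound(x):.3e}")
```

The two bounds differ exactly by the factor 1/(x\sqrt{2\pi}), so the first one is the smaller of the two as soon as x > 1/\sqrt{2\pi}.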

Multi-variate Gaussians

Let X \in \mathbb{R}^d be N(0,I_d) normally distributed, i.e., X is a vector with iid Gaussian N(0,1) entries. What tail bounds do we get on \|X\|? We start off with Markov’s inequality again. \mathbb{P}[\|X\| > t] = \mathbb{P}[e^{\lambda\|X\|^2} > e^{\lambda t^2}] \leq \frac{\mathbb{E}[e^{\lambda\|X\|^2}]}{e^{\lambda t^2}}.

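Deriving the moment generating function \lambda \mapsto \mathbb{E}[e^{\lambda\|X\|^2}] of \|X\|^2 is an elementary calculation. For a single coordinate and \lambda < 1/2, \int_{-\infty}^\infty e^{\lambda x^2} \cdot e^{-x^2/2} \mathrm{d}x = \int_{-\infty}^\infty e^{-\frac{x^2}{2(1/\sqrt{1-2\lambda})^2}}\mathrm{d}x = \frac{\sqrt{2\pi}}{\sqrt{1-2\lambda}}, so after dividing by the normalization \sqrt{2\pi} we get \mathbb{E}[e^{\lambda X_1^2}] = (1-2\lambda)^{-1/2}.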

The coordinates of X are iid, so \mathbb{E}[e^{\lambda\|X\|^2}] = \mathbb{E}[e^{\lambda X_1^2}]^d = (1-2\lambda)^{-d/2}. Now apply the bound with threshold t\sqrt{d} in place of t. The minimizer is at \lambda=(1-1/t^2)/2, and we find, requiring t \geq 1 so that \lambda \geq 0 and using \log t \leq t-1 for the last inequality, \mathbb{P}[\|X\| > t\sqrt{d}] \leq e^{-d(t^2-2\log t - 1)/2} \leq e^{-d(t-1)^2/2}.
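As a sanity check on the norm bound, the following Python sketch compares the optimized Chernoff bound e^{-d(t^2-2\log t-1)/2} on \mathbb{P}[\|X\| > t\sqrt{d}] and its relaxation e^{-d(t-1)^2/2} with an empirical frequency. The function names, the seed and the values of d and t are arbitrary choices made for this illustration.

```python
import math
import random

def norm_tail_bound(d: int, t: float) -> float:
    """Chernoff bound on P[||X|| > t*sqrt(d)] for X ~ N(0, I_d), with t >= 1."""
    return math.exp(-d * (t * t - 2.0 * math.log(t) - 1.0) / 2.0)

def relaxed_bound(d: int, t: float) -> float:
    """Simpler bound e^{-d (t-1)^2 / 2}, which follows from log t <= t - 1."""
    return math.exp(-d * (t - 1.0) ** 2 / 2.0)

def empirical_tail(d: int, t: float, trials: int, rng: random.Random) -> float:
    """Fraction of sampled Gaussian vectors with ||X|| > t*sqrt(d)."""
    threshold_sq = t * t * d
    hits = 0
    for _ in range(trials):
        norm_sq = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(d))
        if norm_sq > threshold_sq:
            hits += 1
    return hits / trials

rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility
d, t = 5, 1.5           # arbitrary example values
print(empirical_tail(d, t, 20000, rng), norm_tail_bound(d, t), relaxed_bound(d, t))
```

The empirical frequency should come out well below both bounds, since a Chernoff bound is not expected to be tight at moderate t.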

Operator norm of Gaussian matrices

The operator norm or spectral norm of a d \times d matrix M is defined as \|M\| := \max_{x \in \mathbb{R}^d, x \neq 0} \frac{\|Mx\|}{\|x\|}.

Now if M were a matrix with every entry independently N(0,1), what would the largest singular value of this random Gaussian matrix be? I’ll give an easy tail bound based on a net argument.

An \eta-net, \eta > 0, on the sphere is a subset N \subset \mathbb{S}^{d-1} such that for every point x \in \mathbb{S}^{d-1} there is a net element n \in N such that \|x-n\| \leq \eta, but every two net elements are at distance at least \eta from each other. A greedy algorithm can construct an \eta-net, and any \eta-net with \eta \leq 2 has size at most (4/\eta)^d (see e.g., Jiří Matoušek, Lectures on Discrete Geometry (Springer, 2002), page 314). The proof is based on a simple packing argument where balls of radius \eta/2 around each net element have to fit disjointly inside the ball of radius 1+\eta/2 \leq 2 centered at the origin.

Now let N\subset \mathbb{S}^{d-1} be a 1/2-net. By the above, the size of the net is bounded by |N| \leq 8^d.

The function x \mapsto \|Mx\| is \|M\|-Lipschitz. Hence we can bound \|M\| = \max_{x\in\mathbb{S}^{d-1}} \|Mx\| \leq \max_{x\in\mathbb{S}^{d-1}} \min_{\omega \in N} \left(\|M\omega\| + \|M\|\cdot\|x-\omega\|\right) \leq \max_{\omega \in N} \|M\omega\| + \|M\|/2, where the last step picks the net element closest to x. Rearranging, we have now proved that \|M\| \leq 2\max_{\omega\in N} \|M\omega\|.

Now, as M\omega is N(0,I_d) normally distributed for any \omega\in\mathbb{S}^{d-1}, we can use the union bound over all points of N and conclude that, for all t \geq 1, \mathbb{P}[\|M\| \geq 2t\sqrt{d}] \leq 8^d e^{-d(t-1)^2/2}.
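To get a feeling for when the union bound 8^d e^{-d(t-1)^2/2} says anything at all, the Python sketch below solves for the value of t at which it drops below 1. The function names are hypothetical, and the bound is clamped at 1 because a probability bound above 1 carries no information.

```python
import math

def operator_norm_bound(d: int, t: float) -> float:
    """Union bound 8^d * e^{-d (t-1)^2 / 2} on P[||M|| >= 2 t sqrt(d)], clamped at 1."""
    # Work in log space so that 8^d does not overflow for large d.
    log_bound = d * (math.log(8.0) - (t - 1.0) ** 2 / 2.0)
    return math.exp(min(log_bound, 0.0))

def critical_t() -> float:
    """Smallest t >= 1 at which the bound equals 1; it does not depend on d."""
    return 1.0 + math.sqrt(2.0 * math.log(8.0))

t_star = critical_t()
print(f"bound is nontrivial for t > {t_star:.4f}, "
      f"i.e. for thresholds above {2 * t_star:.4f} * sqrt(d)")
print(operator_norm_bound(10, 4.0))
```

Beyond the critical value the bound decays exponentially in d, so the net argument shows \|M\| = O(\sqrt{d}) with high probability, with a constant that this simple argument makes no attempt to optimize.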

Maximum of n Gaussians

For n independent identically distributed variables X_1,\ldots,X_n \sim N(0,1), the tail probability \mathbb{P}[\max_{i \leq n} X_i \geq t] of the maximum is, up to a constant factor, matched by the union bound \mathbb{P}[\max_{i \leq n} X_i \geq t] \leq ne^{-t^2/2}, valid for t \geq 0.

From this, we can find that the expected maximum is \mathbb{E}[\max_{i \leq n} X_i] = O(\sqrt{\ln n}). We derive this by bounding the integral \mathbb{E}[\max_{i \leq n} X_i] \leq \int_0^\infty \mathbb{P}[\max_{i \leq n} X_i \geq t] {\rm d}t. Split the integral at t = \sqrt{2\ln n}, bound the integrand in the first part by 1 (it is a probability) and in the second part by ne^{-t^2/2}.
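Carrying out the split at s = \sqrt{2\ln n} gives a first part of at most s and, using \int_s^\infty e^{-t^2/2}\mathrm{d}t \leq e^{-s^2/2}/s, a second part of at most 1/s. The Python sketch below compares the resulting bound s + 1/s with a simulated expected maximum. The function names, the seed and the sample sizes are arbitrary choices for this illustration.

```python
import math
import random

def expected_max_bound(n: int) -> float:
    """Bound s + 1/s on E[max of n standard Gaussians], s = sqrt(2 ln n), n >= 2."""
    s = math.sqrt(2.0 * math.log(n))
    return s + 1.0 / s

def simulated_expected_max(n: int, trials: int, rng: random.Random) -> float:
    """Monte Carlo estimate of the expected maximum of n iid N(0,1) samples."""
    total = 0.0
    for _ in range(trials):
        total += max(rng.gauss(0.0, 1.0) for _ in range(n))
    return total / trials

rng = random.Random(0)  # fixed seed, chosen arbitrarily
n = 100                 # arbitrary example value
print(simulated_expected_max(n, 500, rng), expected_max_bound(n))
```

The simulated value should land below the bound, with both growing like \sqrt{\ln n}.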

Average width of the simplex

Let x_1,\dots,x_{d+1} \in \mathbb{R}^d be the vertices of a regular simplex such that \|x_i\| = 1 for all i \in [d+1]. If \omega \in \mathbb{S}^{d-1} is chosen uniformly at random, the expectation of the difference \max_{i,j\in[d+1]} |\omega^{\mathsf{T}}(x_i-x_j)| is called the average width of the simplex. We can bound this up to a constant factor using our knowledge of Gaussians. Let H_t := \{y\in\mathbb{R}^d : \omega^{\mathsf{T}}y = t\}. The (d-2)-dimensional volume of H_t\cap \mathbb{S}^{d-1} is (1-t^2)^{(d-2)/2} times the volume of \mathbb{S}^{d-2} by Pythagoras’ theorem. Recalling that (1+1/\lambda)^\lambda \approx e, you can prove that the distribution of \omega^{\mathsf{T}}x_i is approximately Gaussian with mean 0 and standard deviation 1/\sqrt{d-1}. The bound on the maximum of Gaussians now says that the average width of the simplex is O(\frac{\sqrt{\ln d}}{\sqrt d}).

The life expectancy of trans women is not 35 years

Every once in a while I read an article or comment like this:

The average lifespan of trans women in the US is 35 years. Source

This number is wrong in so many ways, but many people seem to fall for it. The most viral variant consists of screencaps of this tumblr post.

User fuckallies on tumblr: 'On average, you have a 1 in 18,989 chance of being murdered A trans person has a 1 in 12 chance of being murdered The average life span of a cis person is about 75-90 The average life expectancy of a trans person is 23-30 years old 75% of people killed in anti LGBT hate crimes are poc Think about this the next time you go crying over “cisphobia” and “reverse racism”'

Every murder is a tragedy, and it is particularly sad when someone gets murdered just for being trans. That is all the more reason to get our numbers right: exaggeration doesn’t help the cause. The lifespan statistics above are absurd. There is no way that they can be true, and it is telling of our maths and science education that people believe them.

Let’s do a basic sanity check, using the most generous figure of 35 years. If 50% of trans folks reached the age of 60, the other half would have to be dead by age 10. But measuring “age of death of trans people” is really tricky, because a person will only enter the trans population after transition: if they die before that, nobody knows they’re trans. So realistically, all trans people that die are at least 16 years old. If the trans people who get murdered do so at the earliest possible age (16), how many trans people can maximally reach the age of 75? Well, we solve for the fraction x: 75x + (1-x)*16=35, which gives x = 19/59, or about 32%.

Do two in three trans people get killed over their lives? Of course not. The only place the 23-30 year figure could realistically come from is if the number described life expectancy conditional on getting killed or some other low-probability conditional.
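The arithmetic of the sanity check is easy to replay. This small Python sketch solves 75x + 16(1-x) = 35 for the fraction x that reaches 75, assuming, as in the text, that everyone else dies at 16. The function name is made up for this illustration.

```python
def fraction_reaching_old_age(old_age: float, young_age: float, mean_age: float) -> float:
    """Solve old_age * x + young_age * (1 - x) = mean_age for the fraction x."""
    return (mean_age - young_age) / (old_age - young_age)

x = fraction_reaching_old_age(old_age=75, young_age=16, mean_age=35)
print(f"fraction reaching 75: {x:.3f}, fraction dying at 16: {1 - x:.3f}")
```

Under these assumptions roughly a third of trans people would reach 75, and the remaining two thirds would all have to die at 16 for the average to come out at 35.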

Media reporting

Most journalists aren’t good with numbers. This affects their propensity to cite the low life expectancy for trans folks. The Guardian:

a 2014 report concluded that the average life expectancy of trans women in the Americas is between 30 and 35.

NPR:

“Transgender people have an average life expectancy of about 30 to 32 years,” Balestra says.

Huffington Post:

The statistic said the average life expectancy for a trans woman of color is 35.

The difference is striking: “transgender people”, “trans women in the US”, “trans women in the Americas”, “trans women of colour”. They’re all misreporting the same number in different messed up ways.

The first popular English medium to cover the number, and the source that most chains of web pages end up on, is Washington Blade, which reports one realistic number and also the unrealistic one.

The commission indicates 80 percent of trans murder victims in the Americas during the 15-month period were 35 years old or younger. Its report further concludes the average life expectancy of trans people in the Western Hemisphere is between 30-35 years.

Washington Blade cites a report by the Inter-American Commission on Human Rights of the Organisation of American States (press release, report). The report mentions this:

According to the data collected in the Registry of Violence, eighty percent of trans persons killed were 35 years of age or younger.

The data (xlsx, Spanish) contains 770 reports of violence against LGBT people from a lot of countries in the Americas. All incidents happened in the 15 months between January 1st 2013 and March 31st 2014. Not all countries are in the dataset and no skin colours of the victims are listed, but names are. Some victims are deadnamed, others appropriately named. Searching the names online, there are indeed a lot of trans women of colour among the listed US victims.

The data contain 282 murders of trans people, of which 212 have an age of death attached. The average age of death among those 212 is 28.7. Among 14 murder cases in the US, all trans women, the average age was 29.8. Filtering the America-wide data for listed ages of murder of 35 and under, we find 168 cases, and 168 out of 212 is roughly 80%. This is the source of the 80% vs 35 years claim, but keep in mind that this is as a fraction of cases that have ages attached to them.

Where does the life expectancy of 30-35 years come from? The IACHR report just says this:

In terms of the age of the victims, the IACHR notes that while it seems gay men of all ages are targeted, in the case of trans women, it is mostly younger trans women who are victims of violence. In this regard, the IACHR has received information that the life expectancy of trans women in the Americas is between 30 and 35 years of age.

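No source is listed. I’m calling bullshit. There is no way that getting murdered reduces life expectancy by only about 5 years, which is what a life expectancy of 30-35 would imply when the murder victims in the data die at an average age of about 29.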

Section of doubt

The 1 in 12 murder rate up top is interesting because it cannot be refuted in a conservative street-fighting calculation. The US has 325 million citizens and a life expectancy of 79 years, so every year roughly 4.1 million US citizens die. If 1 in 40000 people is trans, then you’d expect about 100 US trans people to die each year. 2018’s TDOR has 23 murder cases from the US listed, so we’d estimate that trans people have a roughly 1 in 4 probability of death by murder, not accounting for skin color.
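Spelled out in Python, the street-fighting estimate looks as follows. The inputs are exactly the figures quoted above, and the function name is made up for this illustration.

```python
def murder_share_estimate(population: float, life_expectancy: float,
                          trans_share: float, trans_murders_per_year: float) -> dict:
    """Back-of-the-envelope share of trans deaths that are murders."""
    # In a steady state, about population / life_expectancy people die each year.
    deaths_per_year = population / life_expectancy
    trans_deaths_per_year = deaths_per_year * trans_share
    return {
        "deaths_per_year": deaths_per_year,
        "trans_deaths_per_year": trans_deaths_per_year,
        "murder_share": trans_murders_per_year / trans_deaths_per_year,
    }

estimate = murder_share_estimate(
    population=325_000_000,     # US population figure from the text
    life_expectancy=79,         # years, from the text
    trans_share=1 / 40_000,     # the often cited, untrustworthy 1:40000 figure
    trans_murders_per_year=23,  # US cases listed by TDOR for 2018
)
for name, value in estimate.items():
    print(f"{name}: {value:.4g}")
```

The result is extremely sensitive to the assumed trans share: a tenfold larger share makes the estimated murder probability tenfold smaller.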

I don’t trust the 1:40000 statistic within an order of magnitude, but it is an often cited one. So while I don’t take the 1:4 number to be remotely close to the truth, it is understandable if people believe the 1 in 12 figure.

Archiving the Trans Girl Diaries

Between standing on the shoulders of giants and picking through my own old files, I compiled the most complete archive of the Trans Girl Diaries gag comics so far. Check it out, this stuff is amazing.

Where do these things come from?

Turns out I used wget’s mirror function on the website once. The most bulletproof setting for this command is

wget -mkE http://example.com

This stuff is so great. It makes a copy of an entire website, including all pages, images, CSS and JavaScript. Use it to grab a blog for reading on the plane, to make a static WordPress site if you are worried about security exploits but dislike updating, or to save your favourite webcomic for posterity.

Trigger warnings

Suicide, gender dysphoria, violence, external transphobia, internalized transphobia, Bailey-Blanchard-Lawrence two-type transwomen classification, transphobia, really intense descriptions of gender dysphoria, TERFism, sexism, homophobia, womyn-born-womyn-ism, Harry Benjamin syndrome and an altogether too realistic view of transgenderism.

If you like r/tgcj you’ll probably like the Trans Girl Diaries.

Review

I love this stuff. The comics meant a lot to me when I was younger. They are relatable and funny and give insight into all the disturbing thoughts that are part of the Trans Woman Experience. Whether you are trans or not, it is worth checking out.