Becoming an Estonian e-Resident


Almost six years ago, I became one of the first people in the world to become an Estonian e-Resident 🙂 I had to physically go there to get fingerprinted and prove that I was biologically alive, and after that I got an e-identity tied to it which allowed me to transact within Estonia’s (and partially the EU’s) trust system remotely.

The very next day I had to rush myself to the ER with an extremely painful case of kidney stones. I got to use my freshly acquired e-Resident card to navigate the medical system 😂. No one spoke much English, and at that time I knew no Estonian or Russian. When I was eventually discharged and went to the pharmacy, I just had to show them my card and they already knew my prescription.

So the system worked quite flawlessly. But there was human error in the prescription and I had to rush myself back into the ER after 3 days of writhing in intense pain. That was a great way to officially start my e-Residency 😜

Understanding Variances based on Sample Sizes


Every now and then you read something that really furthers your understanding of the world around us. I read this fascinating piece in the book by Howard Wainer: Picturing the Uncertain World. The specific chapter I read was called “The Most Dangerous Equation”, where he discusses De Moivre’s equation. It’s quite a lot to chew on. I tried explaining it to my team using just words and that didn’t cut it, so I put together a quick graphic visualizing some of the basics of it. This may not be academically super accurate, but it gets the gist across, so bear with me and I welcome you to follow along 🙂

Below are 32 hypothetical students’ heights, each represented by one vertical bar. They are grouped by color into individual classrooms A, B, C, D … H making it 8 classrooms in all.

In the first row at the top, the solid green horizontal line shows the average of the heights of all the individual students across all 32 individual measurements. The rightmost section shows the average height and also shows the maximum height and the minimum height for this sample of all students.

In the second part, we first calculate the average height of each classroom separately, e.g. instead of looking at each yellow bar separately, we are now only looking at the single green line across those yellow bars that represents the average height of that classroom. And we do that for each cluster of colors. So now we only have 8 measurements that reflect the average height of each classroom. Taking an average of those 8 averages (weighted by classroom size, if the classrooms differ in size) results in the exact same average height. However, the variance in this sample is much lower, i.e. the tallest kid in a class is likely to be balanced out by shorter kids in the same class, so the classroom averages show less variation than the heights of the individual kids.

Also, the average height of a large classroom tends to be closer to the overall mean than the average height of a smaller classroom. Smaller classrooms produce more outliers because it’s easy for a single tall student to throw off the average of a small classroom. But in a large classroom, a single tall student has less impact on the average height.
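For reference, here is De Moivre’s equation as it is usually written, followed by a quick worked example. The numbers (a spread of 10 cm between individual students, classrooms of 4 and 25) are my own assumptions for illustration and are not taken from the graphic or the book.

```latex
% De Moivre's equation: the spread of a sample mean shrinks
% with the square root of the sample size n.
\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}

% Assumed example: individual heights vary with \sigma = 10 cm.
% Classroom of n = 4 students:
\sigma_{\bar{x}} = \frac{10}{\sqrt{4}} = 5 \text{ cm}
% Classroom of n = 25 students:
\sigma_{\bar{x}} = \frac{10}{\sqrt{25}} = 2 \text{ cm}
```

So under these assumptions, the average of a 4-student classroom wanders about two and a half times as far from the overall mean as the average of a 25-student classroom.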

The third section shows that distribution. Classrooms with the tallest average height tend to be smaller classrooms. Similarly, classrooms with the shortest average height also tend to be the smaller classrooms.
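If you’d rather see this with numbers than with bars, here is a rough simulation sketch. Everything in it is an assumption I made up for illustration: the function name `simulate_classrooms`, the normally distributed heights (mean 150 cm, spread 10 cm), and the mix of 500 small classrooms of 5 students and 500 large classrooms of 50 students. Every student is drawn from the same distribution, so no classroom is “really” taller than another.

```python
import random
import statistics

SMALL_SIZE = 5    # assumed size of a small classroom
LARGE_SIZE = 50   # assumed size of a large classroom


def simulate_classrooms(n_each=500, mu=150.0, sigma=10.0, seed=1):
    """Return a list of (average_height, classroom_size) tuples.

    All students come from the same normal distribution, so any
    difference between classrooms is purely sampling noise.
    """
    rng = random.Random(seed)
    rooms = []
    for size in [SMALL_SIZE] * n_each + [LARGE_SIZE] * n_each:
        heights = [rng.gauss(mu, sigma) for _ in range(size)]
        rooms.append((statistics.fmean(heights), size))
    return rooms


rooms = simulate_classrooms()

# Rank classrooms by average height (tuples sort by their first element).
ranked = sorted(rooms)
k = 50  # look at the 50 shortest and the 50 tallest classrooms
shortest, tallest = ranked[:k], ranked[-k:]

small_in_tallest = sum(1 for _, size in tallest if size == SMALL_SIZE)
small_in_shortest = sum(1 for _, size in shortest if size == SMALL_SIZE)

# Spread of the classroom averages, per classroom size.
spread_small = statistics.stdev(m for m, s in rooms if s == SMALL_SIZE)
spread_large = statistics.stdev(m for m, s in rooms if s == LARGE_SIZE)

print(f"small classrooms among the {k} tallest:  {small_in_tallest}")
print(f"small classrooms among the {k} shortest: {small_in_shortest}")
print(f"spread of small-classroom averages: {spread_small:.2f}")
print(f"spread of large-classroom averages: {spread_large:.2f}")
```

Under these assumptions, small classrooms should crowd both ends of the ranking, and the spread of their averages should be roughly sqrt(50/5), or about 3 times, that of the large classrooms.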

It would be erroneous to just look at the top of the distribution and conclude that smaller classrooms have taller students compared to large classrooms. However, now replace height with grades. And that’s exactly the premise of the “small schools” movement. Without understanding the underlying real-world distribution of data and how sample sizes affect variance, small-school lobbying centers around the belief that small schools have better grades. It is true that small schools show up among the top performers, but that is due to statistics and how the data is distributed and measured, not because small schools actually do something different. By the same distribution, the worst-performing schools also tend to be small schools.

Understanding this relationship between sample sizes and the variances observed in them is very important when making sense of data. Yet, the chapter states, there are many examples of large policy decisions that were made based on an incorrect understanding of the datasets or by looking at just one side of the distribution.

[Update: This is also covered in the famous book about understanding our biases, Thinking, Fast and Slow.]

Outdoor Movies in Seattle 2014 – ical and csv format


Thrillist put together a great collection that lists all of the outdoor movies screening this summer in Seattle. However, they didn’t offer a calendar format of that data, which makes it kinda hard to plan these movies around other things that I also have going on. To make it easy to compare this with other things on my calendar, I manually scrubbed the list and put it together in a spreadsheet.

And then I made it available as XML, iCal and HTML versions if anyone wants to subscribe to it or add it to their own calendars. Enjoy!

Seattle Outdoor Movies 2014

Ridge soaring with a Paraglider on Gas Works Park



There’s not a lot of upward wind on the tiny mound that is Gas Works Park. But it’s windy enough that this guy might be on to something. He is able to inflate the glider and get stable, but every time he tries to lift off he eventually drops back down.

Reminds me of the time I used to paraglide back in India. It’s a very meditative experience as you sit and patiently wait for the wind to pick up, or sometimes just go home if conditions aren’t right. But you typically do this on a high enough ridge. I trained on a 300ft hill and “graduated” to 1000ft ones. That’s hardly anything for a pro, but I am still at a beginner level.

We are all rooting for him here. May the force be with you.