Sebastiano Schillaci, 13th April 2020 – see post on Facebook for discussion
Possible link between COVID-19 susceptibility and genetic factors
Haplogroup R1, also known as R-M173, is a Y-chromosome DNA haplogroup. In modern populations it appears to comprise the subclades R1a and R1b (most people with red hair belong to the latter). Wherever only haplogroup R1 is specified, it should be understood that a mixture of both subclades is present.
Comparing the following distribution maps [1], there seems to be a correlation between the density of COVID-19 cases (shown on the left) and the density of haplogroup R1b (shown on the right). The correlation seems very clear in Italy (see picture below). It also seems evident in Iran, Austria, Switzerland, Belgium and Spain, and perhaps also in Portugal, the North and Southeast Regions of Brazil, and the United States (see the following pictures). However, at least in the case of Belgium, population density may play a bigger role. In Australia, by contrast, the theory does not seem to hold, but this may well be because the population is concentrated in the eastern part of the island.
If this correlation were significant, it could help explain why:
Iran, Spain and Northern Italy were among the first to be seriously affected by the virus
Southern Italy, Greece [2], Portugal [3] and Africa are less affected by the virus than expected
the first African known to have contracted the virus was from Cameroon
China was much less affected than European countries according to official data, even though the pandemic started there
there were no cases in the large Chinese community in Prato (Italy), even after many of its members returned from Chinese New Year celebrations [4]
and maybe it could even help explain why:
Seattle, where the first case in the United States was discovered, is much less affected than New York
men are more affected than women (R1b is a Y-chromosome haplogroup, and only men carry a Y chromosome)
The problem with the last point is that if the difference in mortality between men and women were due to a Y-chromosome gene, it should be more pronounced where R1b is more prevalent, which does not seem to be the case.
To make this theory more concrete, I tried to quantify this visual correlation using publicly available data. I found a positive correlation, with a high confidence level (p-value 2.1774 × 10⁻¹²), between the initial growth rate of COVID-19 deaths and the maximum haplogroup R1b percentage in different countries (data set). I have written a short paper to better convey these results.
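For readers who want to try this kind of measurement themselves, here is a minimal sketch in Python. It is not the method used in the paper, which is not described on this page. It rests on several assumptions: the initial growth rate is taken to be the least-squares slope of the logarithm of cumulative deaths against the day index, the correlation is Pearson's r, and the p-value comes from a simple permutation test. All function names (`initial_growth_rate`, `pearson_r`, `permutation_p_value`) are hypothetical. A permutation test with a few thousand shuffles cannot resolve p-values as small as the one quoted above, so it only illustrates the idea.

```python
import math
import random


def initial_growth_rate(cumulative_deaths):
    """Assumed definition: slope of log(cumulative deaths) versus day index.

    Days with zero deaths are skipped because log(0) is undefined.
    The result is an exponential growth rate per day.
    """
    points = [(day, math.log(d)) for day, d in enumerate(cumulative_deaths) if d > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(xs, ys, trials=10000, seed=0):
    """One-sided permutation p-value for a positive correlation.

    Shuffles ys repeatedly and counts how often the shuffled correlation
    is at least as large as the observed one. The +1 terms keep the
    estimate strictly above zero.
    """
    rng = random.Random(seed)
    observed = pearson_r(xs, ys)
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if pearson_r(xs, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (trials + 1)
```

To use it, compute one growth rate per country with `initial_growth_rate`, pair each with that country's maximum R1b percentage, and pass the two lists to `pearson_r` and `permutation_p_value`. The inputs would have to come from the data set mentioned above; no values are supplied here.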
For my own reference, I have also kept track of related works.
This theory, if correct, could bring potentially huge benefits. For example, it could speed up the discovery of a treatment, help build more reliable quantitative forecasting models or, at the very least, help fine-tune social distancing measures.