According to recent research from Sweden’s Lund University, the most commonly used analytical method in population genetics is deeply flawed. The flaw may have led to incorrect results and misconceptions about ethnicity and genetic relationships. The method has been used in hundreds of thousands of studies, influencing findings in medical genetics and even commercial ancestry tests. The findings were published in the journal Scientific Reports.
Scientific data can be gathered at a rapidly increasing pace, producing huge and very complex databases, a development that has been nicknamed the “Big Data revolution.” To make the data more manageable, researchers use statistical techniques that condense and simplify it while retaining most of the important information. PCA (principal component analysis) is perhaps the most widely used of these approaches. Imagine PCA as an oven, with flour, sugar, and eggs serving as the input data. The oven may always do the same thing, but the end result, a cake, depends heavily on the ratios of the ingredients and how they are mixed.
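The oven analogy can be made concrete with a toy calculation. The sketch below is not the analysis from the Lund study; it is a minimal illustration, assuming three hypothetical groups placed at made-up positions in two dimensions, of how the first principal component shifts when only the proportions of the groups in the sample change. The function names `first_principal_axis` and `make_sample` are invented for this example.

```python
import math

def first_principal_axis(points):
    """Return a unit vector along the first principal component of 2-D points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Entries of the 2x2 sample covariance matrix.
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Closed-form angle of the leading eigenvector of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta))

def make_sample(n_a, n_b, n_c):
    """Toy data: hypothetical groups A, B, C at fixed, made-up positions."""
    return [(0.0, 0.0)] * n_a + [(10.0, 0.0)] * n_b + [(0.0, 10.0)] * n_c

# Same three groups, same positions; only the mixing ratios differ.
mostly_ab = first_principal_axis(make_sample(50, 50, 2))
mostly_ac = first_principal_axis(make_sample(50, 2, 50))

print("PC1 when group C is rare:", mostly_ab)
print("PC1 when group B is rare:", mostly_ac)
```

When group C is rare, the first component runs almost along the horizontal axis, separating A from B. When group B is rare, it runs almost along the vertical axis, separating A from C. The procedure is identical in both runs; only the ratio of the ingredients differs.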
“It is expected that this method will give correct results because it is so frequently used. But it is neither a guarantee of reliability nor produces statistically robust conclusions,” says Dr. Eran Elhaik, Associate Professor in molecular cell biology at Lund University.