A post to help all you insomniacs out there. All right, this is dry and uninteresting but it’s also something worth reflecting on.
An earlier post, from long ago, looked at the way research methods are wrongly adopted to produce a well-formed outcome, and at the various issues associated with that.
Today’s is about statistics and how they are fraught. Here are some excerpts:
A practical understanding of probability and statistics at an advanced, at least college, level is increasingly important in the modern world. For example, many expensive and potentially hazardous drugs including chemotherapy for cancer and anti-cholesterol drugs such as Lipitor are approved for use and justified to patients based on complex statistical studies.
Children are being increasingly medicated for a range of alleged psychiatric disorders such as Attention Deficit Hyperactivity Disorder (ADHD or ADD), bipolar disorder, and others. Many questions have arisen about the seeming epidemic of autism (see the recent article The Mathematics of Autism).
Important public policy issues such as “global warming” hinge on complex mathematical models and statistics. The public is often swayed by shocking statistics widely repeated such as “the Soviet Union is producing two to three times as many engineers and scientists as the United States” (1950s), “one million missing children” (1980s), and “drugs cost $800 million dollars” to research and develop.
Complex mathematical and statistical models for mortgage backed securities played a major role in the financial crash in 2008 and the housing bubble. The financial system continues to rely on these so-called derivative securities despite numerous costly failures.
Averages can be highly misleading … The median is an example of a robust statistic that is less susceptible to misleading outliers in the data. It is often better to look at the median instead of the average, especially with noisy real-world data.
Any single statistic such as the average, median, or mode (most common value in the data) can be misleading depending on the underlying distribution of the sequence and the context in which the statistic is used.
No matter how convincing a statistic may seem, it is best to examine the distribution of the underlying data.
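A small sketch of the point about averages, using made-up numbers (the `incomes` list is purely hypothetical, not data from the article). One outlier is enough to drag the average far away from anything typical, while the median barely notices:

```python
import statistics

# Hypothetical incomes in thousands; the last value is a deliberate outlier.
incomes = [30, 32, 35, 35, 38, 40, 42, 45, 1000]

mean = statistics.mean(incomes)      # pulled upward by the single outlier
median = statistics.median(incomes)  # middle value of the sorted data
mode = statistics.mode(incomes)      # most common value

print(f"mean   = {mean:.1f}")
print(f"median = {median}")
print(f"mode   = {mode}")

# Looking at the data itself shows why the three numbers disagree.
print("sorted data:", sorted(incomes))
```

Here the average is larger than every value except the outlier, which is the sort of thing only a look at the underlying distribution reveals.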
The Gaussian, also known as the Normal Distribution or Bell Curve, is very heavily used, often improperly, in statistics.
The Gaussian is taught in almost all introductory probability and statistics courses, at least at the college level. There is a theorem, known as the Central Limit Theorem, stating that the suitably normalized average of a sequence of independent identically distributed (IID) variables with finite variance converges to the Gaussian distribution as the number of variables in the sequence (N) tends to infinity.
The Gaussian/Normal/Bell Curve is very heavily used in mathematical models today. However, despite the Central Limit Theorem, many real-world distributions are not Gaussian and have long tails. The data often contains outliers.
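To see where the Central Limit Theorem helps and where it does not, here is a rough simulation. The choice of distributions is my own assumption for illustration: a uniform distribution (finite variance) and a standard Cauchy distribution (a long-tailed case with no finite variance). The helper names are hypothetical. The spread of the averages is measured by the interquartile range:

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def uniform_draw():
    # Uniform on [0, 1): finite variance, so the theorem applies.
    return random.random()

def cauchy_draw():
    # Standard Cauchy via the inverse CDF: tails so long the variance is infinite.
    return math.tan(math.pi * (random.random() - 0.5))

def iqr_of_means(draw, n, trials=500):
    # Interquartile range of `trials` averages, each taken over n draws.
    means = [statistics.fmean([draw() for _ in range(n)]) for _ in range(trials)]
    q1, _, q3 = statistics.quantiles(means, n=4)
    return q3 - q1

results = {}
for n in (10, 400):
    results[n] = (iqr_of_means(uniform_draw, n), iqr_of_means(cauchy_draw, n))
    print(f"N={n:4d}  uniform IQR={results[n][0]:.4f}  Cauchy IQR={results[n][1]:.4f}")
```

The averages of the uniform draws tighten up as N grows, as the theorem promises. The averages of the Cauchy draws do not settle down at all, however many values go into each one.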
Several mathematical models used in quantitative finance such as the famous Black-Scholes Option Pricing Model use the Gaussian distribution. They often assume the returns for a financial asset are distributed according to a Gaussian distribution.
Historical data shows that the returns for many financial assets do not have a Gaussian/Normal/Bell Curve distribution and often contain extreme “fat tail” outliers such as market crashes. Mathematical models using a Gaussian distribution tend to underestimate the risks of financial assets.
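A sketch of how much the choice of distribution matters in the tails. This compares the chance of a move of more than k standard deviations under a Gaussian with the same chance under a Student t distribution with 3 degrees of freedom, rescaled to unit variance. The t distribution is only my stand-in for "fat tails", not a model fitted to any market data, and the function names are hypothetical:

```python
import math

def gaussian_tail(k):
    # P(X > k) for a standard Gaussian.
    return 0.5 * math.erfc(k / math.sqrt(2))

def fat_tail(k):
    # P(X > k) for a Student t with 3 degrees of freedom, rescaled to unit
    # variance. Uses the closed-form CDF available for 3 degrees of freedom.
    return 0.5 - (math.atan(k) + k / (1 + k * k)) / math.pi

for k in (2, 3, 5):
    g = gaussian_tail(k)
    f = fat_tail(k)
    print(f"{k}-sigma move: Gaussian {g:.2e}, fat-tailed {f:.2e}, ratio {f / g:.0f}x")
```

Far out in the tail the two models differ by orders of magnitude, so a model that assumes a Gaussian will treat as near-impossible the very events that do the damage.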
Statistical significance can be a treacherous concept. Statistical significance is often reported as something known as a p value. The p value usually refers to the probability that data at least as extreme as the set of measurements observed could have arisen by pure chance, assuming there is no real effect. The lower the p value, the greater the statistical significance of a result.
Many scientific journals accept papers that report a p value of five percent or less for their results. A p value of five percent is often interpreted as meaning there is a ninety-five percent probability that the hypothesis being tested is correct, but that interpretation is wrong.
Would you live in a house that had a five percent chance of collapsing on you? Drive over a bridge that had a five percent chance of collapsing as you cross the bridge? Probably not. Even though ninety-five percent seems high and is typically an A in classroom homework, it is not a very high level of confidence in the real world.
The p value also tells you nothing about whether the “statistically significant” effect was due to the hypothesis being tested or the cause suggested by the authors of a scientific paper or study.
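A rough simulation of what the five percent threshold buys you. Every "experiment" below is pure noise with no effect at all (all the numbers are invented for illustration), and the p value comes from a simple two-sided z-test with a known standard deviation, which is my simplifying assumption:

```python
import math
import random

random.seed(1)  # fixed seed so the run is repeatable

def two_sided_p_value(sample, sigma=1.0):
    # z-test of "the true mean is zero", assuming the noise level sigma is known.
    n = len(sample)
    z = (sum(sample) / n) / (sigma / math.sqrt(n))
    return math.erfc(abs(z) / math.sqrt(2))

experiments = 2000
false_positives = 0
for _ in range(experiments):
    # 30 measurements of pure noise: there is nothing to find.
    sample = [random.gauss(0.0, 1.0) for _ in range(30)]
    if two_sided_p_value(sample) < 0.05:
        false_positives += 1

rate = false_positives / experiments
print(f"'significant' results with no real effect: {rate:.1%}")
```

Roughly one experiment in twenty comes out "statistically significant" even though there is no effect to detect.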
Probability and statistics say little about systematic errors. The OPERA experiment’s spurious report of faster than light neutrinos was due to a systematic error in measuring time delays, very tiny time delays. The result was statistically significant but incorrect for other reasons.
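A toy version of the same trap, with invented numbers that have nothing to do with the actual OPERA measurements. The true value is zero, but every measurement carries the same uncorrected offset (`BIAS`, a hypothetical name), and the statistics happily report a hugely significant result:

```python
import math
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable

TRUE_VALUE = 0.0  # what we are trying to measure
BIAS = 0.5        # hypothetical systematic offset nobody has corrected for
NOISE_SD = 1.0    # random measurement noise
n = 10_000

measurements = [TRUE_VALUE + BIAS + random.gauss(0.0, NOISE_SD) for _ in range(n)]

mean = statistics.fmean(measurements)
std_err = statistics.stdev(measurements) / math.sqrt(n)
z = (mean - TRUE_VALUE) / std_err
p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p value, Gaussian approximation

print(f"measured mean = {mean:.3f} +/- {std_err:.3f}")
print(f"z = {z:.1f}, p = {p:.3g}")
```

More measurements only shrink the random error. The offset stays exactly where it was, and no p value will flag it.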
Correlation does not prove causation. There are many statistical methods and single statistics (numbers) that measure whether two or more measurements are correlated. Even if A and B are perfectly correlated, this can mean A causes B, B causes A, A and B share a common cause, or even that the correlation is a chance occurrence.
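A sketch of the common cause case, with invented data. Here A and B never influence each other; both are driven by a third quantity C plus independent noise. The `pearson` helper is written out by hand so the example does not depend on any particular library version:

```python
import math
import random

random.seed(3)  # fixed seed so the run is repeatable

def pearson(xs, ys):
    # Pearson correlation coefficient of two equal-length sequences.
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# C is the hidden common cause; A and B each depend on C but not on each other.
c = [random.gauss(0.0, 1.0) for _ in range(5000)]
a = [ci + random.gauss(0.0, 0.5) for ci in c]
b = [ci + random.gauss(0.0, 0.5) for ci in c]

r_ab = pearson(a, b)
# Subtract out C and the apparent relationship between A and B disappears.
r_resid = pearson([ai - ci for ai, ci in zip(a, c)],
                  [bi - ci for bi, ci in zip(b, c)])

print(f"correlation of A and B:              {r_ab:.2f}")
print(f"correlation after removing C's part: {r_resid:.2f}")
```

A and B come out strongly correlated even though neither has any effect on the other.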
Even though most scientists, mathematicians, and statisticians are taught that correlation does not prove causation, it is common to find this disregarded in practice, especially in biology and medicine. Many prominent theories in biology and medicine are based, on close examination, on a correlation, perhaps a very strong correlation, but only a correlation.
Beware of the use of language such as “the link between A and B” or “the relationship between A and B” used as if “link” or “relationship” means A causes B (or B causes A). Link and relationship are very general terms. If A and B are correlated, one can honestly say there is a “link” or “relationship” between A and B, even though causation is not actually proven by a correlation.
By far the greatest and most common problem with using probability and statistics in the real world lies in the definition of terms, categories, and measured values. When counting the number of engineers produced by the United States, the Soviet Union in the 1950s, China, or other nations, what is an engineer? What is a missing child in “one million missing children?” What does it mean to say someone has been cured of cancer or has survived cancer? What is autism?
In the 1950s and 1960s, Soviet expert Nicholas DeWitt used a broad definition of scientists and engineers to argue that the Soviet Union produced two to three times as many scientists/engineers as the United States. Among other things, his total included engineers receiving correspondence degrees, medical workers including nurses, and agricultural workers (see MIT Historian David Kaiser’s article The Physics of Spin: Sputnik Politics and American Physicists in the 1950s).
In the medical literature, being “cured” of cancer or “surviving” cancer often means living for at least five years after being diagnosed with the disease. This differs dramatically from common English usage of the words “cured” and “survive.” Since cancer is often a slow progressing disease (many people with untreated cancer will live at least five years), this practice is particularly misleading.
The statistics on the prevalence of autism from the United States Centers for Disease Control and Prevention (CDC) are extremely difficult to interpret due to the vague and broad definition of “autism spectrum disorders,” a situation the CDC has done little to resolve despite many years and billions of dollars in funding for autism research.
I would suggest that much of this misuse is deliberate, to achieve a political effect, e.g. as with the research methods linked to at the top.