How we misinterpret election polls
Liberal leader Justin Trudeau, Conservative leader and Prime Minister Stephen Harper and New Democratic Party (NDP) leader Thomas Mulcair (L-R) talk before the Munk leaders' debate on Canada's foreign policy in Toronto, Canada September 28, 2015. Canadians go to the polls in a federal election on October 19, 2015. REUTERS/Mark Blinch
In the torrent of polls reporting different results about the federal election campaign, many of us in the media have either forgotten or never learned the most basic rule of interpreting data.
That rule is knowing the difference between “correlation” and “causation”.
Never mind that some polls show the Liberals in first and others the Tories, although there seems to be a consensus that the NDP is in third.
That’s not the problem. Different polls have different methodologies and produce different results.
As long as the pollster explains how the data was gathered, the questions that were asked and who paid for the poll (if someone other than the pollster), people can decide for themselves if they think it’s credible.
The problem is that as soon as a poll comes out, many in the media and even some in the polling industry instantly interpret what it means and how it can be explained by recent news events and controversies surrounding the election campaign.
This is where the problem of not knowing, or ignoring, the difference between correlation and causation arises.
As a result, the public gets all kinds of erroneous analysis on what is influencing public opinion.
You can only credibly explain what is driving polling numbers if you ask people, carefully, why they’re voting the way they are and why they have -- or haven't -- changed their minds.
Simply asking people how they’re going to vote isn’t enough.
When the media try to explain the “horse race” numbers (who’s ahead, who’s behind) according to news events and controversies related to the election campaign, it’s often their own wishful thinking that is in play, not analysis based on data or empirical evidence.
The great American conservative thinker Thomas Sowell sums up the difference between correlation and causation, and the damage caused by confusing them, in his book The Vision of the Anointed: Self-Congratulation as a Basis for Social Policy. As he writes:
“One of the first things taught in introductory statistics textbooks is that correlation is not causation. It is also one of the first things forgotten. Where there is a substantial correlation between A and B, this might mean that:
1.) A causes B
2.) B causes A
3.) Both A and B are results of C or some other combination of factors.
4.) It is a coincidence.”
When people interpreting the data don’t understand the difference between correlation and causation, Sowell notes, they “almost invariably choose one of the first two patterns of causation (A causes B or B causes A), depending on which is more consistent with (their vision), not which is more consistent with empirical facts.”
I often use this simple illustration -- it’s not original and I can’t recall where I first read it -- when I’m talking about climate change, to explain the problem of not understanding the difference between correlation and causation.
The number of pirates in the world is decreasing.
Global warming is increasing.
Therefore, to fight global warming, become a pirate.
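For readers who like to see the arithmetic, here is a minimal Python sketch of the pirate problem. Every number in it is invented for illustration: the series `pirates` and `temperature`, their starting values, slopes and noise levels are assumptions, not real data. Each series is built to depend only on the passage of time (Sowell's “C”), never on the other. The two should still come out strongly correlated, and the correlation should all but vanish once the shared time trend is removed.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(xs, ys):
    """What is left of ys after subtracting its least-squares line on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [y - (mean_y + slope * (x - mean_x)) for x, y in zip(xs, ys)]


# Synthetic, made-up data: both series are driven by the year alone.
rng = random.Random(0)  # fixed seed so the run is repeatable
years = list(range(200))
pirates = [1000 - 4 * t + rng.gauss(0, 20) for t in years]
temperature = [14 + 0.01 * t + rng.gauss(0, 0.1) for t in years]

# Correlation between the two series as they stand.
r_raw = pearson(pirates, temperature)

# Correlation after the common driver (time) is taken out of both.
r_detrended = pearson(residuals(years, pirates),
                      residuals(years, temperature))

print(f"raw correlation:       {r_raw:.2f}")
print(f"after removing trend:  {r_detrended:.2f}")
```

Run as written, the first figure should be strongly negative and the second close to zero. That is the third pattern in Sowell's list: neither series causes the other, yet anyone looking only at the first number could tell a confident story about pirates and the climate.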
Adding to the confusion, the polls in this election are saying different things -- some have the Liberals first, others the Tories -- so any claim about what is actually influencing public opinion is even more likely to be wrong.
Which doesn’t mean it will stop. So let the buyer (and reader and listener) beware.