Making sense of polls and predictions - Is this even possible?
by Tom Cloyd - 8 min. read - (reviewed 2020-09-02: 0900 PDT)
Predicting the future challenges us in many ways. So does evaluating predictions! What follows is a case study in the latter, since we are approaching a critical event in American history - the November 2020 federal election. We want to know - in advance, please - what the outcome will be. But do we really? The prophet, ignored, is an ancient story. Ancient, too, is the problem of discriminating false from true prophets. We have here “old news”, not to be confused with “fake news”! Let’s look briefly at this problem of evaluating predictions.
On this page…
- A necessary preliminary issue: What is a “model”? ^
- Modern prophecy in the US: The case of the 2020 pandemic ^
- Today’s annoyance - a map filled with red states!??? ^
- Step one: Get some context ^
- Step two: Look at the study itself ^
- Step three: Back to the ad hominem heuristic, but with caution ^
- Step four: Knock, knock - Is anyone home? ^
- Summing up… ^
- NOTES ^
A necessary preliminary issue: What is a “model”? ^
In modern times, all prediction models are based on the idea that as one or more variables change, other variables (the ones we’re really interested in) change. When these variables are separated by a time gap, we have the potential for causation. Variable A changes, then at some point after that Variable B changes. From this, we may infer (though not always correctly) that A causes B. In the real world, this “time-series” relationship is never perfect. Indeed, it is to get away from the oversimplifications to which our minds are inclined that we look at real world data.
Here is a rather good description of how a model is actually built: “A model is at its core a math equation with some inputs. When you are building a model, you create the basic structure of the model…then train the model on existing data.”1 The word “train” might better be “tune”, because exposure of the model to real data allows us to weight the variables in the model so that the model is as good a fit as possible to the data. At that point, we may use the model for predictions of the future.
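To make the "equation with some inputs" idea concrete, here is a minimal sketch of tuning a model on existing data and then using it to predict. Everything in it is an assumption made for illustration: the one-input straight-line equation, the made-up numbers, and the function names `fit_line` and `predict`. It is not how any particular election model works.

```python
from statistics import mean


def fit_line(xs, ys):
    """Tune the model y = intercept + slope * x to the data by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    # How much the input varies, and how much input and outcome vary together.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx                   # the weight given to the input variable
    intercept = y_bar - slope * x_bar   # the baseline when the input is zero
    return intercept, slope


def predict(model, x):
    """Apply the tuned equation to a new input value."""
    intercept, slope = model
    return intercept + slope * x


# Hypothetical "existing data": five past inputs and the outcomes that followed.
past_inputs = [1.0, 2.0, 3.0, 4.0, 5.0]
past_outcomes = [2.1, 3.9, 6.2, 7.8, 10.1]

model = fit_line(past_inputs, past_outcomes)   # "train" (or "tune") on the past
forecast = predict(model, 6.0)                 # then predict an unseen case

print(f"intercept={model[0]:.2f}, slope={model[1]:.2f}, forecast={forecast:.2f}")
```

The fit is only as good as the assumption that the future will resemble the data the weights were tuned on, which is exactly why evaluating a prediction means looking at the data and the method, not just the forecast.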
Modern prophecy in the US: The case of the 2020 pandemic ^
When public health experts predicted in 2012 that a pandemic was the only truly grave threat our country faced, you were stunned, were you not? Then, in 2015, when an interview with Bill Gates resulted in the prediction that a pandemic was easily the most predictable major disaster that could afflict us2, you promptly rearranged your life to prepare for the inevitable.
No? Well, you have a thundering herd of company, including me, because, well, prophets can be so annoying. And it’s so easy to just “change the channel”.
In 2017, “…a week before inauguration day, …Lisa Monaco, Barack Obama’s outgoing homeland-security adviser, gathered with Donald Trump’s incoming national-security officials and conducted an exercise modeled on the administration’s experiences with outbreaks of swine flu, Ebola, and Zika. The simulation explored how the U.S. government should respond to a flu pandemic that halts international travel, upends global supply chains, tanks the stock market, and burdens health-care systems—all with a vaccine many months from materializing.”2
The Trump administration’s response? In 2018, they disbanded “…a National Security Council directorate at the White House charged with preparing for when, not if, another pandemic would hit the nation.”3 What we ignore will no longer bother us, as any 3-year-old knows instinctively.
This works…until it doesn’t.
Today’s annoyance - a map filled with red states!??? ^
This morning we have Dawn Watts, one of our most prolific, productive, and valuable contributors at the BOCT Facebook Page,4 calling our attention to a seriously annoying article which predicts a Trump victory in the November 2020 election - by a landslide!5 I was just waking up, still lying in bed, and already I’m having to deal with Trump - and coffee was nowhere in sight. This is more annoying than an alarm clock going off across the room!
I want to walk you through what happened next…
Step one: Get some context ^
Fast-forward a few minutes, and I’m looking at the Facebook page of “The Colorado Herald”, the source of the posted article.6 There, I read that it is “Run by private citizens.” The alternative to that would be…what? Run by squirrels? Republicans? The government? Corporations? This is a bit odd, but I already have a hunch, and I soon see that I am right.
On this page is a “Related pages” section. The first entry is “Team Trump”.
The top post on the “Colorado Herald” page is a very long annotated timeline for the COVID-19 pandemic. Reading several of the annotations, it’s clear that the author believes this matter to be vastly overblown, and that the main players in the story - Dr. Fauci and the media among them - have some kind of agenda (which is never specified in the annotations, so far as I could see). A quick scan of other posts on the page reveals the same point of view: the pandemic is a media creation driven by a lot of “fear-mongering”.
OK, that’s a bit of context. Let’s look for more.
I search the Internet for “The Colorado Herald”. Nothing comes up other than that Facebook page. So whoever these people are, they appear to be attempting to present their material as journalism, but there is no real “journal”, much less a masthead. Anyone can present themselves as a journalist. It is not a licensed profession anywhere that I know of!
Moving on to the “Helmut Norpoth” post on the BOCT Facebook page… Click on the image and you go to a “Colorado Herald” Facebook page dated August 22 (2020). All the information there, other than the comments stream, is in Dawn’s post.
There are two links at the bottom of the article. The first is to a Wikipedia page about Dr. Norpoth. I used to work for Wikipedia, and left in disgust. I am wary about anything I read there, although it can still be quite useful if one retains a healthy skepticism (which you should about anything you read anywhere!). Scanning this page, I see that Norpoth appears to be a legitimate academic at a well-known university - Stony Brook University (SBU), on the north shore of Long Island in New York state. This is the “flagship” school of the “State University of New York system”.7
What we have to this point is the result merely of a quick search for authoritativeness. We know that this article with the red-states-rule!-map is not coming from the New York Times. We also know that Norpoth is a validated academic. But this is only useful if we’re using the common heuristic which serious scholars of logic advise us (correctly) is irrelevant to sound reasoning - the ad hominem heuristic8. We all use this quick-and-dirty method of evaluating a statement, hoping for “quick” but not “dirty”. It gets us by, usually. But we can do better.
Step two: Look at the study itself ^
The other link at the bottom of the “Herald” article takes us to what is basically a press release on the SBU website9. It contains much detail about Norpoth’s model, and tells us that it’s been under development, and tested against real outcomes, for years. There are links to the model’s website, and to other secondary material which would tell one a lot about how the model works. The press release tells us that the model is driven primarily by early state primary election results, which Norpoth has found to be quite predictive. It also tells us that the model has been modified since the last federal election to focus on Electoral College outcomes rather than general election outcomes, given what happened in 2016. But this is still a tertiary report10 - not close enough to the relevant research.
To be explicit, what that means is that to honestly evaluate this model, we need to look at the source data and how it was analysed. I won’t do that (even assuming he makes the data available, which these days is considered a proper thing to do). I have limited time, and interest, and I am more interested in the broader context of this model, which is to say how this model is viewed by other people interested in models.
Step three: Back to the ad hominem heuristic, but with caution ^
What do people who are prediction professionals think of this model, I’m wondering? (I.e., why do the work of thinking when you can let others think for you?)
I am now going very explicitly to employ the ad hominem heuristic! The trick to making this work, of course, is to be discriminating in your choice of hominems (“…Latin…hominem, accusative of homo ‘man’”).11 Fortunately, I can be.
Because of very real time constraints in my life, I’m going to look at a single much-esteemed source: Nate Silver. If you are not familiar with him, look briefly at the lead section of Wikipedia’s article on him.12 It’s well-sourced and I trust it. You’ll see that his reputation is well-established and quite good.
Silver’s website13 (he runs it, but sold it a while back to ESPN) is where I’m headed at this point. Prediction is what this website is all about, and I visit it regularly. I understand, technically, what they’re doing (because of my own statistical education) and I find it both respectable and fascinating.
My fundamental question: Has Silver anything to say about Norpoth? If Norpoth’s model is such a success, surely he would.
Well, this gets interesting quickly. Norpoth has been mentioned in the past 5 years exactly twice on the 538 website.14 In neither case was his election outcome prediction model the reason. “We are not impressed”. Apparently neither was Silver.
Step four: Knock, knock - Is anyone home? ^
Resource limitations being what they are (unavoidable), I am now desperate. I search the general Internet for mention of Norpoth’s model. There are plenty of hits, and most of them appear to be on conservative news outlets and websites. But I’m looking for a critical review, not journalism employing cherry-picked sources.
Nate Ashworth15, writing in July of this year, gives us a nice evaluation of Norpoth’s model, but advises caution, given the date. It’s early in the election season to draw any real conclusions. Writing in the same week, DJ Drummond gives us a comparative analysis of four models, one of which is Norpoth’s.16 This is by far the most effortful treatment of comparative models that I could find, and well worth a read, but the writer clearly is under-educated in both logic and research method, and makes several errors in his analysis.
Summing up… ^
The next step, were I to take it, would be to search the research literature for reviews of election prediction models, if any exist - and I’m skeptical about this.
It is clear at this point, however, that these models get attention only when they hit the bullseye or glaringly get it wrong, prospectively or retrospectively.
This puts us right back at my original statement: to paraphrase - predicting the future is hard, and evaluating predictions is too. No wonder prophets get little respect!
If only we could be less curious about things as yet unseen…
NOTES ^
1. https://skepchick.org/2016/03/stop-panicking-there-isnt-a-98-chance-of-a-president-trump-future/ - This article is reassuring us, in 2016, that predictions of a Trump victory were nonsense. It’s well-written, and cautious, and also does NOT say that Trump won’t win. ^
10. A “primary” source or report, in research, is a dataset, eyewitness report, or other information source that is either “the thing itself” or something that is as close as we can get to it. A “secondary” report is a report that describes and analyses a primary source. A “tertiary” report talks about one or more primary and/or secondary sources. See: https://guides.library.cornell.edu/sources, https://subjectguides.esc.edu/researchskillstutorial/primary, and https://library.uncw.edu/guides/primary_secondary_and_tertiary_sources. ^
14. See https://fivethirtyeight.com/features/media-coverage-doesnt-actually-determine-public-opinion-on-the-economy/ and https://fivethirtyeight.com/features/romney-has-an-empathy-problem-but-so-do-most-republicans/ ^
☀ ☀ ☀