Analyzing 538's Democratic Primary Analysis

I get into extended discussions about politics on Facebook. Sorry, not sorry. Recently, I got into an in-depth analysis of Bernie’s chances going forward. Given his big wins in Washington, Hawaii, and Alaska, I pointed to this great project on fivethirtyeight.com to see how much he had caught up.

“Who’s on Track for the Nomination?” describes itself like this:

Tracking a candidate’s progress requires more than straight delegate counts. We’ve estimated how many delegates each candidate would need in each primary contest to win the nomination. See who’s on track and who’s falling behind.

It’s great and you should check it out. It will serve as the source of data in this post. In short, 538’s targets have Hillary ahead at this point. Her actual lead is bigger than those targets call for, but is it enough?

My initial argument was that Bernie has moved up monotonically from 81% of his projected target to 92% since late February and in that respect is doing well.
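To make "percent of projected target" concrete, here is a rough Python sketch (not the R code linked at the end of this post). It assumes the percentage is cumulative delegates won divided by the cumulative 538 target after each contest. The helper name `pct_of_target` and the numbers are made up for illustration; they are not real delegate counts.

```python
from itertools import accumulate


def pct_of_target(won, target):
    """Cumulative delegates won as a percentage of the cumulative target,
    reported after each contest (contests given in chronological order)."""
    cum_won = accumulate(won)
    cum_target = accumulate(target)
    return [100.0 * w / t for w, t in zip(cum_won, cum_target)]


# Placeholder numbers for illustration only (NOT real delegate counts).
won = [8, 20, 30]
target = [10, 20, 25]

pcts = pct_of_target(won, target)

# "Moved up monotonically" means each value is at least the previous one.
is_monotone = all(a <= b for a, b in zip(pcts, pcts[1:]))
```

With real data, `pcts` would be the series that went from 81% to 92%.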

I (perhaps too frivolously?) brushed aside arguments that he isn’t polling well in New York, given that he was polling in the low teens 3 weeks before the Illinois primary. He forced a draw here. It’s now 3 weeks before the New York primary.

I’ve ceded the argument that Bernie is over-performing in caucus states and that those are for the most part gone.

Analogies were made to Bernie being down late in the game. “Even though he’s gaining ground, he’s not gaining fast enough.” We are now through 55% of the delegates. This is equivalent to being in the top of the 6th inning. So what does Hillary’s bullpen look like relative to her starting pitching? I think the answer to that is, “Not good.”

Caveat emptor: The problem with all of these quick analyses is that there are thousands of variables in the world and we’re likely to find at least a few that look very predictive… just by chance. Correlation is not causation and all that. You’ve been warned.


I decided to plot fivethirtyeight’s target delegate counts against the delegates each candidate actually won. Each data point represents a state (or territory). The line running through the middle represents a candidate getting exactly as many delegates in a state as they would need to stay on track for the nomination. Anything over that line means they over-performed in that state. Anything under the line means they under-performed. The colors correspond to liberal (blue), conservative (red), and swing states (purple). We get this plot for Hillary’s performance:

[Figure: Target538_Clinton]

What’s surprising here is how close 538 has been to reality (check out Michigan!) with a few notable exceptions at the top end; for Hillary that’s Florida and Texas. This suggests to me that whatever model the authors (Aaron Bycoffe and David Wasserman) used to construct these targets was actually pretty good.
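The over/under reading of the plot comes down to a per-state comparison of actual delegates against the target. A minimal Python sketch of that comparison follows; the state names and counts are placeholders, and `performance` is a hypothetical helper, not something from the linked repo.

```python
def performance(target, actual):
    """Return (difference, label) for one state.

    A positive difference puts the point above the diagonal (over-performed);
    a negative one puts it below (under-performed)."""
    diff = actual - target
    if diff > 0:
        label = "over"
    elif diff < 0:
        label = "under"
    else:
        label = "on target"
    return diff, label


# Placeholder states as (target, actual); NOT real delegate counts.
states = {
    "State A": (50, 58),
    "State B": (40, 33),
    "State C": (20, 20),
}

results = {name: performance(t, a) for name, (t, a) in states.items()}
```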

The other thing that pops out is that Hillary appears to be over-performing in purple and red states. She isn’t doing as well in blue states. To highlight this, I fit very simple linear models separately for the blue and the red/purple states. Those models are represented by the red and blue lines around the black one:

[Figure: Target538_Clinton_RB]

Bernie’s plot shows something similar:

[Figure: Target538_Sanders_RB]
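For anyone curious what "separate linear models" could look like, here is a small Python sketch. It assumes ordinary least squares with an intercept, regressing actual delegates on target delegates within each group; the exact model form used for the plots isn't stated here, so treat this as one plausible reading. The rows are placeholders, not real results.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Placeholder rows as (color, target, actual); NOT real delegate counts.
rows = [
    ("blue", 10, 9), ("blue", 20, 18), ("blue", 30, 27),
    ("red", 10, 12), ("purple", 20, 24), ("red", 30, 36),
]

# Split into the two groups used in the plots: blue vs. red/purple.
groups = {"blue": [], "red/purple": []}
for color, target, actual in rows:
    key = "blue" if color == "blue" else "red/purple"
    groups[key].append((target, actual))

# Fit one line per group.
fits = {}
for key, pts in groups.items():
    xs = [t for t, _ in pts]
    ys = [a for _, a in pts]
    fits[key] = fit_line(xs, ys)
```

A slope above 1 means the candidate tends to beat targets in that group of states, and a slope below 1 means the candidate tends to fall short.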

Whether or not it holds up remains to be seen. It’s a small sample size, but it does pass the sniff test. More liberal states support Bernie… sounds about right. If this trend holds up then it raises the question…

Are the remaining primary states closer to Texas (red), Florida/Ohio/Michigan (purple), or Illinois/Washington (blue)? The biggest upcoming states are Wisconsin (purple), New York (blue), Maryland (blue), Pennsylvania (purple), Indiana (red), California (blue), and New Jersey (blue). I wrote some R code to do all of this analysis (see below), so it was trivial to calculate the exact answer.

This is where things get crazy. Given the criticisms of the DNC’s handling of the primary process, maybe I shouldn’t have been as surprised.

Blue Shift

57% of the pledged delegates have already been voted on and awarded to their respective candidates. Of those, a whopping 83% have come from red or purple states! Only 17% have come from blue states. Of the remaining 43% of delegates that have yet to be voted on, 66% come from blue states!
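Those percentages can be combined with a little arithmetic to see how back-loaded the blue states are. The Python sketch below uses only the rounded figures quoted above, so the results are approximate; the variable names are mine.

```python
# Rounded shares quoted in the text above.
voted_frac = 0.57            # share of pledged delegates already awarded
blue_share_voted = 0.17      # share of awarded delegates from blue states
blue_share_remaining = 0.66  # share of remaining delegates from blue states

remaining_frac = 1 - voted_frac

# Blue-state delegates as a fraction of ALL pledged delegates.
blue_voted = voted_frac * blue_share_voted
blue_remaining = remaining_frac * blue_share_remaining
blue_total = blue_voted + blue_remaining

# Fraction of blue-state delegates that are still up for grabs.
blue_still_open = blue_remaining / blue_total
```

With these rounded inputs, blue states account for roughly 38% of all pledged delegates, and about three quarters of those blue-state delegates have not been awarded yet.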

The electorate that is about to vote comes from substantially more liberal states. If the trend of Bernie over-performing in liberal states holds, then this could be good news for Bernie. I ran the numbers to see whether outperforming his blue-state targets by the estimated 11% would be enough for him to win.

It’s not. He’s lost too much ground in the red and purple states. He’d end up with 1945 delegates, about 80 short of the nomination. He’d have to either do better in the red and purple states or beat his targets in the blue states by 25%. Neither is easy.
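The shape of that calculation can be sketched in Python as follows. This is a simplified stand-in, not the R code from the repo: it assumes Bernie hits his targets exactly in the remaining red and purple states and beats them by a fixed percentage in the blue ones. The function names are hypothetical and every number below is a placeholder, so the outputs are not the 1945 or 25% figures above.

```python
def project_total(won_so_far, remaining, blue_boost):
    """Projected final delegate count.

    `remaining` is a list of (color, target) pairs. Blue-state targets are
    scaled by (1 + blue_boost); all other targets are assumed to be hit exactly."""
    total = won_so_far
    for color, target in remaining:
        factor = (1 + blue_boost) if color == "blue" else 1
        total += target * factor
    return total


def required_blue_boost(won_so_far, remaining, threshold):
    """Blue-state over-performance needed to reach `threshold`,
    holding red/purple states at their targets."""
    blue = sum(t for c, t in remaining if c == "blue")
    other = sum(t for c, t in remaining if c != "blue")
    return (threshold - won_so_far - other) / blue - 1


# Placeholder inputs for illustration only (NOT real delegate counts).
won_so_far = 100
remaining = [("blue", 60), ("purple", 30), ("red", 10)]
threshold = 220

projected = project_total(won_so_far, remaining, blue_boost=0.10)
needed = required_blue_boost(won_so_far, remaining, threshold)
```

Comparing `projected` against `threshold` answers "is the boost enough?", and `needed` answers "how big a boost would it take?".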

Or maybe this is all just noise. Who knows? Only 5 blue states have voted so far; 10 more are still to go. There’s a lot that could happen. Either way, it should continue to be an exciting primary season.


The code and data used to generate the plots can be found here: https://github.com/TroyHernandez/Targets538