The post-combine running back success model

Recently, I’ve written a few posts on pre-combine success models for wide receiver and running back prospects and applied those models to the 2016 draft class. Some consensus top prospects, like Laquon Treadwell, Michael Thomas and Devontae Booker, fared poorly in the analysis, while prospects further down current draft boards, such as Leonte Carroo and Paul Perkins, found themselves near the top. The models weren’t meant to produce definitive rankings, but to show how comparable the 2016 prospects are to prior rookies who found early success in the NFL.

Now that the NFL combine has wrapped up, we have athletic measurables to go along with the age and production data we used in our pre-combine models. Again, the post-combine models use logistic regression to answer a binary question: Will a particular prospect find early NFL success?

You can define success many ways, but I’m choosing to use a top-12 fantasy point season (PPR) for running backs and a top-24 season for wide receivers. The model’s dependent variable, early NFL success, is whether or not a player had at least one such season within his first three years in the NFL.
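As a rough illustration of that dependent variable, the label could be derived from a player's season-by-season positional finishes. This is a minimal sketch, not the code behind the model; the function name and the input format (a list of end-of-season PPR ranks at the position, in career order) are assumptions.

```python
def early_success(season_ranks, top_n=12, window=3):
    """Return 1 if any of the first `window` seasons was a top-`top_n` finish, else 0.

    season_ranks: positional PPR finishes in career order (hypothetical input
    format); None marks a season without a ranked finish.
    """
    first_seasons = season_ranks[:window]
    return int(any(rank is not None and rank <= top_n for rank in first_seasons))


# A back who first breaks out in year four is still labeled 0 under this definition.
print(early_success([30, 18, 25, 4]))  # 0
print(early_success([40, 9, 15]))      # 1
```

For wide receivers, the same function would be called with `top_n=24`.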

The major takeaway from the updated models is that combine measurables matter very differently for running backs and wide receivers. For wide receivers, none of the combine measurables were statistically significant when added to age and production numbers, whereas weight and the 40-yard dash were the most significant variables in the running back model. The wide receiver findings are bolstered by similar conclusions from the Harvard Sports Analysis Collective. They also fit the theory that talent has more influence on production for wide receivers than for running backs, who can be stuck behind a poor offensive line or never given enough opportunity, since usually only one running back is on the field. You can look back at the age and production model for 2016’s wide receivers for insight; there isn’t anything new to present for wide receivers post-combine.

We used age, production, and combine measurables to train and test the updated running back model. The model used 330 running back prospects who entered the NFL from 2000-2013, with the data split roughly 2-to-1 into training and testing sets.
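A 2-to-1 split like that could look something like the sketch below. The article doesn't say how prospects were assigned to each set, so the random shuffle, the fixed seed and the function name are all assumptions; an exact 2-to-1 split of 330 prospects would put 220 in training and 110 in testing.

```python
import random


def split_two_to_one(prospects, seed=0):
    """Shuffle a copy of the prospects and split it roughly 2-to-1 into (train, test)."""
    shuffled = list(prospects)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = round(len(shuffled) * 2 / 3)
    return shuffled[:cut], shuffled[cut:]


prospect_ids = range(330)  # placeholder IDs standing in for the 330 prospects
train, test = split_two_to_one(prospect_ids)
print(len(train), len(test))  # 220 110
```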

After plugging more than a dozen production and combine statistics into the model and removing the least statistically significant one at a time, we were left with four (two combine, two production) that provide the most explanatory and predictive power, listed here in order of statistical significance:

1. 40-yard dash

2. Weight

3. Final season rushing yards per game

4. Final season receiving yards per game
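The one-at-a-time removal described above is commonly called backward elimination. Here is a minimal sketch of the loop, assuming the stopping rule is simply "keep four variables" (the article doesn't state the actual rule). The `fit_p_values` argument is a hypothetical stand-in for whatever routine refits the logistic regression and reports a p-value for each variable.

```python
def backward_eliminate(features, fit_p_values, keep=4):
    """Drop the least significant feature one at a time until `keep` remain.

    fit_p_values: caller-supplied function (hypothetical) that refits the model
    on the given feature names and returns {feature_name: p_value}.
    """
    remaining = list(features)
    dropped = []
    while len(remaining) > keep:
        # Refit on every pass, because p-values shift as variables are removed.
        p_values = fit_p_values(remaining)
        weakest = max(remaining, key=lambda name: p_values[name])
        remaining.remove(weakest)
        dropped.append(weakest)
    return remaining, dropped
```

The function returns the surviving variables along with the order in which the others were dropped.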

As you’d expect, the model favors faster, heavier prospects who had strong rushing and receiving production in their final college season. The 40-yard dash is by far the most influential statistic for predicting NFL success, followed by weight. Presumably skewed by the fact that we’re using PPR scoring to measure success, receiving yards per game had a model coefficient nearly three times that of rushing yards per game. In other words, each additional receiving yard per game moves the prediction nearly three times as much as each additional rushing yard per game.
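To make the coefficient comparison concrete: in a logistic regression, each coefficient shifts the log-odds of success, and the logistic function turns those log-odds into a score between 0 and 1. The sketch below uses made-up coefficients and a made-up baseline, since the fitted values aren't published here; only the roughly 3-to-1 ratio comes from the text.

```python
import math

# HYPOTHETICAL coefficients: only their roughly 3-to-1 ratio comes from the article.
COEF_RUSH_YPG = 0.01
COEF_REC_YPG = 0.03


def logistic(z):
    """Map log-odds to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def log_odds_change(extra_rush_ypg=0.0, extra_rec_ypg=0.0):
    """Shift in log-odds from extra per-game production, all else held equal."""
    return COEF_RUSH_YPG * extra_rush_ypg + COEF_REC_YPG * extra_rec_ypg


baseline = -2.0  # hypothetical log-odds for some prospect
p_more_rushing = logistic(baseline + log_odds_change(extra_rush_ypg=30.0))
p_more_receiving = logistic(baseline + log_odds_change(extra_rec_ypg=10.0))

# With a 3-to-1 ratio, 10 extra receiving yards per game move the score
# as much as 30 extra rushing yards per game.
print(math.isclose(p_more_rushing, p_more_receiving))  # True
```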

The model’s accuracy rate for correctly predicting early NFL success or failure on the test set was roughly 92 percent, slightly better than using NFL draft position as the independent variable.
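An accuracy rate like this is the share of test-set prospects whose predicted outcome matches what actually happened. The sketch below assumes a score of 0.5 or higher counts as a predicted success; the cutoff actually used isn't stated, and the scores and outcomes in the example are toy values, not model output.

```python
def accuracy(scores, outcomes, threshold=0.5):
    """Share of prospects whose thresholded score matches the actual outcome."""
    if len(scores) != len(outcomes):
        raise ValueError("scores and outcomes must be the same length")
    predictions = [int(score >= threshold) for score in scores]
    correct = sum(pred == actual for pred, actual in zip(predictions, outcomes))
    return correct / len(outcomes)


# Toy values for illustration only (the 0.5 threshold is an assumption).
toy_scores = [0.9, 0.2, 0.6, 0.1]
toy_outcomes = [1, 0, 0, 0]
print(accuracy(toy_scores, toy_outcomes))  # 0.75
```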

Here are the top 10 prediction scores for the entire 2000-2013 data set.

Name	College	Year	Draft Position	Weight	Forty	Rush Yds/Gm	Rec Yds/Gm	Top12	Predict
Chris Johnson	East Carolina	2008	24	197	4.24	109.5	40.6	1	0.74
Matt Forte	Tulane	2008	44	217	4.44	177.2	23.5	1	0.61
Rashard Mendenhall	Illinois	2008	23	225	4.41	129.3	24.5	1	0.61
DeMarco Murray	Oklahoma	2011	71	213	4.37	86.7	42.4	1	0.57
Kevin Jones	Virginia Tech	2004	30	227	4.38	126.7	12.4	1	0.56
Michael Turner	Northern Illinois	2004	154	237	4.49	137.3	19.2	0	0.54
Adrian Peterson	Oklahoma	2007	7	217	4.40	144.6	19.4	1	0.52
Darren McFadden	Arkansas	2008	4	211	4.33	140.8	12.6	1	0.52
Latavius Murray	Central Florida	2013	181	223	4.38	100.5	21.0	1	0.49
J.J. Arrington	California	2005	44	214	4.40	168.2	10.1	0	0.47

The “Predict” column gives the model score (between 0 and 1) for each prospect, indicating the likelihood of a top-12 PPR season in his first three years, and the “Top12” column indicates whether or not the running back actually had one. Draft position is listed in the table only for reference; it was not part of the model.

You can see that our post-combine model did a great job predicting early NFL success, hitting on eight of its top-10 scores.
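That hit count can be checked directly against the table. A small sketch, with the names, Top12 flags and Predict scores transcribed from the ten rows above:

```python
# (name, Top12, Predict) copied from the top-10 table.
top_10 = [
    ("Chris Johnson", 1, 0.74),
    ("Matt Forte", 1, 0.61),
    ("Rashard Mendenhall", 1, 0.61),
    ("DeMarco Murray", 1, 0.57),
    ("Kevin Jones", 1, 0.56),
    ("Michael Turner", 0, 0.54),
    ("Adrian Peterson", 1, 0.52),
    ("Darren McFadden", 1, 0.52),
    ("Latavius Murray", 1, 0.49),
    ("J.J. Arrington", 0, 0.47),
]

# A "hit" is a back who actually posted a top-12 PPR season in his first three years.
hits = sum(top12 for _, top12, _ in top_10)
misses = [name for name, top12, _ in top_10 if top12 == 0]

print(f"{hits} of {len(top_10)} hit")  # 8 of 10 hit
print(misses)                          # ['Michael Turner', 'J.J. Arrington']
```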

The two model misses were Michael Turner and J.J. Arrington. Turner was a top-12 PPR running back multiple times later in his career, and it’s very likely that his lackluster early career had more to do with the fact that he was a fifth-round draft pick, and the corresponding absence of opportunity, than any lack of talent.

None of the backs with the top five prediction scores were drafted in the first 20 picks, but they all found early-career success. The model was even able to uncover Latavius Murray, a sixth-round pick who was largely ignored by NFL talent evaluators.

The model also loves actual top-10 draft picks, like Adrian Peterson and Darren McFadden, both of whom tested with incredibly fast 40-yard dash times.

Later this week, we’ll use this model to predict the likelihood of success for the 2016 running back class. Until then, you can peruse this sortable table of the 2000-2013 data set for running back prospects with the relevant variables and prediction scores.

Kevin Cole is a Lead Writer for PFF Fantasy. You can follow him on Twitter at @Cole_Kev

[table id=1171 /]