Thanks to math and feature engineering, we can use natural language processing to compare prospects to their contemporaries and to those from the past, then tie in advanced descriptive stats that we have built previously to gauge how well players who fit a certain mold have performed in the NFL.
For this analysis, we took prospect write-ups from the past eight seasons (including 2022) by The Athletic's Dane Brugler, who is one of the best football film analysts out there, and used latent semantic analysis (LSA) to derive similarity scores between the text in prospects’ scouting reports.
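To make the LSA step more concrete, here is a minimal sketch of how similarity scores between scouting reports could be derived. The article does not describe its pipeline, so everything here is an assumption: the TF-IDF weighting, the truncated SVD via NumPy, the cosine similarity measure, the hypothetical function name `lsa_similarity` and its `n_components` parameter. The three report snippets are made-up placeholders, not Brugler's text.

```python
import re
from collections import Counter

import numpy as np


def lsa_similarity(reports, n_components=2):
    """Return a report-by-report cosine similarity matrix in LSA space."""
    # Tokenize each report into lowercase words.
    docs = [re.findall(r"[a-z']+", text.lower()) for text in reports]
    vocab = sorted({word for doc in docs for word in doc})
    index = {word: j for j, word in enumerate(vocab)}

    # Document-term count matrix (rows = reports, columns = words).
    counts = np.zeros((len(docs), len(vocab)))
    for i, doc in enumerate(docs):
        for word, n in Counter(doc).items():
            counts[i, index[word]] = n

    # TF-IDF weighting (assumed): words that appear in many reports count less.
    doc_freq = (counts > 0).sum(axis=0)
    idf = np.log((1 + len(docs)) / (1 + doc_freq)) + 1
    tfidf = counts * idf

    # Truncated SVD: keep only the strongest latent dimensions.
    u, s, _ = np.linalg.svd(tfidf, full_matrices=False)
    k = min(n_components, len(s))
    embedded = u[:, :k] * s[:k]

    # Cosine similarity between reports in the reduced space.
    norms = np.linalg.norm(embedded, axis=1, keepdims=True)
    unit = embedded / np.where(norms == 0, 1.0, norms)
    return unit @ unit.T


# Placeholder snippets for illustration only (hypothetical, not real reports).
reports = {
    "Prospect A": "reliable hands and strong blocker inline",
    "Prospect B": "strong inline blocker with reliable hands",
    "Prospect C": "big slot receiver who stretches the seam",
}
names = list(reports)
similarity = lsa_similarity(list(reports.values()), n_components=2)

# Print each prospect's similarity to Prospect A.
for name, score in zip(names, similarity[0]):
    print(f"Prospect A vs {name}: {score:.2f}")
```

In this toy setup, the two blocker-style reports should land much closer to each other than either does to the slot-style report. A real corpus spanning eight draft classes would need far more latent dimensions, and likely stop-word removal, before the scores mean anything.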
After building our dataset to span eight seasons, we can create a prospect's score in a number of ways. We decided to use a weighted average of similar players’ WAR (wins above replacement), using the similarity score derived above as the weights. For example, if a player has a 0.60 similarity score with a player who has earned 7.0 WAR since being drafted and a -0.30 similarity score with someone who has earned 4.0 WAR, his overall score would be (0.60 × 7.0) + (-0.30 × 4.0) = +3.0.
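The arithmetic in that example can be sketched in a few lines. Note that the +3 in the example corresponds to summing each comparison's WAR multiplied by its similarity score, without dividing by the total weight; whether the full model applies any further normalization is not stated, so this sketch simply reproduces the worked example. The function name `prospect_score` is hypothetical.

```python
def prospect_score(comparisons):
    """Similarity-weighted WAR score.

    `comparisons` is a list of (similarity, war) pairs, one per
    comparable player. Negative similarities pull the score down.
    """
    return sum(similarity * war for similarity, war in comparisons)


# The two comparisons from the worked example in the text.
example = [(0.60, 7.0), (-0.30, 4.0)]
score = prospect_score(example)
print(f"{score:+.1f}")
```

Because the weights can be negative, a prospect whose report reads as the opposite of a productive player's report is penalized, which is what drives the negative comparisons discussed for individual prospects.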
Using the analyses above, we can look at 2022 prospects in a couple of ways. First, we can examine player comparisons for notable prospects. Second, we can rank the players in each position group by the score derived above. These scores have correlated well with draft position and future WAR generated at the NFL level, although a more robust analysis using more seasons and data sources is beyond the scope of this article.
Let’s start by looking at the most successful NFL tight ends' text comparisons so that we can then see what that means for prospects in the 2022 class.
SUCCESSFUL TEXT ANALYTIC TRAITS
PLAYERS EXCEEDING THEIR DRAFT PEDIGREE
McBride is the only tight end who will likely get selected in the first two rounds of the draft. He has a 57 similarity score to former first-round pick T.J. Hockenson, who has looked the part of a top-five tight end through his three-year career. Two more intriguing comparisons in his top 10 are Hunter Henry and Dallas Goedert. Brugler compares him closely to Hayden Hurst, who is the 22nd closest in this study with a 14.4 similarity score. Hurst was a first-round selection in his own right, meaning mock drafters and other analysts could be lower on McBride than league circles are.
Dulcich projects as a “big slot” option according to Brugler, and our model likes slot terminology at the tight end position. Dulcich’s closest comparison is Cole Kmet with a 53 similarity score. The most intriguing comparison is Mark Andrews, who ranks third, while Noah Fant and Dawson Knox round out his top 10. The negative comparisons are Stephen Anderson, Kaden Smith and Zach Gentry.