Tuesday, July 14, 2015

A Stat-Based QB Ranking System

I recently concluded an eight-part series on the greatest quarterbacks in the history of professional football. Those rankings were subjective, based on everything I know about the players: stats, awards and honors, coaching and teammates, team success and postseason performance, reputation, the eye test, and so forth.

But I also have a method for classifying quarterbacks statistically. I actually published the results of this formula three months ago, but without revealing the process that produced those results. A number of readers were curious about my methodology, and in this post, I'll finally explain how the sausage gets made. The math is not complicated — you don't need a stats background to understand this — but there's a lot of it: you could calculate most of this with a pencil and paper, but by the end, you're going to want a spreadsheet.

I'm a fan of baseball analytics — I've even done some writing on sabermetrics — and I'm convinced that this kind of project must be based on linear weights. The all-time ranking process includes some additional steps, but here's the basic formula, which I call QB-TSP: Quarterback Total Statistical Production (I call all my stat-based rating systems TSP).

Passing Yards - Sack Yards - (constant * (Attempts + Sacks) ) + Completions + (20 * Passing Touchdowns) - (40 * Interceptions) + (0.5 * Rushing Yards) + (20 * Rush Touchdowns) - (20 * Fumbles)

The constant approximates replacement level and varies depending on the era: it moves to reflect quality of competition, changes in the game, and the level of passing efficiency. Here are the values I use presently:

AAFC, 1946-49: 4.0
NFL, 1946-49: 3.5
NFL, 1950-69: 3.0
AFL, 1960: 4.0
AFL, 1961: 3.9
AFL, 1962: 3.8
AFL, 1963: 3.7
AFL, 1964: 3.6
AFL, 1965: 3.5
AFL, 1966: 3.4
AFL, 1967: 3.3
AFL, 1968: 3.2
AFL, 1969: 3.1
NFL, 1970-77: 3.0
NFL, 1978: 3.25
NFL, 1979-94: 3.5
NFL, 1995-2003: 4.0
NFL, 2004-08: 4.5
NFL, 2009-present: 5.0

Obviously, this is not a precise valuation of replacement level. But it's a functional approximation, and there are additional tweaks (see below) that smooth it out across eras. The value varies according to league: NFL, AFL, AAFC. But even within the post-merger NFL, the expected value of a pass attempt rose in the late '70s, due to the Mel Blount rule and eased restrictions on offensive linemen. The value increases again with the 1995 expansion (and widespread adoption of the West Coast Offense), the 2004 illegal contact policy, and the 2009 defenseless receiver rules. All of these adjustments correspond to noticeable differences in QB production and efficiency; TSP scores remain relatively stable even when the constant changes. In a normal year, 90-95% of qualified passers will have a positive score.
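For readers who would rather script this than keep a lookup table in a spreadsheet, here is a minimal Python sketch of the era constants listed above. The values are copied from the list; the function name `replacement_constant`, the table layout, and the error handling are illustrative assumptions, not part of the original method.

```python
# Era constants copied from the list above. The lookup function is a
# hypothetical helper for illustration only.
NFL_CONSTANTS = [
    # (first_year, last_year, constant); last_year None means "present"
    (1946, 1949, 3.5),
    (1950, 1969, 3.0),
    (1970, 1977, 3.0),
    (1978, 1978, 3.25),
    (1979, 1994, 3.5),
    (1995, 2003, 4.0),
    (2004, 2008, 4.5),
    (2009, None, 5.0),
]


def replacement_constant(league: str, year: int) -> float:
    """Return the replacement-level constant for a league and season."""
    league = league.upper()
    if league == "AAFC" and 1946 <= year <= 1949:
        return 4.0
    if league == "AFL" and 1960 <= year <= 1969:
        # The AFL constant falls by 0.1 per season, from 4.0 in 1960 to 3.1 in 1969.
        return round(4.0 - 0.1 * (year - 1960), 1)
    if league == "NFL":
        for first, last, constant in NFL_CONSTANTS:
            if year >= first and (last is None or year <= last):
                return constant
    raise ValueError(f"No constant defined for {league} {year}")
```

For example, `replacement_constant("NFL", 2014)` gives the 5.0 used in the Rodgers example that follows.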

Let's use Aaron Rodgers' 2014 season as an example of the system in action.

Passing Yards - Sack Yards - (constant * (Attempts + Sacks) ) + Completions + (20 * Pass TDs) - (40 * Interceptions) + (0.5 * Rushing Yards) + (20 * Rush TDs) - (20 * Fumbles)

Rodgers passed for 4,381 yards, with only 174 sack yards lost. That's +4207. Rodgers threw 520 passes and took 28 sacks. That's 548 * -5 = -2740, updating his TSP to +1467. Rodgers completed 341 passes (+341), with 38 pass TDs (+760) and only 5 INTs (-200), bringing his score to +2368. Rodgers rushed for 269 yards (+134.5) with 2 TDs (+40), and he fumbled 10 times (-200). That's -25.5, so his total TSP for the year was +2342.5. That's an excellent score, by far the highest of any quarterback in 2014.
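The same arithmetic can be written as a short Python sketch. The formula and Rodgers' 2014 numbers come from the text above; the `SeasonLine` class and its field names are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class SeasonLine:
    """One quarterback season. Field names are illustrative, not from the source."""
    pass_yards: int
    sack_yards: int
    attempts: int
    sacks: int
    completions: int
    pass_tds: int
    interceptions: int
    rush_yards: int
    rush_tds: int
    fumbles: int


def qb_tsp(s: SeasonLine, constant: float) -> float:
    """Basic QB-TSP formula, term by term as written in the article."""
    return (
        s.pass_yards
        - s.sack_yards
        - constant * (s.attempts + s.sacks)  # replacement-level charge per dropback
        + s.completions
        + 20 * s.pass_tds
        - 40 * s.interceptions
        + 0.5 * s.rush_yards
        + 20 * s.rush_tds
        - 20 * s.fumbles
    )


rodgers_2014 = SeasonLine(
    pass_yards=4381, sack_yards=174, attempts=520, sacks=28, completions=341,
    pass_tds=38, interceptions=5, rush_yards=269, rush_tds=2, fumbles=10,
)
print(qb_tsp(rodgers_2014, constant=5.0))  # 2342.5
```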

This system rewards both production and efficiency. In 2014, Cam Newton had an 82.1 passer rating and 7.8% sack rate, on 486 dropbacks. His backup, Derek Anderson, posted a 105.2 rating and 4.0% sack rate on 101 dropbacks. Newton also rushed for 539 yards and 5 TDs, while Anderson's rushing was negligible. Newton was a slightly below-average passer, and he scores a 729 TSP, 20th in the NFL. Anderson's passing stats were far above average, but in about 1/5 the attempts, and without any rushing value. He scores 316, the 30th-ranked QB of the season (between Charlie Whitehurst and Geno Smith). Newton's massively higher production (3,366 net yards, 23 TDs) earns him a higher score than Anderson's greater efficiency in very limited action (708 net yds, 5 TD).

The opposite can apply, as well: Tony Romo (465 dropbacks, 3474 net passing yards) had a higher TSP than Drew Brees (688 dropbacks, 4766 net passing yards) or Matt Ryan (659 dropbacks, 4489 net passing yards). TSP balances production and efficiency: a player won't have a great score unless he was both efficient and a driving factor in his team's offense.

Comparing QB-TSP Across Eras

Even with the differing constant in the TSP formula, era adjustments are necessary. Shorter seasons are pro-rated to 16 games. Furthermore, each season is assigned a value, based on the top 10 in QB-TSP, or the top 5 during the 1950s. Instead of using league average, this focuses on the best players, and it's not distorted by bottom-outliers. I calculate the average of the top 10, and average that with the median score among the top 10.

Continuing to use 2014 as an example, the year's score is derived from an average of Aaron Rodgers (2343), Ben Roethlisberger (2097), Peyton Manning (2002), Andrew Luck (1862), Tony Romo (1684), Drew Brees (1676), Matt Ryan (1582), Tom Brady (1542), Russell Wilson (1493), and Eli Manning (1393). That average (1767.0) provides 50% of the value assigned to 2014. The remaining 50% comes from the median within that group (Romo and Brees, averaged): 1680. So the value for 2014 is (1767 + 1680) / 2 = 1723.5 (actually, 1723.3, because Romo had 1683.5). Incorporating the median helps to reduce distortion from positive outliers.
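As a rough sketch of that season-value step in Python: the function name `season_value` and the `top_n` parameter are assumptions for illustration. Because the ten scores listed above are rounded to whole numbers, this reproduces the published 1723.3 only to within about a point.

```python
from statistics import mean, median


def season_value(scores, top_n=10):
    """Average the mean and the median of the top_n QB-TSP scores in a season.

    top_n=10 is the default described in the article; the text says the
    top 5 is used for the 1950s, which would be top_n=5 here.
    """
    top = sorted(scores, reverse=True)[:top_n]
    return (mean(top) + median(top)) / 2


# Rounded 2014 scores as listed in the article.
top10_2014 = [2343, 2097, 2002, 1862, 1684, 1676, 1582, 1542, 1493, 1393]
value_2014 = season_value(top10_2014)
```

Passing a full list of qualifying passers works too, since the function sorts and truncates before averaging.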

1723.3 is an unusually high value for a single season, but I use a weighted five-year rolling average to calculate the adjustment for a given season. So for the 1975 season, I use (5 * 1975) + (3 * (1974 + 1976)) + (1973 + 1977), divided by 13, where each year stands for that season's value. When I don't have five years of data, I use as much adjustment as possible: 2014's value is ((5 * 2014) + (3 * 2013) + 2012) / 9. That comes to 1642.1. The five-year average ensures that players aren't punished for having a good season in a year when other players also had a good season, unless it's a trend across multiple years, indicating something about the league-wide passing environment.
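Here is one way the 5-3-1 weighting could be sketched in Python. The function name `rolling_value` is hypothetical, and the season values in the example are round placeholder numbers, not real data. The handling of missing years (drop them and divide by the remaining weights) is inferred from the 2014 example above, where the divisor falls from 13 to 9.

```python
def rolling_value(season_values: dict, year: int) -> float:
    """Weighted rolling value: weight 5 for the season itself, 3 for each
    adjacent season, 1 for each season two years away. Seasons with no data
    are skipped and the divisor shrinks to match."""
    if year not in season_values:
        raise KeyError(f"No season value for {year}")
    weights = {0: 5, -1: 3, 1: 3, -2: 1, 2: 1}
    total = 0.0
    weight_sum = 0
    for offset, weight in weights.items():
        value = season_values.get(year + offset)
        if value is not None:
            total += weight * value
            weight_sum += weight
    return total / weight_sum


# Placeholder season values, NOT real data, chosen to make the math easy to check.
placeholder = {1973: 1000.0, 1974: 1100.0, 1975: 1200.0, 1976: 1300.0, 1977: 1400.0}
middle = rolling_value(placeholder, 1975)  # all five seasons available, divisor 13
latest = rolling_value(placeholder, 1977)  # only 1975-1977 available, divisor 9
```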

To obtain a player's era-neutralized TSP, I multiply his raw TSP by 1750, then divide it by the value for that season. So in 2014, that's: (TSP * 1750 / 1642.1). Using Rodgers as an example, his QB-TSP was 2342.5. His era-neutralized TSP is 2496.4. The 1750 figure is generous, so it's normal for the era-neutralized TSP to be higher than the raw figure.

Rather than using era-neutralized TSP on its own, I blend 1/4 of the raw TSP with 3/4 of the era-neutralized TSP. Retaining some of the raw TSP rewards players who actually did more. If you played in an era when quarterbacks played a larger role in the team's offense, the system reflects that. This final figure is era-adjusted TSP. For Rodgers, we use (.25 * 2342.5) + (.75 * 2496.4) = 2457.9. That ranks 50th all-time:

[Chart: top 100 single-season era-adjusted QB-TSP scores]

For those keeping track, the top 100 includes seven seasons of Peyton Manning; five each of Dan Marino, Joe Montana, Johnny Unitas, and Norm Van Brocklin; four each of Ken Anderson, Drew Brees, Otto Graham, and Steve Young; and three each of Tom Brady, John Brodie, Dan Fouts, and Fran Tarkenton. Those 13 players account for 55 of the top 100.

The chart above ends at 100, but the top 200 includes: 11 Manning, 8 Marino, 8 Montana, 8 Tarkenton, 7 Unitas, 6 Brees, 6 Graham, 6 Sonny Jurgensen, 6 Roger Staubach, 6 Van Brocklin, and 6 Young.
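The era-neutralization and blending steps described before the chart can be sketched in a few lines of Python. The 1750 baseline, the 25/75 split, and Rodgers' numbers come from the text; the function and constant names are assumptions for illustration.

```python
BASELINE = 1750    # scaling figure from the article
RAW_WEIGHT = 0.25  # share of raw TSP kept in the final score


def era_neutral_tsp(raw_tsp: float, rolling_season_value: float) -> float:
    """Scale raw TSP by the baseline over that season's rolling value."""
    return raw_tsp * BASELINE / rolling_season_value


def era_adjusted_tsp(raw_tsp: float, rolling_season_value: float) -> float:
    """Blend 1/4 raw TSP with 3/4 era-neutralized TSP."""
    neutral = era_neutral_tsp(raw_tsp, rolling_season_value)
    return RAW_WEIGHT * raw_tsp + (1 - RAW_WEIGHT) * neutral


# Aaron Rodgers, 2014: raw TSP 2342.5, rolling season value 1642.1.
rodgers_neutral = era_neutral_tsp(2342.5, 1642.1)
rodgers_adjusted = era_adjusted_tsp(2342.5, 1642.1)
```

Rounded to one decimal place, these come out to the 2496.4 and 2457.9 quoted earlier.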

As a general guide:

* Anything under 500 TSP is an inconsequential season. The quarterback had very limited playing time, played poorly, or both. 2014 examples: Derek Anderson, Derek Carr.

* 500 era-adjusted TSP is a bad starter or a good backup. 2014 examples: Nick Foles, Drew Stanton.

* 1000 era-adjusted TSP is an average season. The player had some value to his team, but he wasn't a Pro Bowl-quality performer. 2014 examples: Alex Smith, Andy Dalton.

* 1500 era-adjusted TSP is a good season, a top-10 season, a borderline Pro Bowl season. This is a clear positive contribution to any player's résumé. 2014 example: Russell Wilson.

* 2000 era-adjusted TSP is a great season. There are 149 Modern Era seasons that meet that standard, a little more than two per year. The player will almost always make the Pro Bowl, and he'll usually generate some all-pro support. 2014 example: Andrew Luck.

* 2500 era-adjusted TSP is an exceptional season. There are only 45 such seasons in the 69-year Modern Era, so these seasons only occur about twice every three years. About half of them were named league MVP, and most were first-team all-pro. 2014 example: Aaron Rodgers.

* 3000 era-adjusted TSP is a legendary season. The player always wins MVP, and these are seasons that educated fans know about: Marino in '84, Young in '94, Peyton in '04.

Using QB-TSP For Multiple Years

Simply adding up a player's era-adjusted TSP is not an effective method to determine his career value: it rewards compilers and does not reflect what we think of as greatness. For anything over four or five years, I use an additional step.

I started doing this years ago, for RB-TSP. It works better for running backs than quarterbacks, but it can apply to any "value over replacement" rating system. When we talk about the best players ever, part of what we mean is, who was the best at his best? Someone who started for 20 seasons but was never the best in the league accrued a lot of value, but he wouldn't be in the discussion for best of all time. Using TSP as a frame of reference, one 3000-point season is worth much more than three 1000-point seasons. A player who performs at that level gives his team an excellent chance of reaching the championship game, and for the length of a season, he was as good as anyone we've ever seen. Compared to three average seasons, well, there is no comparison.

To acknowledge that, I employ exponents. Take the player's era-adjusted TSP, divide it by 1000, and then raise it to the 1.5 power: (EA_QB_TSP / 1000) ^ 1.5. For Aaron Rodgers in 2014, this is 2.458 ^ 1.5 = 3.9. Rodgers' career value by year, starting in his rookie season of 2005: 0.0, 0.0, 0.0, 2.2, 2.7, 2.6, 5.0, 2.7, 1.3, 3.9. Summing those yields 20.4, Rodgers' career value in my rating system. This is the 20th-ranked score of the Modern Era, between Roger Staubach (22.4) and Boomer Esiason (19.9). I published the top 125 career ratings in April.
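A minimal Python sketch of the exponent step follows. The divide-by-1000 and 1.5 power come from the text; the function names are hypothetical. Treating a zero or negative era-adjusted TSP as worth 0.0 is an assumption on my part, since a negative number raised to the 1.5 power is not a real number and the article does not say how such seasons are handled.

```python
def season_points(era_adjusted_tsp: float, exponent: float = 1.5) -> float:
    """Career-value points for one season: (era-adjusted TSP / 1000) ^ 1.5."""
    if era_adjusted_tsp <= 0:
        # ASSUMPTION: non-positive seasons contribute nothing.
        return 0.0
    return (era_adjusted_tsp / 1000) ** exponent


def career_value(era_adjusted_seasons) -> float:
    """Sum the per-season points over a career."""
    return sum(season_points(tsp) for tsp in era_adjusted_seasons)


# Rodgers' 2014 era-adjusted TSP from the article.
rodgers_2014_points = season_points(2457.9)

# One 3000-point season versus three 1000-point seasons.
one_legendary = career_value([3000])
three_average = career_value([1000, 1000, 1000])
```

Under this sketch the single 3000-point season is worth about 5.2 and the three average seasons are worth 3.0 combined, which is the gap the exponent is meant to create.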

Let's revisit the guide above:

* 500 era-adjusted TSP is a bad starter or a good backup. This translates to roughly 0.35 in the exponent formula.

* 1000 era-adjusted TSP is an average season. This is worth 1.0 toward career value.

* 1500 era-adjusted TSP is a good season. This is worth about 1.8 in the career ratings.

* 2000 era-adjusted TSP is a great season. This will earn 2.8 in the exponent formula, almost three times as much as a 1000-TSP season.

* 2500 era-adjusted TSP is an exceptional season. This is worth 4.0 toward career value.

* 3000 era-adjusted TSP is a legendary season. This is worth 5.2. Dan Marino's 1984 season, the best of all time, scores 7.2.

The exponent method prevents compilers from dominating the career rankings. Average and slightly-below-average seasons have some value, but not very much, whereas great seasons raise a player's score very quickly. The ^ 1.5 adjustment seems a little conservative to me, and I will likely increase it the next time I revise the formula.

To review, we began with the basic QB-TSP formula: Pass Yds - Sack Yds - (constant * (Att + Sacks) ) + Comp + (20 * (Pass TD + Rush TD - Fumble) ) - (40 * INT) + (0.5 * Rush Yds). We obtain an era-neutral rating: TSP * 1750 / (5-Year Rolling Value), then use 25% of QB-TSP and 75% of era-neutral TSP to produce the era-adjusted TSP score. This value is divided by 1000, then raised to the 1.5 power. Sum the value for each season, and this is the career rating I use.

This is not the same ranking I used for the Top 101 QBs series. That was a subjective ranking, incorporating stats but not relying upon them exclusively. In fact, of the 39 ranked players in that project, only three have the same placement in the article series as in the stat-based method outlined above. This system is limited to regular-season stats, and is not sufficient to evaluate a player's career. But it does help me to put players' regular-season stats into context.

Most stat-based systems lean too heavily toward either efficiency or production, and I believe this method balances the two. I like that this system recognizes some value in seasons that are below average but above replacement level, which is essential in evaluating quarterbacks who don't rank among the very best of all time. Players like Drew Bledsoe, Dave Krieg, and Kerry Collins, who were starters for a long time, do reasonably well in this formula, without dominating the list based on their gross production. The exponent layer ensures that the top of the list is dominated by players who actually had exceptional seasons.

I don't claim this system is perfect, because it clearly is not. But I know some readers have been curious about the way I evaluate quarterback stats. This is the way.