
Clustering the college QBs

Predicting QB success in the NFL is a very challenging problem: among other difficulties, it's hard to define what "success" is, and hard to properly account for the fact that earlier picks are given many more opportunities than later picks.

So I decided to look at the problem from a different angle, and instead ask the question:

Which current NFL QBs are Matt Stafford, Mark Sanchez, and Josh Freeman most similar to, based on their college stats?

One way to try to answer this question is to apply a clustering method to the data. Here's what I did:

1. Get the final-year college stats (Completion %, Yards, Yards/Att, Int, TD, Rating, Attempts/Game, Yards/Game) from QBs drafted in 2005-2007, as well as from Stafford, Sanchez, and Freeman

2. Apply the k-means clustering algorithm to the normalized statistics. Basically, the algorithm figures out the groupings which yield the smallest within-group variances.
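The two steps above can be sketched roughly as follows. This is a minimal pure-Python illustration, not the code used for this post: the function names (`zscore_columns`, `kmeans`) are made up for the example, "normalized" is assumed to mean z-scoring each stat, and because the per-QB numbers are not listed here, the six group-average rows from the results in this post stand in as input data.

```python
import math
import random

# Stand-in data: the six group-average rows reported in this post
# (Pct., Yards, Yards.Att, Int, TD, Rating, Att.G, Yards.G).
# The real input would be one row per quarterback; those rows are not
# listed in the post, so these are placeholders only.
ROWS = [
    [65.2, 3555.2, 8.8, 9.0, 31.0, 159.2, 31.5, 281.5],
    [59.5, 2988.7, 7.4, 7.9, 28.2, 138.2, 38.0, 287.5],
    [60.5, 3511.1, 7.4, 17.0, 26.0, 133.5, 38.6, 292.0],
    [65.8, 2588.1, 9.3, 7.0, 24.0, 164.5, 24.8, 213.3],
    [58.6, 2356.4, 7.7, 7.0, 16.0, 136.5, 25.6, 204.1],
    [62.0, 2546.1, 7.3, 9.0, 19.3, 133.5, 31.7, 223.9],
]


def zscore_columns(rows):
    """Rescale every column to mean 0 and (population) SD 1, so that
    large-valued stats like Yards do not dominate the distances."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two rows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, seed=0, max_iter=100):
    """Basic k-means (Lloyd's algorithm) from a random start.
    Returns (labels, total within-cluster sum of squares)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # k distinct rows as starting centers
    labels = None
    for _ in range(max_iter):
        # Assign every row to its nearest center.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:  # nothing moved: converged
            break
        labels = new_labels
        # Move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its old center
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    wss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, wss


z = zscore_columns(ROWS)
labels, wss = kmeans(z, k=2)
print(labels, round(wss, 2))
```

The quantity `wss` is the "within-group variance" idea from step 2: the algorithm keeps reassigning rows and moving centers until that total stops improving.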

Here are the results. The algorithm requires that you pre-specify a number of clusters; I chose six. Reported for each group are its members as well as the average college statistics for that group:

"Group 1 : Brady Quinn, Jason White, Matt Leinart, Matthew Stafford, Mark Sanchez"
     Pct.     Yards Yards.Att       Int        TD    Rating     Att.G   Yards.G
     65.2    3555.2       8.8       9.0      31.0     159.2      31.5     281.5

"Group 2 : Andrew Walter, Jay Cutler, Kyle Orton, Omar Jacobs"
     Pct.     Yards Yards.Att       Int        TD    Rating     Att.G   Yards.G
     59.5    2988.7       7.4       7.9      28.2     138.2      38.0     287.5

"Group 3 : Dan Orlovsky, Derek Anderson, John Beck, Jordan Palmer, Kevin Kolb"
     Pct.     Yards Yards.Att       Int        TD    Rating     Att.G   Yards.G
     60.5    3511.1       7.4      17.0      26.0     133.5      38.6     292.0

"Group 4 : Aaron Rodgers, Alex Smith, Jason Campbell, Troy Smith, Vince Young"
     Pct.     Yards Yards.Att       Int        TD    Rating     Att.G   Yards.G
     65.8    2588.1       9.3       7.0      24.0     164.5      24.8     213.3

"Group 5 : Brodie Croyle, D.J. Shockley, David Greene, Isaiah Stanback, JaMarcus Russell, Reggie McNeal, Trent Edwards"
     Pct.     Yards Yards.Att       Int        TD    Rating     Att.G   Yards.G
     58.6    2356.4       7.7       7.0      16.0     136.5      25.6     204.1

"Group 6 : Brad Smith, Bruce Gradkowski, Charlie Frye, Charlie Whitehurst, Jeff Rowe, Kellen Clemens, Josh Freeman"
     Pct.     Yards Yards.Att       Int        TD    Rating     Att.G   Yards.G
     62.0    2546.1       7.3       9.0      19.3     133.5      31.7     223.9

Observations:

- Stafford and Sanchez profile similarly, and closely resemble (in terms of college stats) Quinn and Leinart. Good completion %, lots of yards, TDs, etc.

- David Greene = JaMarcus Russell? Uh, OK. But the rest of that grouping seems to make sense.

- Group 4 is interesting. High completion %, but relatively low yards and attempts per game. Could call these guys the "dinkers". Interesting to see Aaron Rodgers in with a couple of highly-regarded busts and potential busts-to-be.

- Gradkowski. Frye. Clemens. Freeman. Yikes.

Caveats:

- The groupings aren't totally stable, since the algorithm isn't guaranteed to find the optimal solution; if I ran things again, the groupings might change a bit, but not dramatically. Same goes if you change the number of groupings; a few names might change groups, but the overall structure would be similar. For example, for all the settings I tried, Sanchez and Stafford ended up being grouped together.
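One common way to deal with this kind of instability is to rerun the algorithm from many random starting points, keep the run with the smallest within-cluster sum of squares, and count how often two items land in the same group. The sketch below illustrates that idea only; it is not the analysis from this post, the helper names (`restarts`, `together_rate`) are invented for the example, and the points are made-up 2-D toy data rather than quarterback stats.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, init_centers, max_iter=100):
    """Lloyd's algorithm from the given starting centers.
    Returns (labels, total within-cluster sum of squares)."""
    centers = [list(c) for c in init_centers]
    k = len(centers)
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its old center
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    wss = sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, wss


def restarts(points, k, n_restarts=50, seed=0):
    """Run k-means from n_restarts random starts; return every result."""
    rng = random.Random(seed)
    return [kmeans(points, rng.sample(points, k)) for _ in range(n_restarts)]


def together_rate(runs, i, j):
    """Share of runs in which points i and j share a cluster."""
    return sum(labels[i] == labels[j] for labels, _ in runs) / len(runs)


# Made-up toy points (three tight pairs), NOT quarterback data.
POINTS = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0),
          (5.1, 5.0), (10.0, 0.0), (10.1, 0.0)]

runs = restarts(POINTS, k=3)
best_labels, best_wss = min(runs, key=lambda r: r[1])
print("best within-cluster sum of squares:", best_wss)
print("share of runs with points 0 and 1 together:",
      together_rate(runs, 0, 1))
```

Even on data this clean, a bad start can get stuck: starting from the first three points leaves points 0 and 1 in separate clusters and lumps the other four together. A pair that stays together across nearly all restarts, as Sanchez and Stafford did here, is the more trustworthy kind of result.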


Comments


Interesting study.

There are a lot of groupings I didn’t expect. Of course a lot of it depends on the supporting cast and the type of play calling the coach does. If you have great RBs, that might be the reason for fewer attempts. Or if you’re a running QB.

My only wish to improve this study is to have a few QBs from the late ’90s or early 2000s (I suspect the scarcity of data was the reason for this), because so far, the only QBs that one can consider a success in the NFL are Aaron Rodgers and Jay Cutler, and to a lesser extent Jason Campbell, Kyle Orton, and maybe Derek Anderson.

I’m interested in group 1. I don’t think Leinart will be anything more than a Chad Pennington, but I like most of the other QBs in that cluster. I find it interesting that all of the QBs were highly touted, yet Jason White was unanimously predicted to be a long shot for success, which is why scouting is so valuable.

by LantermanC on Mar 31, 2009 10:33 AM PDT

Oh, and is there anything more polarizing than trying to predict

the success of a college QB trying to be an NFL QB? I can’t think of anything more interesting personally. So many factors involved, so many things to debate, so many different stats to look into and question, etc.

by LantermanC on Mar 31, 2009 10:34 AM PDT

By measuring yards and attempts per game

you are measuring the system the player came from, not the player’s innate ability. That’s why Shockley and Greene are grouped together and Leinart and Sanchez are grouped together.

I don’t think clustering quarterbacks by their stats produces a meaningful measure or profile of the players. Your “dinkers” group

Aaron Rodgers, Alex Smith, Jason Campbell, Troy Smith, Vince Young

couldn’t be much more different from each other in ability or profile. Therefore, this

Gradkowski. Frye. Clemens. Freeman. Yikes.

is as sensible as this

Andrew Walter, Jay Cutler, Kyle Orton, Omar Jacobs

yikes.

by John Morgan on Mar 31, 2009 11:02 AM PDT

I think you're probably right

Actually, I had a comment about Cutler in the first draft of the post, but removed it. And I think the Rodgers group is a bit weird as well; actually, he seemed to “jump” around clusters when I repeated the analysis more than any other QB.

I would agree that this isn’t a direct measure of a QB’s ability; but I still think it’s interesting to see these similarities/differences, as long as you don’t make a conclusion like OMG STAFFORD=LEINART=BUST!!

by cyberwulf on Mar 31, 2009 11:44 AM PDT

I get it then

I guess I’m getting touchy about Rosetta Stone quarterback projection stats. I fear that statistics are becoming the new jargon.

by John Morgan on Mar 31, 2009 11:50 AM PDT

Indeed

I’ve tried to throw in as many caveats as possible – for me, this was really about finding out which QBs had similar college stats to the current crop. If anything, these groupings illustrate the difficulty in projecting NFL performance from college stats; there are good (or at least decent) and bad QBs in almost every group.

by cyberwulf on Mar 31, 2009 1:10 PM PDT

I think that was a shot at me.

…and yes you seem touchy about it.

During an otherwise boring time of year, why is it such a touchy subject? You don’t think that NFL clubs “attempt” to do the exact same thing?

I’m certain that clubs prospecting these guys have stat matrices that make Russell Crowe’s deciphering (A Beautiful Mind) seem sophomoric.

by iverson2169 on Mar 31, 2009 11:26 PM PDT

No, don't take it personally.

I recently posted two articles, one by Walter Football, and one by ESPN.com, and both were on QBs. I didn’t really read them in much depth, and posted them since we were just beginning to talk about the possibility of drafting Stafford. However, in retrospect, they were not really good articles. One tried to reconcile why QBs fail, whether it was because of arm strength, the system, or the intangibles they had. The other had some convoluted stats meshing system that placed Matt Leinart very highly and didn’t have any rhyme or reason to its madness. It just so happened that most of the good QBs ended up in the good group, and most of the bad ones were below the arbitrary threshold.

by LantermanC on Apr 1, 2009 12:21 AM PDT

The one thing a blog entry cannot convey....

…is TONE. I have very thick skin and in no way have meant to convey in ANY post that I am offended by any comments. I viewed the “rosetta stone” comment as a playful jab at my fanpost and said as much. Had anyone been able to see me say it in person, it would have been delivered with a half smirk and a wink… no problems here at all.

As the owner of a garment factory in Khon Kaen, Thailand with 16,000 employees, I have been in absolute WARS with the biggest branded sports apparel companies in the world (the dreaded swoosh and their competition) and come out alive. My point? If I don’t take those encounters personally, I certainly don’t take anything personally on a Seahawks blog site (a damn good one at that). If it weren’t for differing opinions, we’d all be a pretty dull (and ignorant I might add) bunch.

I totally get John’s point. He thinks many of us are playing amateur draft scout and trying to create “paint by number” systems for drafting QBs that utilize stats samples that are too shallow. My response:

Of course we are. Nobody here is going to reinvent an NFL wheel, and none of us will ever touch an NFL GM chair. That is the whole joy of logging onto John’s site and taking time out of our days to make educated interactions with other rabid Hawks fans.

by iverson2169 on Apr 1, 2009 2:28 AM PDT

Entrepreneur huh?

Seems tough.

Yeah, I think what John means is that while these studies are usually interesting, have a healthy dose of skepticism when reading them (to everyone not just you or me) and don’t take every study to be a fact. Sometimes people just manipulate numbers or studies to say what they want to say.

by LantermanC on Apr 1, 2009 8:41 AM PDT

It is a challenge...

… but in these economic times, we are thriving. The reason is that the branded companies are all very interested in sustainability. Non-delivery due to a failed subcontractor would mean disaster in a rough economy. Because of this, smaller, less stable companies end up losing work, while the super-factories swell in size.

by iverson2169 on Apr 1, 2009 9:05 PM PDT

Don't be confused

I’m not at all mad at you, nor am I attempting to take swipes at you. I’m irked by the statistical mumbo jumbo created by people whose goal is to shock and amaze and forgo work on the way to expertise. I don’t think that’s you. I think it’s the people creating the suddenly dime-a-dozen quarterback projection systems.

by John Morgan on Apr 1, 2009 10:51 AM PDT

Why is everyone so down on Campbell?

It seems like every quarterback projection system puts Campbell in with the busts, but I don’t see it. I thought he was very good last year considering the odd-to-terrible playcalling.

by Nate Dogg on Mar 31, 2009 12:54 PM PDT
