Someone posted this in one of the comments, in case any readers missed it:
-------------------------------------
As Jason notes, this isn’t really how a t-test is meant to be used. What the p value tells you is how confident we can be that a person’s above average performance (or below average performance) is not just due to chance. The fact that we can be more confident about Peyton Manning than about Tom Brady (to pick two quarterbacks at random) that the above average comeback performance is not due to chance does not mean that Manning is better at comebacks than Brady. It’s mostly just a consequence of sample size, the same way that you can be very very confident that a coin is weighted if you flip it a billion times and get 51% heads, even though it’s only slightly biased.
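To make the coin example concrete, here is a rough sketch of how the same 51% heads looks at different sample sizes. It uses a normal approximation to test against a fair coin; the function names are made up for illustration and are not from the original article.

```python
import math

def z_score(heads_fraction, flips):
    """z-score of the observed heads fraction, assuming a fair coin (p = 0.5)."""
    standard_error = math.sqrt(0.5 * 0.5 / flips)
    return (heads_fraction - 0.5) / standard_error

def two_sided_p_value(z):
    """Two-sided p value under the normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2))

# Same observed bias (51% heads), very different sample sizes.
for flips in (100, 10_000, 1_000_000_000):
    z = z_score(0.51, flips)
    print(f"{flips:>13,} flips: z = {z:8.2f}, p = {two_sided_p_value(z):.3g}")
```

The size of the bias never changes, but the z-score goes from about 0.2 at 100 flips to 2 at 10,000 flips to over 600 at a billion flips. The p value tracks how much evidence there is, not how big the effect is.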
Another way of ranking comeback QBs, analogous to PAR or VOA, is to look at how many extra wins a QB earned for his team, compared to how the team would have done with an average QB. Just take the number of comeback wins the team got minus the number of wins it would have been expected to get with an average quarterback, given the number of comeback chances it had. The number of expected wins is just 31.3% of the number of comeback chances (since there were 603 successful comeback attempts and 1322 failed comeback attempts, and 603/1925 = 31.3%). Here are those numbers for the 20 quarterbacks whose records are given in the article. The quarterbacks’ records and their rankings using the p value method are in parentheses.
+6.4 Brady (13-8, 4)
+5.3 Bulger (10-5,
+4.3 Plummer (19-28, 1)
+4.0 Manning (19-29, 2)
+4.0 Testaverde (19-29, 2)
+3.5 McNabb (12-15, 7)
+3.1 Delhomme (10-12, 9)
+3.1 Fiedler (10-12, 9)
+3.1 Kitna (15-23, 5)
+2.3 Collins (17-30, 6)
-2.0 Holcomb (3-13, 154)
-2.2 Banks (5-18, 157)
-2.4 Rattay (1-10, 153)
-2.5 Warner (5-19, 159)
-2.6 Brunell (14-39, 162)
-2.8 Kanell (1-11, 155)
-3.0 George (2-14, 158)
-3.4 Ramsey (0-11, 156)
-3.5 Griese (4-20, 160)
-3.8 Beuerlein (5-23, 161)
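
For anyone who wants to check the numbers above, here is a minimal sketch of the calculation. The league-wide totals (603 successes, 1322 failures) and the three sample records come from the list above; the function and variable names are just placeholders.

```python
# League-wide comeback totals quoted above.
SUCCESSES = 603
FAILURES = 1322
LEAGUE_RATE = SUCCESSES / (SUCCESSES + FAILURES)  # about 0.313

def wins_above_average(wins, losses, rate=LEAGUE_RATE):
    """Comeback wins minus the wins an average QB would be expected to get."""
    chances = wins + losses
    return wins - rate * chances

# A few (wins, losses) comeback records taken from the list above.
records = {"Brady": (13, 8), "Manning": (19, 29), "Beuerlein": (5, 23)}

# Print from best to worst, in the same format as the list.
ranked = sorted(records.items(), key=lambda item: wins_above_average(*item[1]), reverse=True)
for name, (wins, losses) in ranked:
    print(f"{wins_above_average(wins, losses):+.1f} {name} ({wins}-{losses})")
```

Note that this ranks Brady (13-8) ahead of Manning (19-29) even though the p value method has them the other way around, because it measures the size of the effect instead of how sure we are that there is one.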