A statistics post I've been working on (long)

73west

Semi-Pro
I had a discussion earlier regarding how to define players' "primes", and there was a question of what one does with the data once it's calculated. The answer is that I've put together 11 statistical "tests" to compare great players. Within those 11, there are two kinds of statistics:
Career totals: Majors won, Major Fs reached, "big events" won, Overall Tournaments Won, Weeks at #1, Year End #1s, Wins over top 10 opponents
Prime Averages: Avg. Penetration at majors, Tournament Win%, Match Win %, Win % against top 10

I only intend these stats for comparing ATGs to each other. I have not looked at how they do comparing, say, Jim Courier to Andy Roddick, and I doubt they would work for players who were never major winners and never reached #1. These are ATG vs ATG statistics.

A few definitions:
Prime = the 8 year prime that I proposed, with some changes based on feedback from posters in those threads. All "Prime" statistics are based only on matches/tournaments players entered in their prime. Very important point: It is a fixed 8 year period because for this mathematical comparison to work, each player *has to* have the same time period. Comparing one person's 6 years to another's 10 does not work (it artificially biases the analysis). If you have questions about prime definitions, see this thread: https://tt.tennis-warehouse.com/index.php?threads/8-year-definition-of-prime-long.602135/
Big Events = Majors, Masters, YEC
Penetration = Matches Won / Tournaments Entered
Tournament Win % = Titles Won / Tournaments Entered
Match Win % = Matches Won / Matches Played
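
For anyone who wants to reproduce the three Prime averages from their own match list, here is a minimal Python sketch. The `Match` record, its field names, and the idea that a prime is 8 whole calendar years starting at `prime_start` are my assumptions for illustration, not necessarily how the spreadsheet behind this post is built. A title is counted as a won final.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    # Hypothetical record layout, one row per match played.
    year: int
    tournament: str   # (year, tournament) identifies one tournament entry
    won: bool         # True if the player won this match
    is_final: bool    # True if this match was the tournament final


def prime_averages(matches, prime_start, prime_years=8):
    """Penetration, Tournament Win % and Match Win % over a fixed prime window."""
    # Assumption: the prime covers whole calendar years.
    prime = [m for m in matches
             if prime_start <= m.year < prime_start + prime_years]
    if not prime:
        raise ValueError("no matches inside the prime window")

    entered = {(m.year, m.tournament) for m in prime}      # tournaments entered
    matches_won = sum(m.won for m in prime)
    titles = sum(m.won and m.is_final for m in prime)      # won finals

    return {
        "penetration": matches_won / len(entered),         # Matches Won / Tournaments Entered
        "tournament_win_pct": titles / len(entered),       # Titles Won / Tournaments Entered
        "match_win_pct": matches_won / len(prime),         # Matches Won / Matches Played
    }
```

Restricting the input to majors before calling the function would give the "Avg. Penetration at majors" version of the same number.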

In the charts below, anything in green is a career total stat and anything in blue is a Prime stat. Anything in bold is related to Majors only.

So here is the raw data for 12 ATGs of my lifetime, in chronological order
tennisrawnumbers.png

Here's how it looks if you rank each player in each stat, 1-12
tennisranks.png
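
The 1-12 ranking step can be sketched in a few lines of Python. This is only an illustration: the function name is made up, and the tie rule (tied players share the better rank, so "1, 2, 2, 4") is an assumption, since the chart may break ties differently.

```python
def rank_players(values, higher_is_better=True):
    """Map each player to a rank for one stat (1 = best).

    `values` is a dict of player -> stat value.
    Assumption: tied players share the better rank ("1, 2, 2, 4").
    """
    ordered = sorted(values.values(), reverse=higher_is_better)
    # list.index returns the first position of a value, which gives
    # every tied player the same (best available) rank.
    return {player: ordered.index(v) + 1 for player, v in values.items()}
```

With placeholder numbers, `rank_players({"Player A": 0.9, "Player B": 0.8, "Player C": 0.8, "Player D": 0.5})` gives A rank 1, B and C rank 2, and D rank 4. Running it once per stat column produces the full rank grid.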


I have no intention of doing a mathematical average or weighted average to calculate who is the best. Many have done rankings already: Elo, Ultimatetennisstatistics, etc. But a few things jump out, with the obvious caveat that it's hard to statistically compare players from one era to another. There are two main areas where the statistics don't compare well:

Majors: Connors, Borg and McEnroe played most of their careers in a 3 major world, while most of the rest played 4. It makes Major comparisons misleading
Win %: When the top 4 in tournament win % are 4 of the 5 oldest players in the sample, it suggests what most of us believe, that the lack of depth on the tour made it much easier to regularly win all the lesser tournaments one entered.

Having said that:
- If you try to identify Tier 1, Tier 2 and Tier 3, Tier 3 is the one that is easy to identify. Agassi, Becker, Wilander and Edberg regularly fall way behind the others. The only other player to finish 10th or worse even in one stat is Bjorn Borg. Clearly, those 4 players, in some order, are the 9-12 in this group of 12.
- Federer's averaging 6 wins per major entered in his prime is one of the most stunning stats on the grid.
- Connors and Lendl winning half of all the tournaments they entered in their prime is the other utterly stunning stat. Impressive even with the depth factor I mentioned above.
- There's been a lot of talk about how the courts have progressively played more and more like each other over the years, that clay vs grass vs hard court is not as stark a contrast as it once was, for example. Evidence of that, perhaps: from Connors until Sampras, only Borg was able to win more than 8 majors. Connors 8, Mac 7, Wilander 7, Lendl 8, Edberg 6, Becker 6. 6-8 was greatness. Since then: 14 for Sampras, 20 for Federer, 16 for Nadal, 12 for Djokovic. That so many have since passed Borg is consistent with the courts being more homogeneous. Also, and not on the chart: we went decades without anyone winning the career grand slam (though Connors did win on all surfaces). Now, the last 4 ATGs have all done it: Agassi, Federer, Nadal and Djokovic. More evidence of homogeneity of surfaces, perhaps.
- Lendl's consistency and longevity really stand out with this kind of analysis more than with the metrics we often use when casually discussing tennis. For example, he loses every comparison to Sampras wrt Majors. But he wins 5/7 metrics that are not specifically related to majors. Only two players win more than half the comparisons with Ivan Lendl: Novak Djokovic and Roger Federer. In a way, Sampras's legacy vs Lendl's is a great test of criteria. Sampras does great in the criteria that almost everyone agrees are the most critical: winning majors, and year end #1. Lendl killed it at the criteria that people don't look at first: consistency at majors, consistency at other tournaments, winning other tournaments … and he did a lousy job turning weeks at #1 into year end #1. He spent 100 weeks more than Mac at #1, but has no edge in YE #1. He spent only 16 weeks fewer than Sampras at #1, but has 2 fewer YE #1s.
- There is not a single statistic in which Nadal tops Federer. In fact, Djokovic wins most of the matchups against Nadal.
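
The head-to-head counts in the last two bullets (who "wins" more of the stats against whom) come down to a simple tally. A rough Python sketch, assuming each player's stats are held in a dict where higher is always better; the function name and the stat keys are placeholders, not the real spreadsheet columns:

```python
def compare(a, b):
    """Tally one player's stats against another's.

    `a` and `b` are dicts of stat name -> value (assumed: higher is better).
    Returns (stats a wins, stats b wins, ties), using only shared stats.
    """
    shared = a.keys() & b.keys()
    a_wins = sum(a[s] > b[s] for s in shared)
    b_wins = sum(b[s] > a[s] for s in shared)
    return a_wins, b_wins, len(shared) - a_wins - b_wins
```

For example, with made-up values, `compare({"s1": 5, "s2": 0.8, "s3": 10}, {"s1": 7, "s2": 0.8, "s3": 4})` returns `(1, 1, 1)`. A player "wins more than half the comparisons" against another when the first number exceeds half the count of shared stats.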

I also started looking by surface. These stats are prime stats only. I only looked at these:
Match Winning %age
Match Winning %age in "Big Events" (Majors, Masters 1000 or equiv, YEC)
Tournament winning %age
Top 10 Ws
Top 10 match win %age

Clay, Hard Court, Grass, Carpet (I did not split indoors/outdoor, red vs blue, fast vs slow)
Raw numbers
tennissurfaceraw.png

(if that's too small: http://iblogforcookies.com/tennissurfaceraw.png)

And ranking, 1-12
tennissurfacerank.png

(http://iblogforcookies.com/tennissurfacerank.png)
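
The per-surface split is the same match win % calculation grouped by surface. A small Python sketch, assuming each prime match has already been reduced to a `(surface, won)` pair; that input shape and the function name are my own choices for illustration:

```python
from collections import defaultdict


def match_win_pct_by_surface(matches):
    """Return surface -> fraction of matches won.

    `matches` is an iterable of (surface, won) pairs, e.g. ("clay", True).
    """
    played = defaultdict(int)
    won = defaultdict(int)
    for surface, is_win in matches:
        played[surface] += 1
        won[surface] += is_win   # True counts as 1, False as 0
    return {surface: won[surface] / played[surface] for surface in played}
```

Feeding it only "Big Events" matches, or only matches against top 10 opponents, would give the other per-surface percentages listed above.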

What jumps out:
- By these numbers, Borg does not do well on hard courts. It wasn't just failure to win the USO, it was overall not great results
- Only 4/12 players on this list are not in the bottom 3 on any of the surfaces: Connors, McEnroe, Federer and Djokovic. These are arguably your most consistent "all court" players. Only Mac on that list is at all surprising.
- The only person to be #1 by every metric on a surface: Nadal on clay, unsurprisingly. I was surprised by how poorly he does on the hard court metrics, given that he has won multiple slams and numerous Masters on HC.
- Shocking to me was how bad the overall metrics for Agassi are on grass. He made the Wimbledon SFs or better 5 times, including 1 W (outside his prime, though). Still, he entered 19 grass court tournaments in his life, won only 1 and made only 2 Fs, and some of those SF runs at Wimbledon were draw aided (Jacco Eltingh in the QF in 95, Gustavo Kuerten in 99, Nicolas Escude in 01).
- Federer is probably the only one who grades out top 3 on 2 of the 3 main surfaces
 

73west

Semi-Pro
you might want to change Fed's major count to 20 :)
I realized I'd forgotten to add yesterday's match, so I had to edit it. Thanks.

How long did it take? In one sense, not much time. In one sense, forever.

For a while now, I've kept a database with a single line for every match played by one of the 12 ATGs of my lifetime (+ Andy Murray): year, tournament, opponent, opponent's rank, score, etc. And I've got an Excel file with lots of pivots and lookups and such. Almost all of it comes directly from the ATP website, with the ultimatetennisstatistics site for some missing data. So it took a long time to get that in shape, but now that it is in shape it takes no time at all to update.
 

Harry_Wild

G.O.A.T.
You should also post this in the Former Pro Player forum. This type of topic is discussed there almost on a daily basis, and it would probably get more interest too. It would be ammo for further discussions.
 