Dominance ratio vs. % of points won

Sysyphus

Talk Tennis Guru
[Trigger warning: this thread is directed primarily at those who are unduly interested in the nitpicky and trivial aspects of tennis stats. If that ain't you, this thread's probably not for you.]
———

As many of you already know, dominance ratio (DR) is a nifty little stat to show how dominant a player has been in a given match, season, or similar. It has been shown to explain a lot of the variation in players' win %, and is thought to be a good predictor of future results. On the face of it, it's pretty similar to looking at what % of points a player wins, but not quite. How is it calculated? By dividing the percentage of points you win against your opponent's serve by the percentage of points your opponent wins against your serve. Wouldn't that amount to pretty much exactly the same as just % of total points won? Not entirely – let's look at two hypothetical examples.

Say player A wins 70% of his service points and 35% of his return points. Assuming an equal number of points played on serve and return, he wins 52.5% of points overall. His DR would be 35/30 ≈ 1.17.

Player B wins 65% of service points and 40% of return points, which (assuming an equal number of serve and return points played) gives the same % of total points won. However, player B would end up with a lower DR of 40/35 ≈ 1.14.
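For anyone who wants to check the arithmetic, here is a minimal Python sketch of the two hypothetical players. The function names are made up for illustration, and the 50/50 split between serve and return points is the same assumption as in the examples.

```python
def dominance_ratio(spw, rpw):
    """DR = share of return points won / share of own service points lost."""
    return rpw / (1.0 - spw)


def total_points_won(spw, rpw):
    """Overall share of points won, assuming equal serve and return points."""
    return (spw + rpw) / 2.0


# (service points won, return points won) for the two hypothetical players
players = {"A": (0.70, 0.35), "B": (0.65, 0.40)}

for name, (spw, rpw) in players.items():
    tpw = total_points_won(spw, rpw)
    dr = dominance_ratio(spw, rpw)
    print(f"Player {name}: TPW = {tpw:.3f}, DR = {dr:.3f}")
```

Both players come out with the same total points won, but the one with the stronger serve gets the higher DR.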

Therefore, it looks like DR is, ceteris paribus, slightly skewed toward players with better serve stats. Now, I wasn't sure whether this slight skew meant that DR is slightly inaccurate compared to just % of points won, or whether it's actually a clever way of weighting for an actual advantage of stronger serving. Perhaps the ratio of return points won vs. service points lost tells us something important beyond what just total % of points won tells us. That was my hunch: that DR would be a slightly stronger predictor.

To check this, I ran a quick regression analysis, looking first at the connection between DR and match win %, then at the one between % of points won and match win %. Thanks to this excellent thread by Falstaff, where he shows that DR explains a lot of the variance in match win %, there was already a data set to look at, with half the data already plotted (the clay seasons of Nadal, Ferrer, Djokovic and Federer through 2012/2013), so I used that sample for convenience's sake.

What were the results? Dominance ratio explained 76.9% of the variation in winning percentage (like Falstaff found). The correlation between DR and match win percentage was .877.

Percentage of total points won explained 78.3% of the variation in match win percentage, and the correlation was .885.
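In a regression with a single predictor, the share of variation explained (R²) is simply the square of the correlation coefficient, so the two pairs of figures above can be sanity-checked against each other. A rough Python sketch follows; `pearson_r` is a hypothetical helper written for illustration, and the underlying season data is not reproduced here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Consistency check on the reported figures: r squared should match the
# quoted share of explained variation for each metric.
for label, r in (("DR", 0.877), ("% points won", 0.885)):
    print(f"{label}: r = {r}, r^2 = {r ** 2:.3f}")
```

With per-season lists of DR (or % of points won) and match win %, `pearson_r(metric, win_pct) ** 2` would give the explained-variation figure for that metric.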

DR
[Image: RvRD3Jm.png]

% points won
[Image: OTJiPgz.png]

As such, this seems to me to raise the question: does using dominance ratio really add anything useful compared to just using the simple nuts-and-bolts metric of % of points won? The explained variation and correlation are almost exactly the same. If anything, % of points won seems to do slightly better, which may give credence to the idea that DR has a slightly unnecessary skew in favor of better serving.

This was pretty spur of the moment, and I may have overlooked something obvious. Calling tennis stat enthusiasts @falstaff78 @Chanwan @TheFifthSet @Gary Duane @Meles and the rest.
 

ghostofMecir

Hall of Fame
Great post—I’ll reply properly later.

One thing DR does for sure is correct for any match that has an uneven # of serve games for each player. If one player has served one more serve game, total points won can be slightly skewed and DR will correct for that. That doesn’t speak to the overall point you are making, though, about the correlation between DR and % of total points won and variation in winning percentage.
 

Sysyphus

Talk Tennis Guru

Excellent point.

I would think that this is a clear advantage for DR when looking at a single match. In the longer term, I would guess that players more or less have as many matches where they serve more games than where they return more, and that this would even out, though.

Of course none of this matters as long as you win the last point.

True dat.

If I ever get to play Fed and he happens to hit a double fault, I'll just retire then and there and tell him "winning the last point is all that matterz in tennis."
 

Red Rick

Bionic Poster
I take it there can actually be a dominance ratio that's not 1.0 in a match if there's a difference in the number of points played on serve and return. I think you see that often when one player struggles a bit more on serve. In that case % of points perhaps underrepresents the difference, whereas dominance ratio doesn't have that problem.

On the other hand, both get a little bit screwy if there are lopsided sets, and there have been matches where a player lost with a dominance ratio of 1.7 (thanks Roddick).
 

Deleted member 716271

Guest
I don't understand from a logical perspective why DR does not equal total points won percentage. Of course, I accept the math that this is so. But I can't grasp what DR is measuring then that is slightly different from points won if it is simply a ratio of points on serve won to points on return won.
 

Red Rick

Bionic Poster
I don't understand from a logical perspective why DR does not equal total points won percentage. Of course, I accept the math that this is so. But I can't grasp what DR is measuring then that is slightly different from points won if it is simply a ratio of points on serve won to points on return won.
There's almost always a difference in the number of service points played, and usually the player who wins fewer points on serve plays more points on serve.
 

AiRFederer

Hall of Fame
So it wouldn't matter for more balanced/all-around players? Since they basically say the same thing, mathematically speaking. Although it would be interesting to use both on players like Kyrgios, Raonic, Karlovic, basically anyone that relies on their serve to win them sets. On the other hand, it may also be interesting to analyze players who are more comfortable or just plain better at receiving, but obviously there's no good archetype/sample of that player on the pro tour, since everyone is presumably good on both ends of the court at a certain level.
 

AnOctorokForDinner

Talk Tennis Guru
I don't understand from a logical perspective why DR does not equal total points won percentage. Of course, I accept the math that this is so. But I can't grasp what DR is measuring then that is slightly different from points won if it is simply a ratio of points on serve won to points on return won.

Because the number of service and return points played doesn't have to be anywhere near equal. For instance, let's take an extreme (but possible) set where one player (A) held to 30 (6 points, 4-2) every time, while the other player (B) also held every time, but faced deuce - 2 deuces (10 points, 6-4) on average - in every service game. Until the tiebreak, player A won 40% of return points, while player B won 33.3% of return points. Both players won every service game with a +2 differential, so TPW is 50/50, but player A has a DR of 1.2, since he held more easily. In this case, DR correctly reflects that player A has been the better player throughout the set, but couldn't convert it into a break lead.
 

73west

Semi-Pro

That's interesting, and not what I would have expected.
I'd add that one place where %age of points won fails to reflect the reality is when one player is being pushed *much* harder on serve than the other. An example, based on playing just 2 total games (1 serve, 1 return).

If I hold to 15, then my opponent holds to 30
I have won 6 points, my opponent 5. I won 55%. My DR is 1.67
If I push my opponent to a single deuce before he holds
I have won 7 points, my opponent 6. I won 54%. My DR is 1.88
If I push him to 3 deuces before he holds
I have won 9 points, my opponent 8. I won 53%. My DR is 2.08

In the extremely small sample size, DR reacts *positively* to my pushing my opponent's serve to the limit, while points won reacts *negatively*. It's counterintuitive the way points won reacts, but the reason is that the more I push my opponent, the more deuces I force him to fight through, the more weight is given to his serve, where he is winning 50% or more of the points.
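The two-game scenarios above can be reproduced with a few lines of Python. This is only a sketch of the example as described (I hold to 15, my opponent holds after a given number of deuces); the function name and its return layout are invented for illustration.

```python
def two_game_stats(deuces):
    """I hold to 15; my opponent holds after `deuces` deuces (0 = holds to 30).

    Returns (my points, opponent points, my share of points, my DR).
    """
    my_serve_won, my_serve_lost = 4, 1
    if deuces == 0:
        opp_serve_won, opp_serve_lost = 4, 2
    else:
        # The first deuce is 3-3; every further deuce adds one point to each
        # side, and the server then wins two points in a row to hold.
        opp_serve_lost = 3 + (deuces - 1)
        opp_serve_won = opp_serve_lost + 2

    my_points = my_serve_won + opp_serve_lost
    opp_points = my_serve_lost + opp_serve_won
    tpw = my_points / (my_points + opp_points)

    my_rpw = opp_serve_lost / (opp_serve_won + opp_serve_lost)
    opp_rpw = my_serve_lost / (my_serve_won + my_serve_lost)
    return my_points, opp_points, tpw, my_rpw / opp_rpw


for d in (0, 1, 3):
    mine, theirs, tpw, dr = two_game_stats(d)
    print(f"deuces={d}: points {mine}-{theirs}, TPW={tpw:.3f}, DR={dr:.3f}")
```

As the number of deuces goes up, the share of points won drifts down while DR climbs, which is the opposite movement described above.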

So intuitively, I totally see why DR is appealing. But that it is intuitively appealing doesn't necessarily mean that it is useful.
 

falstaff78

Hall of Fame

this excellent thread gets to the heart of an important question in tennis: what are the ordered pairs of (SPW%, RPW%) which lead to identical expected winning percentages? (iso-probabilistic lines, as it were.)

your exposition discusses the two most frequent ways of lumping together ordered pairs.

example 1: constant dominance
Say D/R = 1.2. The following pairs have constant dominance:
(75%, 30%), (70%, 36%), (65%, 42%), (60%, 48%) etc.

example 2: constant % of points won
Say TPW% = 52.5%. Assuming equal points on serve / return; good assumption in long term only. The following pairs have constant TPW%:
(75%, 30%), (70%, 35%), (65%, 40%), (60%, 45%) etc.

In these examples, both schemes start from (75%, 30%), but as you can see, by the time you get to SPW=60%, there is a difference of 3 p.p. in RPW%. Now it's not clear to me, at least ex ante, that either of these schemes leads to constant expected win probabilities (which is really what you were testing for when you regressed the two sets of data).
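The two families of ordered pairs can be generated directly from the definitions. A small Python sketch, assuming (as in example 2) equal points on serve and return; the function names are just for illustration.

```python
def rpw_for_constant_dr(spw, dr):
    """RPW needed at a given SPW to keep D/R fixed: RPW = D/R * (1 - SPW)."""
    return dr * (1.0 - spw)


def rpw_for_constant_tpw(spw, tpw):
    """RPW needed at a given SPW to keep TPW fixed: RPW = 2 * TPW - SPW.

    Assumes equal numbers of points on serve and return.
    """
    return 2.0 * tpw - spw


for spw in (0.75, 0.70, 0.65, 0.60):
    by_dr = rpw_for_constant_dr(spw, 1.2)
    by_tpw = rpw_for_constant_tpw(spw, 0.525)
    print(f"SPW={spw:.0%}: RPW at D/R 1.2 = {by_dr:.1%}, "
          f"RPW at TPW 52.5% = {by_tpw:.1%}, gap = {by_dr - by_tpw:.1%}")
```

The gap between the two lines grows as SPW falls, reaching 3 p.p. at SPW = 60%.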

Stated more clearly, it's not clear to me that, in terms of expected win%, the value of giving up 1 p.p. of SPW% is equal and opposite to the value of gaining 1 p.p. of RPW%. Similarly, it's not clear that the value of giving up 1 p.p. of SPW% is equal and opposite to the value of gaining the amount of RPW% implied by a constant D/R (i.e. D/R x 1 p.p.).

To take a concrete example to back up this point: In 2003, through 5 rounds of Wimbledon, Federer had (SPW, RPW) of (67, 45) for a sum of 112 and a D/R of 1.37. In 2017 through 5 matches he was at (78, 40) for a sum of 118 and a D/R of 1.77. Going from the 2017 numbers to the 2003 numbers, I suspect that gaining those 5 points of RPW was probably worth a lot more than losing those 11 points of SPW. Even though both points won and D/R were lower in 2003, I suspect his win probability was higher.

As this thread alludes, the real power of a metric is to lie on an isoprobabilistic curve. As an example, a clay courter and a grass-courter with the same value of the metric, should have the same expected win probability.

Now, the central question of this thread becomes: is D/R closer to lying on an isoprobabilistic curve than TPW? I think not. As pointed out in OP, D/R is a ratio, and does favour high servers.

Thanks for the thought provoking thread, and look forward to the ensuing discussion.
 

Sysyphus

Talk Tennis Guru
Stated more clearly, it's not clear to me that, in terms of expected win%, the value of giving up 1 p.p. of SPW% is equal and opposite to the value of gaining 1 p.p. of RPW%. Similarly, it's not clear that the value of giving up 1 p.p. of SPW% is equal and opposite to the value of gaining the amount of RPW% implied by a constant D/R (i.e. D/R x 1 p.p.).

Excellent poast.

Yes, I agree with the part quoted here, and have been thinking the same thing. This was why I originally assumed DR would explain the variance a bit better – I figured that a 1 p.p. gain in SPW% would probably usually be a slightly more beneficial gain than an equivalent one in RPW%, and that DR would account better for this. I may have been wrong on that assumption, though.

To take a concrete example to back up this point: In 2003, through 5 rounds of Wimbledon, Federer had (SPW, RPW) of (67, 45) for a sum of 112 and a D/R of 1.37. In 2017 through 5 matches he was at (78, 40) for a sum of 118 and a D/R of 1.77. Going from the 2017 numbers to the 2003 numbers, I suspect that gaining those 5 points of RPW was probably worth a lot more than losing those 11 points of SPW. Even though both points won and D/R were lower in 2003, I suspect his win probability was higher.

this is interesting – would you care to expound a bit on this, because I'm not sure I follow? Is the thought that you can only hold serve that often anyways, so eventually there comes a point of diminishing returns (no pun), but going from 40% to 45% in RPW% would mean breaking serve significantly more often?
 

Red Rick

Bionic Poster
I'm pretty sure you can calculate it mathematically if you assume there's no variation in the % of points won and if you assume points won on the ad/deuce side are the same. It's basically a limit to solve for any game, and it would be a limit for a tiebreak as well.
 

Deleted member 716271

Guest
There's almost always a difference in the number of service points played, and usually the player who wins fewer points on serve plays more points on serve.
Because the number of service and return points played doesn't have to be anywhere near equal. For instance, let's take an extreme (but possible) set where one player (A) held to 30 (6 points, 4-2) every time, while the other player (B) also held every time, but faced deuce - 2 deuces (10 points, 6-4) on average - in every service game. Until the tiebreak, player A won 40% of return points, while player B won 33.3% of return points. Both players won every service game with a +2 differential, so TPW is 50/50, but player A has a DR of 1.2, since he held more easily. In this case, DR correctly reflects that player A has been the better player throughout the set, but couldn't convert it into a break lead.

Intuitively, it seems like points won percentage would be more valuable then? DR seems to be the same thing except it is rewarding more of a struggle on service games you eke out?
 

falstaff78

Hall of Fame
this is interesting – would you care to expound a bit on this, because I'm not sure I follow? Is the thought that you can only hold serve that often anyways, so eventually there comes a point of diminishing returns (no pun), but going from 40% to 45% in RPW% would mean breaking serve significantly more often?

the central challenge here is that point winning % translates in a non-linear fashion to game winning %

e.g. in this article Jeff Sackmann makes the following points.
  • with RPW 32% you can expect to win 12% of return games
  • with RPW 36% you can expect to win 20% of return games
  • with RPW 40% you can expect to win 26% of return games
the non-linearity is critical. you can see that the 4 points of increase from 32 to 36 are worth more than the 4 points from 36 to 40.

now to relate this to the Federer example from earlier: I had a gut feeling that the increase in break% that you enjoy from moving from RPW 40 to 45 is greater than the decrease in hold% you sustain in moving from SPW 78 to 67. I checked the numbers, and sure enough, this is the case. The increase in break% is 9 points, and the decrease in hold% is only 6 points. in other words, Federer of 2003 was better off in terms of games, and therefore expected win probability, despite a significantly lower D/R and fewer points won!


2017, R128 - QF:
D/R: 1.77
TPW%: 57%

SPW%: 78%
Hold%: 96%
RPW%: 40%
Break%: 28%


2003, R128 - QF:
D/R: 1.37
TPW%: 56%

SPW%: 67%
Hold%: 90%
RPW%: 45%
Break%: 37%
 

AnOctorokForDinner

Talk Tennis Guru

In a serve-dominant match, with few breaks, DR is a good measurement to compare how quickly/dominantly players hold serve.

See this extreme example: http://www.tennisabstract.com/charting/20131006-M-Tokyo-F-Juan_Martin_Del_Potro-Milos_Raonic.html

Set 1: Potro 36-14 on serve, Meelosh 28-6. Both won 42 points; 6/34 = 17.6% RPW for Potro, 14/50 = 28% RPW for Meelosh, DR = 1.59 for Meelosh.

Set 2: Potro 27-13 on serve, Meelosh 22-8. Both won 35 points; 8/30 = 26.7% RPW for Potro, 13/40 = 32.5% RPW for Meelosh, DR = 1.22 for Meelosh.

Total: Potro 63-27 on serve, Meelosh 50-14. Both won 77 points; 14/64 = 21.9% RPW for Potro, 27/90 = 30% RPW for Meelosh, DR = 1.37 for Meelosh.

Result: Clutchtro def. Chokenic 7-6(5) 7-5.

The beauty of tennis scoring!
 

Red Rick

Bionic Poster
Intuitively, it seems like points won percentage would be more valuable then? DR seems to be the same thing except it is rewarding more of a struggle on service games you eke out?
Points won is a lot more linear. Win 2% more points on serve, and points won will go up by about 1%, whereas dominance ratio may increase by anywhere between 0.025 and 0.10.
 

Red Rick

Bionic Poster
I'd expect the non-linearity to work the other way around.

Actually, I'm very sure that if you plot points won against holding/breaking probability, the slope is steepest at 0.5.
 

falstaff78

Hall of Fame

Yes that's a good point. It should be an S curve.

It should pass through (0, 0), (0.5, 0.5) and (1,1) with lowest slope at 0 and 1, and max slope at 0.5

Something like this, although this is suggestive not accurate

[Image: plot.png]
 

Red Rick

Bionic Poster
Now I do wonder if ad/deuce point distribution matters, though I think it doesn't matter apart from getting to deuce more often....



And this doesn't just work for % of breaks and holds, but also for games and sets won I think.
 

falstaff78

Hall of Fame
That's interesting, and not what I would have expected.
I'd add that one place where %age of points won fails to reflect the reality is when one player is being pushed *much* harder on serve than the other. An example, based on playing just 2 total games (1 serve, 1 return).

If I hold to 15, then my opponent holds to 30
I have won 6 points, my opponent 5. I won 55%. My DR is 1.67
If I push my opponent to a single deuce before he holds
I have won 7 points, my opponent 6. I won 54%. My DR is 1.88
If I push him to 3 deuces before he holds
I have won 9 points, my opponent 8. I won 53%. My DR is 2.08

In the extremely small sample size, DR reacts *positively* to my pushing my opponent's serve to the limit, while points won reacts *negatively*. It's counterintuitive the way points won reacts, but the reason is that the more I push my opponent, the more deuces I force him to fight through, the more weight is given to his serve, where he is winning 50% or more of the points.

So intuitively, I totally see why DR is appealing. But that it is intuitively appealing doesn't necessarily mean that it is useful.

The reason for the counterintuitive movement of total points won is that a serve-dominant guy will play fewer points on serve. Thus the share of points played where he wins the majority decreases.

To give an extreme example: If I win 65% on serve, 40% on return, and 35% of points are played on my serve, then I'll have a DR of 1.14 but total points won of 49%
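A quick way to see this is to weight the two win rates by the share of points played on serve. A minimal Python sketch; the function name and the `serve_share` parameter are my own labels, not anything from an official stats source.

```python
def total_points_won(spw, rpw, serve_share):
    """Share of all points won when `serve_share` of points are on own serve."""
    return serve_share * spw + (1.0 - serve_share) * rpw


spw, rpw = 0.65, 0.40
dr = rpw / (1.0 - spw)  # DR does not depend on the serve/return split

for serve_share in (0.50, 0.35):
    tpw = total_points_won(spw, rpw, serve_share)
    print(f"serve share {serve_share:.0%}: TPW = {tpw:.2%}, DR = {dr:.2f}")
```

With an even split the same rates give 52.5% of points won; at a 35% serve share the figure drops below 50% while DR stays put.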
 

Red Rick

Bionic Poster
Pretty sure that's hardly possible unless sample size is tiny
 

falstaff78

Hall of Fame
I did this quickly, but if I did it right this is the curve. Assuming all points are random: the X axis is your chance of winning a random point, Y is your chance of winning the game.

[Image: prob.PNG]

i did the algebra some years ago. do you get something like p^4 + 4.p^4.(1-p) + 10.p^4.(1-p)^2 + { the probability of reaching deuce x probability of winning from deuce } ?
 

Red Rick

Bionic Poster
i did the algebra some years ago. do you get something like p^4 + 4.p^4.(1-p) + 10.p^4.(1-p)^2 + { the probability of reaching deuce x probability of winning from deuce } ?
1 combination of love holds
4 combinations of 15 holds
10 combinations of 30 holds
20 combinations to get to deuce

At deuce you get a limit
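Putting the counts above together with the deuce limit gives a closed form. From deuce, the server either wins two points in a row (p^2), or the points are split (2p(1-p)) and it is deuce again, so W = p^2 + 2p(1-p)W, which solves to W = p^2 / (p^2 + (1-p)^2). A Python sketch, assuming every point is independent with the same win probability p (the same simplifying assumption as in the posts above):

```python
def game_win_probability(p):
    """Chance of winning a game when each point is won with probability p.

    Assumes independent, identically distributed points.
    """
    q = 1.0 - p
    # Wins to love, to 15 and to 30: 1, 4 and 10 possible point sequences.
    before_deuce = p**4 * (1 + 4 * q + 10 * q**2)
    # 20 ways to reach 3-3 (deuce).
    reach_deuce = 20 * p**3 * q**3
    # Limit of the geometric series of repeated deuces.
    win_from_deuce = p**2 / (p**2 + q**2)
    return before_deuce + reach_deuce * win_from_deuce


for p in (0.32, 0.36, 0.40, 0.45, 0.50, 0.67, 0.78):
    print(f"p = {p:.2f}: game win probability = {game_win_probability(p):.3f}")
```

Under this model the curve passes through (0, 0), (0.5, 0.5) and (1, 1) and is steepest around 0.5. Real hold and break rates need not match it exactly, since actual points are not independent.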
 

Meles

Bionic Poster
Nicely done. Serve definitely counts and that is what made Sampras so great. Big serves get you through events efficiently conserving energy, especially slams. So DR probably also helps account for winning tournaments.

I like stats and have worked with them quite a bit in my vocation, but have never had any formal training so I'm impressed with your effort here.

DR brings us around to the Roddicks and the Kyrgioses of the world. I promised @Gary Duane I'd ponder the inverse of DR on return. Specifically, why a player with a touch worse return game, like Roddick, gets shut down in slams versus the success of Sampras. Kyrgios is facing the same issue.

One also wonders about players like Djokovic (and now Fedal) who boost their 2nd serve points won to make their serve games more effective; is it the same? Is it the same for a player like Thiem or even Zverev, who relies more on first strike off the return of their first serve? (The Zedbot got his name from me for actually cranking his 2nd serve numbers behind a big 2nd and first strike.)

To me these are the better questions and DR is pretty much settled as a factor.

We could also get into return style. Players like Thiem and Rublev have very good return games. Rublev has a superlative 2nd return, but his first return is a little more prey to the horsepower coming in (Thiem much the same with Raonic/Tsonga class servers). Of course players like Murray and Federer handle these big, big serves really well in comparison. So which return is the most important? A great 2nd like Rublev or the first return of Fed and Murray? Gary would advocate you just add up the points and then really just look at games. The stats may show nothing gained by looking at the point details, but I think something is there to be had. (Thiem in particular is a nice guinea pig as he excels on first serve and return points, and is more average on 2nd serve and return.)
 

73west

Semi-Pro
i did the algebra some years ago. do you get something like p^4 + 4.p^4.(1-p) + 10.p^4.(1-p)^2 + { the probability of reaching deuce x probability of winning from deuce } ?

Yes

Also, I'd point this out.

If you assume that the players are mirror images, that if I win x% of my serve points, then my opponent also wins x% of his serve points, then no matter how you do the match it is going to come out that every set is a coin flip. To make it more interesting, assume that one player is better. Assume that if I win x% of my serve points, then my opponent is going to win (x-10)% of his serve points.

That is, if I win 65% of my serve points, then my opponent will win 55% of his. Then where would I want to be on the curve to maximize my chances of winning?

The gut says there is some inflection point where maybe I want to get up to 65/45, where I am still comfortable with my serve, but starting to really eat into the other guy's. That's probably not the case, if you go with the (artificial and almost assuredly incorrect) assumption that every point is an independent random occurrence. If that's the case, the best place to be is at the extreme.

If I hold 10% of my serve points, my opponent holds 0%. He will never win a set off me.
If I hold 100% of my serve points, my opponent holds 90%. He will never win a set off me.

The probabilities go from 100% I win at the extremes to about 55% that I win if I am at 65% hold, 45% return. The reason is simply that the better player wants less uncertainty. If you are better every point, and the only way you lose is the bad side of random chance, then you want the least variable result possible, and that comes at the extremes. The 50/50 points are the most variable.

That's almost an axiom of sports: the better player/team wants less random variability in the result. I'm not talking about things like "weather conditions" where one team may be better equipped physically or mentally to deal with them. I am talking about the random chance of "I win that 65% of the time". The most evident place you see this in tennis: BO3 vs BO5. Larger sample = more predictable result = better for the favorite. Extending matches reduces variance. Better players like that.
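For anyone who wants to poke at this, here is a rough Python sketch of the calculation under the same (unrealistic) assumption that every point is independent. The function names, the tiebreak at 6-6 and the "I serve first" convention are my own assumptions, not something taken from the post above; it simply prints the set-win probability for a few values of x with the opponent fixed at x - 10%.

```python
from functools import lru_cache


def hold_prob(p):
    """P(server wins a game) if each point is won independently with probability p."""
    q = 1.0 - p
    from_deuce = p * p / (1.0 - 2.0 * p * q)  # closed form for winning from deuce
    return p**4 * (1 + 4 * q + 10 * q * q) + 20 * p**3 * q**3 * from_deuce


def tiebreak_prob(a, r):
    """P(I win a first-to-7, win-by-2 tiebreak) when I serve the first point.

    a = my chance of winning a point on my serve, r = my chance on his serve.
    """
    @lru_cache(maxsize=None)
    def f(i, j):
        if i == 6 and j == 6:
            # From 6-6 on, points come in pairs, one on each player's serve.
            both, neither = a * r, (1 - a) * (1 - r)
            return both / (both + neither) if both + neither > 0 else 0.5
        if i == 7:
            return 1.0
        if j == 7:
            return 0.0
        k = i + j  # points played so far; serve order is me, him, him, me, me, ...
        win = a if ((k + 1) // 2) % 2 == 0 else r
        return win * f(i + 1, j) + (1 - win) * f(i, j + 1)

    return f(0, 0)


def set_prob(a, r):
    """P(I win a tiebreak set), serving first (assumption of this sketch)."""
    hold = hold_prob(a)
    brk = 1.0 - hold_prob(1.0 - r)  # he wins 1 - r of his own serve points
    tb = tiebreak_prob(a, r)

    @lru_cache(maxsize=None)
    def g(i, j):
        if i == 6 and j == 6:
            return tb
        if (i >= 6 and i - j >= 2) or i == 7:
            return 1.0
        if (j >= 6 and j - i >= 2) or j == 7:
            return 0.0
        win = hold if (i + j) % 2 == 0 else brk  # I serve the even-numbered games
        return win * g(i + 1, j) + (1 - win) * g(i, j + 1)

    return g(0, 0)


for x in (0.10, 0.30, 0.50, 0.65, 0.80, 1.00):
    # I win x of my serve points; he wins x - 0.10 of his.
    print(f"x={x:.2f}  P(set)={set_prob(x, 1.0 - (x - 0.10)):.3f}")
```

Swapping `set_prob` into a best-of-3 or best-of-5 wrapper would give match-level numbers; that step is left out here to keep the sketch short.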
 

falstaff78

Hall of Fame
1 combination of love holds
4 combinations of 15 holds
10 combinations of 30 holds
20 combinations to get to deuce

At deuce you get a limit

One can use a clever way to get out of the limit.

Let's assume that the probability of server winning a game from deuce is X. At deuce there are 4 ways the following two points can play out:
(Outcome / prob of occurrence / prob of server winning game):

WW / p^2 / 1
WL / p.(1-p) / X
LW / (1-p).p / X
LL / (1-p)^2 / 0

Then we can say

X = p^2.1 + p.(1-p).X + (1-p).p.X + (1-p)^2.0
X = p^2 + 2p(1-p).X
X = p^2/(1 - 2p + 2p^2)

With a little bit of effort one could set this problem up in terms of an infinite series and show that its limit is the above for p < 1
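As a quick numerical sanity check of the above (a sketch only; the function names are mine), the closed form for X can be compared with a partial sum of the infinite series, where the server wins two points in a row after returning to deuce k times. The last function plugs X into the 1 / 4 / 10 / 20 combination counts to get the full hold probability.

```python
def deuce_closed_form(p):
    """X = p^2 / (1 - 2p + 2p^2): server's chance of winning the game from deuce."""
    return p * p / (1 - 2 * p + 2 * p * p)


def deuce_series(p, terms=200):
    """Partial sum of p^2 * (2p(1-p))^k over k = 0, 1, 2, ... returns to deuce."""
    back_to_deuce = 2 * p * (1 - p)
    return sum(p * p * back_to_deuce**k for k in range(terms))


def hold_prob(p):
    """Love holds + holds to 15 + holds to 30 + (reach deuce) * (win from deuce)."""
    q = 1 - p
    return (p**4 + 4 * p**4 * q + 10 * p**4 * q**2
            + 20 * p**3 * q**3 * deuce_closed_form(p))


for p in (0.3, 0.5, 0.65, 0.9):
    print(p, deuce_closed_form(p), deuce_series(p), hold_prob(p))
```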
 

falstaff78

Hall of Fame

This is one of the best posts I have ever read on TT. I will vote for it as post of the year in @Hitman 's end of year poll

A substantial insight. Thanks.
 

falstaff78

Hall of Fame
In a serve-dominant match, with few breaks, DR is a good measurement to compare how quickly/dominantly players hold serve.

See this extreme example: http://www.tennisabstract.com/charting/20131006-M-Tokyo-F-Juan_Martin_Del_Potro-Milos_Raonic.html

Set 1: Potro 36-14 on serve, Meelosh 28-6. Both won 42 points; 6/34 = 17.6% RPW for Potro, 14/50 = 28% RPW for Meelosh, DR = 1.59 for Meelosh.

Set 2: Potro 27-13 on serve, Meelosh 22-8. Both won 35 points; 8/30 = 26.7% RPW for Potro, 13/40 = 32.5% RPW for Meelosh, DR = 1.22 for Meelosh.

Total: Potro 63-27 on serve, Meelosh 50-14. Both won 77 points; 14/64 = 21.9% RPW for Potro, 27/90 = 30% RPW for Meelosh, DR = 1.37 for Meelosh.

Result: Clutchtro def. Chokenic 7-6(5) 7-5.

The beauty of tennis scoring!
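For anyone who wants to reproduce these figures, a small Python sketch follows. It takes the serve records quoted above as (points won, points lost) on serve; the function and variable names are just my own labels.

```python
def return_points_won(opp_serve_won, opp_serve_lost):
    """Share of the opponent's service points that the returner won."""
    return opp_serve_lost / (opp_serve_won + opp_serve_lost)


def dominance_ratio(my_serve, opp_serve):
    """DR = my return-points-won % divided by the opponent's return-points-won %.

    Each argument is a (points won by server, points lost by server) pair.
    """
    my_rpw = return_points_won(*opp_serve)
    opp_rpw = return_points_won(*my_serve)
    return my_rpw / opp_rpw


# (won, lost) on serve, as quoted in the post above.
records = {
    "Set 1": {"delpo": (36, 14), "raonic": (28, 6)},
    "Set 2": {"delpo": (27, 13), "raonic": (22, 8)},
    "Total": {"delpo": (63, 27), "raonic": (50, 14)},
}

for label, rec in records.items():
    dr = dominance_ratio(rec["raonic"], rec["delpo"])
    print(f"{label}: DR for Raonic = {dr:.2f}")
```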

Or this one:

http://www.tennisabstract.com/charting/20060305-M-Dubai-F-Roger_Federer-Rafael_Nadal.html

Or this one:

https://tt.tennis-warehouse.com/ind...-stats-a-horror-show-of-bp-profligacy.544058/
 

Chanwan

G.O.A.T.
@Sysyphus
a "stupid" question first. Can you spell this out for me? I'm probably misreading it, but I would imagine the correlation to be higher. I.e. if I'm reading it right, 1/9 of the time the player with most points won/highest dominance ratio doesn't win the match? And what explains the rest of the winning percentage?
"Dominance ratio explained 76.9% of the variation in winning percentage (like Falstaff found). The correlation between DR and match win percentage was .877.

Percentage of total points won explained 78.3% of the variation in match win percentage, and the correlation was .885."

Also, just tried to calculate DR. Fed-Djoko, US Open final. Fed's DR is 1.02, whereas he won 49.3% of the points. Am I doing this correctly? The DR shows us that Novak struggled in more service games than Fed, whereas the points won merely show it was a very close 4-setter.
http://www.atpworldtour.com/en/scores/2015/560/MS001/match-stats
So it wouldn't matter to more balanced/all-around players? Since they basically say the same thing, mathematically speaking. Although it would be interesting to use both on players like Kyrgios, Raonic, Karlovic, basically anyone that relies on their serves to win them sets. On the other hand, it may also be interesting to analyze players who are more comfortable or just plain better at receiving, but obviously there's no good archetype/sample of that player on the pro tour since everyone is presumably good on both ends of the court at a certain level.
Schwartzman is a good archetype in the latter respect.
 

Red Rick

Bionic Poster
@Sysyphus
a "stupid" question first. Can you spell this out for me? I'm probably misreading it, but I would imagine the correlation to be higher. I.e. if's I'm reading it right, 1/9 of the time the player with most points won/highest dominance ratio doesn't win the match? And what explains the rest of the winning percentage?
"Dominance ratio explained 76.9% of the variation in winning percentage (like Falstaff found). The correlation between DR and match win percentage was .877.

Percentage of total points won explained 78.3% of the variation in match win percentage, and the correlation was .885."
Correlation is the square root of the share of variance that is explained by the variable.
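To make that concrete with the numbers quoted above, here is a tiny Python check (for a regression on a single variable, the size of the correlation is the square root of the explained share; the helper name is mine):

```python
import math


def r_from_r_squared(r_squared):
    """Size of the correlation implied by a one-variable regression's explained variance."""
    return math.sqrt(r_squared)


for label, r_squared in (("DR", 0.769), ("% points won", 0.783)):
    print(f"{label}: R^2 = {r_squared:.3f} -> |r| = {r_from_r_squared(r_squared):.3f}")
```

So a correlation of .877 does not mean the "wrong" player wins 1 time in 9; it just describes how tightly the points line up around the fitted line.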

The other part of the variance is explained by other variables, which has to do with the fact that not all points are equally valuable in tennis due to the scoring system, so point distribution and points won are basically the variables you have, with point distribution being rather complex in the end.
 

Red Rick

Bionic Poster
IIRC besides USO 15 Fed has a slightly positive DR in all of Rome 06, W 08, AO 09 losses, sad stuff really. Think the only big match where he lost by TPW and DR stats yet emerged with a win was USO 04 QF v Agassi.
Yeah, Federer generally wins by just winning way more points. Even when his sets go to tiebreaks or something he's usually winning way more points.

The famous Wimbly final of 2009, Federer had a DR of 1.33 IIRC.
 

Chanwan

G.O.A.T.
@Sysyphus - looking forward to diving into this. But it requires me to be fresher than I currently am
@Sysyphus - people with greater statistical minds than me have given you answers. I'll read along, but I doubt I'll be able to offer any insight of my own.
I found it interesting that @falstaff78's example of Fed 03 vs. Fed 17 yielded much better results in terms of games won for the 2003 edition despite Fed having a much better DR in 2017 and winning more of the total points.
 

Red Rick

Bionic Poster
I did this quickly, but if I did it right this is the curve. Assuming all points are random: the X axis is your chance of winning a random point, Y is your chance of winning the game.

[Image: chance of winning the game vs. chance of winning a point (prob.PNG)]
What you can also see very well here is that the same % of points won in total can lead to very different % of games won. Say I win 60% of serve points and 0% of return points: I'll win 37% of games while winning 30% of points. If I win 30% of both my serve and return points, I only win 10% of games, so this distribution is very interesting @Chanwan
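A short Python sketch of that comparison, assuming independent points and an equal number of service and return games (both are simplifications, and the function names are mine):

```python
def hold_prob(p):
    """P(server wins a game) if each point is won independently with probability p."""
    q = 1 - p
    from_deuce = p * p / (1 - 2 * p * q)
    return p**4 * (1 + 4 * q + 10 * q * q) + 20 * p**3 * q**3 * from_deuce


def games_won_share(serve_p, return_p):
    """Share of games won, given point-win chances on serve and on return."""
    hold = hold_prob(serve_p)
    brk = 1 - hold_prob(1 - return_p)  # opponent wins 1 - return_p on his serve
    return (hold + brk) / 2


for serve_p, return_p in ((0.6, 0.0), (0.3, 0.3)):
    points = (serve_p + return_p) / 2
    games = games_won_share(serve_p, return_p)
    print(f"serve {serve_p:.0%}, return {return_p:.0%}: "
          f"{points:.0%} of points, {games:.0%} of games")
```

Both splits add up to 30% of points, but the first comes out near 37% of games and the second near 10%.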
 

metsman

G.O.A.T.
That's almost an axiom of sports: the better player/team wants less random variability in the result. I'm not talking about things like "weather conditions" where one team may be better equipped physically or mentally to deal with them. I am talking about the random chance of "I win that 65% of the time". The most evident place you see this in tennis: BO3 vs BO5. Larger sample = more predictable result = better for the favorite. Extending matches reduces variance. Better players like that.
Well it's not just that. If we treat each set as i.i.d. Bernoulli (which is obviously not the case, but just to prove the point), any favorite, i.e. anyone with p greater than .5, will have a greater chance to win a BO5 than a BO3 due to the nature of the format and what is necessary to win the match. More ways to win a BO5.

Also, in general, "lower variance" depends on what you mean: the variance of the sample mean will decrease as n increases, but not necessarily the variance of outcomes.
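A quick way to check the format effect, as a Python sketch under the same i.i.d.-sets assumption (p is the chance of winning any single set; `match_prob` is just an illustrative name):

```python
from math import comb


def match_prob(p, best_of):
    """P(winning a best-of-n match) when each set is won independently with probability p."""
    need = best_of // 2 + 1  # sets needed: 2 for BO3, 3 for BO5
    q = 1 - p
    # Win the deciding set after losing exactly k sets along the way.
    return sum(comb(need - 1 + k, k) * p**need * q**k for k in range(need))


for p in (0.40, 0.45, 0.50, 0.55, 0.60, 0.70):
    print(f"p={p:.2f}  BO3={match_prob(p, 3):.4f}  BO5={match_prob(p, 5):.4f}")
```

With p = 0.6, for instance, this gives 0.648 for BO3 and about 0.683 for BO5, while at p = 0.45 the longer format is the worse one for that player.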
 

Gary Duane

G.O.A.T.
A couple thoughts:

First, % of points as calculated by the ATP does this:

(Serve points won + return points won)/(total points played)

Great servers play fewer points on serve, in spite of the fact that they play more games on serve. The ratio of points, service/return, is usually around 95%.

My data sets calculate this precisely, which is why I need all 1st and 2nd points, on serve and return, to duplicate the figure that ATP produces, which is then used by sites like TA, only TA displays to one decimal.

For example, here are the numbers for Pete Sampras, career:

(69.45+38.02)/2=53.735 (this is wrong)

The correct figure:
53.51

TA: 53.5%

The reason:
66629 points on serve
68514 points on return.

The same thing happens on games, in reverse. Great players play more games serving. It just doesn't matter for poor servers:
10441 service games
10227 return games

So adding service games and return games, only the percentages, then dividing by 2 also produces a similar error.

These are minor errors, but I thought I'd bring it up. Now, back with the rest...
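The difference between the two calculations can be sketched in a few lines of Python, using the Sampras figures above. The function name is mine, and because the inputs are rounded percentages the result only lands close to the quoted figure rather than reproducing it to the last digit.

```python
def points_won_pct(serve_pct, return_pct, serve_points, return_points):
    """Total points won %, weighting serve and return % by points actually played."""
    won = serve_pct / 100 * serve_points + return_pct / 100 * return_points
    return 100 * won / (serve_points + return_points)


serve_pct, return_pct = 69.45, 38.02        # career % of serve / return points won
serve_points, return_points = 66629, 68514  # career points played on serve / return

naive = (serve_pct + return_pct) / 2
weighted = points_won_pct(serve_pct, return_pct, serve_points, return_points)
print(f"simple average:   {naive:.3f}")
print(f"weighted average: {weighted:.3f}")
```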
 

Gary Duane

G.O.A.T.
Now, the rest of DR
As many of you already know, dominance ratio (DR) is a nifty little stat to show how dominant a player has been in a given match, season, or similar. It has been shown to explain a lot of the variation in players' win %, and is thought to be a good predictor of future results. On the face of it, it's pretty similar to looking at what % of points a player wins, but not quite. How is it calculated? By dividing the percentage of points you win against your opponent's serve versus the percentage of points your opponent wins against your serve.
DR does not totally explode and reveal itself as monstrously distorting because points, for the most part, stay within reasonable bounds.

But to illustrate how wrong the concept can be:

95/10
80/40

In the first we have Isner on grass, or someone like him. 10/5=2

Call it a game DR
In the second we have a Coria like player, on clay. 40/20 also=2

But Coria at his peak was the 2nd best clay court player in the world and was a monster. Dr. Ivo has never been on that level on grass.

In other words, as we get to extremes, it gets really wonky.
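To put numbers on that, a small Python sketch (function names are mine; the games-won figure assumes equal numbers of service and return games, which is only approximately true):

```python
def game_dr(hold_pct, break_pct):
    """'Game DR': return games won % divided by service games lost %."""
    return break_pct / (100 - hold_pct)


def games_won_pct(hold_pct, break_pct):
    """Share of all games won, assuming equal service and return games."""
    return (hold_pct + break_pct) / 2


for label, hold_pct, break_pct in (("95/10", 95, 10), ("80/40", 80, 40)):
    print(f"{label}: game DR = {game_dr(hold_pct, break_pct):.1f}, "
          f"games won = {games_won_pct(hold_pct, break_pct):.1f}%")
```

Same ratio of 2 in both cases, but 52.5% of games for the first profile against 60% for the second.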

As far as I'm concerned what you just did with points and DR pretty much explodes a myth.

I would look for results from dominant servers in a different manner.

I'd look for results in the last three rounds of majors, then compare them to the first four rounds. Then I'd look at records of the "slam kings".

What you find is that return stats are low for players like Sampras and Fed in early rounds but higher, in comparison to all other players, in later rounds. The reason is that they coast. Call them lazy, arrogant, confident, or good tacticians. Their serve stats are through the roof, but looking at return stats, they seem sort of in the middle of the pack.

Look at Sampras:
24.49%, #95. Not a servebot, but not too impressive.

Now Federer:
27.26%
#31, and this for the guy so many people want to call GOAT.

Not too impressive.

Here's the real story. In the first 4 rounds of all majors, here are Fed and Sampras:
58.06 Sampras
61.96 Fed

That's about 4 points difference, a lot. Sampras looks OK, but not great.

Now last three rounds:
54.91 Fed
54.98 Sampras

For the last three rounds of majors, 55% is spectacular. We can't get a breakdown of games serving and returning, but I'd suggest this:

92/22

Games on AS is about the same as HCs, because clay and grass more or less cancel out. Figure both broke 1/5 return games, on average, and maybe a bit higher.

That's no longer servebot. That's damned good, for anyone, in the last three rounds of majors.

How good was Sampras at Wimbledon? 57% of games CAREER, last three rounds. That's so high, it's stupid.

Let's compare with Nadal, same idea, HC so USO and AO, last three rounds:
52.94

If we guess 85/21 for Rafa, that's probably about right, so he is most likely not breaking more than Fed and Sampras in final rounds.

Let's compare that with Nadal at RG, last three rounds:

62.77

This is really a bit better than Sampras on grass, because games won run around 4-5% higher on clay. That's just how it is.

My take-away: we need stats, easy to look up, that are simply not available, though I think some sites do it. That will tell us a lot more.
 

falstaff78

Hall of Fame
Excellent insights thanks
 

metsman

G.O.A.T.
That's part of the reason I dislike these stats, because the vast majority of them are piled up against mugs in the early rounds, where variance of outcomes is bigger. It can give you an idea of a player's skills and strengths, but it's a bad tool to use to compare playing quality or playing level. Stats in the last few rounds or against certain quality of opponents would be far more instructive even though I'm not a fan of any stat that is devoid of actual context within the match in a sport like tennis.
 