It's also interesting to look at the probability distribution of different final scores, and there are some results you might not expect. For example, in the case that every point (and thus every game) of a set is a 50/50 coin flip, what is the most likely final score of a set?
You might guess that 7-6 is the most likely score in this perfectly even matchup, but it's actually 6-4, which is twice as likely as 7-6. Even 6-2 is more likely than 7-6. Here's the breakdown by rank order of likelihood:
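For anyone who wants to reproduce the ranking, here is a minimal Python sketch. It assumes every game is an independent coin flip and treats the tiebreak as one more 50/50 game. The function name `set_score_probs` is mine, not from the original post, and each probability is for one named player winning by that score.

```python
from math import comb

def set_score_probs(g=0.5):
    """Probability that one named player wins the set by each score,
    assuming that player wins every game independently with probability g.
    The tiebreak at 6-6 is treated as one more game (an assumption)."""
    q = 1.0 - g
    probs = {}
    # 6-0 through 6-4: the winner takes the final game, and the loser's
    # k games can fall anywhere among the first 5 + k games.
    for k in range(5):
        probs[(6, k)] = comb(5 + k, k) * g**6 * q**k
    # Both longer scores have to pass through 5-5 first.
    at_5_5 = comb(10, 5) * g**5 * q**5
    probs[(7, 5)] = at_5_5 * g * g              # win the next two games
    probs[(7, 6)] = at_5_5 * (2 * g * q) * g    # split two games, win tiebreak
    return probs

# Print the scores from most to least likely for the even matchup.
for score, p in sorted(set_score_probs(0.5).items(), key=lambda kv: -kv[1]):
    print(f"{score[0]}-{score[1]}: {p:.2%}")
```

At 50/50 the mirror-image scores (4-6, 6-7 and so on) have the same probabilities, so doubling any entry gives the chance of that scoreline in either direction.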
To put some intuition behind this: if sets were first-to-100, win by 2, you'd expect nobody but the biggest servebots to reach a tiebreaker.
It is a little surprising that the closer scores were further down the list, but group them together and the total is around 24.6%, which is about the same as, or a bit higher than, the two 6-4 scores together.
Sort of like how in a 1D random walk, after a bunch of steps the expected position is 0, but the expected absolute distance from the origin is very much nonzero.
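The random walk comparison is easy to check exactly. This is just a sketch with a made-up helper name (`walk_stats`), assuming fair, independent +1/-1 steps:

```python
from math import comb

def walk_stats(n):
    """Exact expected position and expected absolute distance from the
    origin after n fair +1/-1 steps (k counts the +1 steps)."""
    mean = 0.0
    mean_abs = 0.0
    for k in range(n + 1):
        p = comb(n, k) / 2**n      # binomial probability of k up-steps
        pos = 2 * k - n            # final position
        mean += p * pos
        mean_abs += p * abs(pos)
    return mean, mean_abs

mean, mean_abs = walk_stats(100)
print(f"expected position: {mean:.3f}, expected |position|: {mean_abs:.3f}")
```

The first number comes out at zero by symmetry while the second grows roughly like the square root of the number of steps, which is the same reason an even matchup does not mostly produce even-looking scores.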
I should point out that 7-6 is made even less likely in these 50/50 examples by the assumption that the server has no advantage. To reference a later chart:
Having serve/return point winrates of 80/30 means almost a 50% chance of reaching a tiebreak (though that's a pretty extreme level of servebotness). Meanwhile, 60/50 is around 10%.
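Those tiebreak numbers can be sanity-checked without a simulation. The sketch below makes a few assumptions that are mine, not the chart's: "80/30" is read as one player winning 80% of points on serve and 30% on return (so the opponent wins 70% on serve), points are independent, and that player serves the first game. The function names are made up for illustration.

```python
def hold_prob(p):
    """Chance the server wins a game when winning each point with probability p."""
    q = 1.0 - p
    before_deuce = p**4 * (1 + 4 * q + 10 * q * q)   # win to love, 15, or 30
    reach_deuce = 20 * p**3 * q**3                   # three points each
    return before_deuce + reach_deuce * p * p / (1 - 2 * p * q)

def tiebreak_reach_prob(serve, ret):
    """Chance a set reaches 6-6, given player A's serve/return point winrates.
    Assumes A serves the first game and the serve alternates every game."""
    a_hold = hold_prob(serve)
    a_break = 1.0 - hold_prob(1.0 - ret)   # B wins (1 - ret) of points on serve
    state = {(0, 0): 1.0}                  # game score -> probability
    for game in range(12):
        p = a_hold if game % 2 == 0 else a_break
        nxt = {}
        for (a, b), pr in state.items():
            if max(a, b) >= 6 and abs(a - b) >= 2:
                continue                   # set already finished
            nxt[(a + 1, b)] = nxt.get((a + 1, b), 0.0) + pr * p
            nxt[(a, b + 1)] = nxt.get((a, b + 1), 0.0) + pr * (1.0 - p)
        state = nxt
    return state.get((6, 6), 0.0)

for serve, ret in [(0.80, 0.30), (0.60, 0.50), (0.50, 0.50)]:
    print(f"{serve:.0%}/{ret:.0%}: {tiebreak_reach_prob(serve, ret):.1%}")
```

Under that reading the results should land close to the figures quoted from the chart, though a different interpretation of the winrate pairs would shift them.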
So since the players' abilities are unknown, I think the more relevant question to ask is "what does a score of 6-2 tell you about the underlying skill level of the two players, vs a score of 7-6". What matters is not so much that 7-6 scores are rarer in general, but that 7-6 scores should be relatively rarer between players of uneven ability.
I almost hate to say it, but this seems to give some weight to WTN’s claim that counting sets is just as accurate as counting games. Between players of absolutely equal ability, a more lopsided score of 6-2 is more likely than a score of 7-6. Or am I reading too much into that?
To exaggerate, if you lost a set to Djokovic, that doesn't really say much. But if you lost 6-7, people would confidently say you're elite. If you lost 2-6, people may say you're very good, but maybe you're not quite elite and just lucked into 2 games.
In the more general case, we should be able to use a quick application of Bayes' rule to illustrate:
If we arbitrarily assume the serve/return point winrate spread is 24% (a completely even 50% matchup would be 62/38), and we have some prior belief that the overall point winrate (serve + return) / 2 looks something like this:
And if we simulate the likelihood of seeing each particular set score vs overall point winrate as:
Then our posterior probability, or updated belief of overall point winrate, should look something like this (for some selected set score outcomes).
I'm no statistician or whatever, so I could have bungled this somewhere, but I think it looks sane enough.
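For anyone who wants to poke at the Bayes update described above, here is a rough Python sketch. It is not the code behind the charts: it computes the likelihoods exactly with a small dynamic program instead of simulating them, and the prior (a bell curve centred on 50% with a 5-point standard deviation) is a stand-in I made up because the original prior is only shown as a picture. It also assumes independent points and that player A serves first in both the set and the tiebreak. All function names are hypothetical.

```python
import math

SPREAD = 0.24  # serve minus return point winrate, as assumed in the post

def hold_prob(p):
    """Chance the server wins a game when winning each point with probability p."""
    q = 1.0 - p
    return p**4 * (1 + 4*q + 10*q*q) + 20 * p**3 * q**3 * p*p / (1 - 2*p*q)

def tiebreak_win_prob(s, r):
    """Chance A wins a first-to-7, win-by-2 tiebreak. s and r are A's point
    winrates on serve and return; A serves point 1, then serve swaps every 2."""
    state, win = {(0, 0): 1.0}, 0.0
    for i in range(12):
        p = s if ((i + 1) // 2) % 2 == 0 else r
        nxt = {}
        for (a, b), pr in state.items():
            for key, step in (((a + 1, b), p), ((a, b + 1), 1.0 - p)):
                if key[0] == 7:
                    win += pr * step          # A wins 7-x with x <= 5
                elif key[1] < 7:
                    nxt[key] = nxt.get(key, 0.0) + pr * step
        state = nxt
    # From 6-6 each pair of points has one serve apiece: win both, lose both, or reset.
    both, neither = s * r, (1.0 - s) * (1.0 - r)
    return win + state.get((6, 6), 0.0) * both / (both + neither)

def set_score_dist(w):
    """Distribution over final set scores (games_a, games_b) for overall winrate w."""
    s, r = w + SPREAD / 2, w - SPREAD / 2
    a_hold, a_break = hold_prob(s), 1.0 - hold_prob(1.0 - r)
    state, final = {(0, 0): 1.0}, {}
    for game in range(12):
        p = a_hold if game % 2 == 0 else a_break
        nxt = {}
        for (a, b), pr in state.items():
            for key, step in (((a + 1, b), p), ((a, b + 1), 1.0 - p)):
                done = max(key) >= 6 and abs(key[0] - key[1]) >= 2
                target = final if done else nxt
                target[key] = target.get(key, 0.0) + pr * step
        state = nxt
    p66, tb = state.get((6, 6), 0.0), tiebreak_win_prob(s, r)
    final[(7, 6)], final[(6, 7)] = p66 * tb, p66 * (1.0 - tb)
    return final

def posterior(score, grid, prior):
    """Bayes' rule on a grid: posterior is proportional to likelihood times prior."""
    unnorm = [set_score_dist(w).get(score, 0.0) * p for w, p in zip(grid, prior)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def mean(grid, weights):
    return sum(w * p for w, p in zip(grid, weights))

GRID = [0.30 + 0.005 * i for i in range(81)]                 # winrates 30%..70%
raw = [math.exp(-0.5 * ((w - 0.5) / 0.05) ** 2) for w in GRID]
PRIOR = [x / sum(raw) for x in raw]                          # made-up prior

for score in [(6, 2), (7, 6), (6, 7), (2, 6)]:
    post = posterior(score, GRID, PRIOR)
    print(f"{score[0]}-{score[1]}: posterior mean winrate {mean(GRID, post):.3f}")
```

Swapping in a wider or narrower prior changes how far each score moves the estimate, so the exact numbers depend heavily on that made-up choice.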