Originally Posted by Flash O'Groove
Amazing work, Falstaff! It's really interesting. I have the same problem as you: my wife thinks I'm crazy to spend so much time on tennis numbers (instead of playing!).
However, I think there is a problem with your analysis of outliers. Are outliers really outliers when they make up nearly half of the total cases? To solve this problem you would need to include more cases. You can't do that with players who have won fewer than 5 slams, because their results wouldn't be spread out enough, so you have to include finals as well. The ideal thing would be to pick a bunch of top 10 players from each decade who we know were not held back too much by injuries, and build a scale for their slam results. That way we could see slam success by age for a variety of players, including the ones who never managed to win a slam or reach a slam final.
Unfortunately I don't have a lot of free time right now, but I hope that someday I can take part in a community of tennis freak scientists.
Hey Flash, always a pleasure to meet a fellow freak / nut. Not to mention a fellow member of the married club.
When I first made the little table with age buckets I thought the same thing - why are half the guys outliers? And do I need to include more cases?
Then I realized that the chart in the OP does precisely that! Essentially it includes ALL the cases and sums them up, and the sample size (n=444) is large enough to be statistically robust. There are 229 major / season finale final appearances between ages 22-25 and only 140 from ages 26-39!
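
To put those counts in proportion, here is a rough sketch of the arithmetic in Python. The three input numbers are the ones quoted above. Everything else is an assumption made for illustration: the `other` bucket is simply what is left over from 444 (presumably final appearances outside ages 22-39), and both age ranges are treated as inclusive, so 22-25 spans 4 years and 26-39 spans 14.

```python
# Counts quoted in the post above.
total = 444   # all final appearances in the sample
peak = 229    # appearances at ages 22-25
later = 140   # appearances at ages 26-39

# Assumption: whatever is left over falls outside ages 22-39.
other = total - peak - later


def share(count, whole):
    """Return count as a percentage of whole, rounded to one decimal."""
    return round(100 * count / whole, 1)


peak_share = share(peak, total)
later_share = share(later, total)
other_share = share(other, total)

# Assumption: inclusive age ranges, so 22-25 is 4 years and 26-39 is 14.
peak_per_year = peak / 4
later_per_year = later / 14

print(f"22-25: {peak_share}% of the sample, {peak_per_year} per year of age")
print(f"26-39: {later_share}% of the sample, {later_per_year} per year of age")
print(f"other: {other_share}% of the sample ({other} appearances)")
```

Under those assumptions the four-year window from 22 to 25 holds a little over half of the sample on its own, and the per-year rate there is several times the rate for 26-39.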