Hi all, new here. I love tennis, but in all honesty I like analyzing stuff in Excel even more.
So I have been working on two Excel models recently. One is an NBA model (I like basketball better than tennis), but that's not for here. The other one is for here, and it answers, analytically speaking, a lot of the questions I see repeatedly on this forum.
Note that the model is not complete; I still want to add a few things, but let me describe it as it stands now.
Also note that this model covers only the Open Era. Anything before 1968 is not here.
Fair warning: the first part of this post is mathematical. If you want to skip it, please do, but it is very important for understanding how I got to the rankings.
When it comes to choosing the GOAT, there are usually two main problems:
1. How do you compare eras and competition?
2. What value do you give to each accomplishment? Is one Grand Slam win worth more than winning 3 Masters Series tournaments?
My model answers these questions and more, but first the prerequisites for entering the model. A player must have done at least one of the following:
1. Won a Masters Series / Grand Prix tournament, at the very least.
2. Reached a Grand Slam final.
3. Even if he did neither, ended a year ranked in the top 10.
Players excluded? Not many. Some names I can mention, though, are Monfils and Magnus Gustafsson.
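As a rough illustration of the entry rule above, here is a minimal Python sketch. The `CareerRecord` fields and the `is_eligible` name are made up for the example (the actual model lives in Excel), and the sample records are invented, not real career data.

```python
from dataclasses import dataclass


@dataclass
class CareerRecord:
    """Hypothetical per-player summary; field names are illustrative only."""
    name: str
    masters_or_grand_prix_titles: int = 0
    grand_slam_finals: int = 0        # finals reached, won or lost
    year_end_top10_finishes: int = 0  # seasons finished inside the top 10


def is_eligible(player: CareerRecord) -> bool:
    """A player enters the model if he meets at least one of the three criteria."""
    return (
        player.masters_or_grand_prix_titles > 0
        or player.grand_slam_finals > 0
        or player.year_end_top10_finishes > 0
    )


# Made-up records, not real career data.
players = [
    CareerRecord("Player A", masters_or_grand_prix_titles=1),
    CareerRecord("Player B", year_end_top10_finishes=2),
    CareerRecord("Player C"),  # smaller titles only, never top 10 at year end
]
eligible_names = [p.name for p in players if is_eligible(p)]
print(eligible_names)
```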
I would also like to emphasize that my model rewards accomplishments but does not punish failures. For example, Nadal might have lost in the first round of Wimbledon this year, but it doesn't matter. He of course gains no points for not advancing, but he doesn't lose anything either.
What are the accomplishments analyzed? Well, these are the brackets:
1. Any tournament win.
2. Reaching the final of a Masters Series tournament. For purposes of the model, I decided to treat the WCT and the Grand Slam Cup (circa 1970-89 and 1990-99, respectively) as Masters Series tournaments.
3. Winning a Masters Series tournament.
4. Reaching the final of the World Cup, World Masters, or whatever that end-of-the-year tournament was called over the years.
5. Winning it. Notice that for the purposes of this model, I treated an Olympic win / final as a World Masters win / final. I believe it is as prestigious historically, although the ATP rewards it with fewer points.
6. Reaching a GS final.
7. Winning a GS.
In terms of value for each of these accomplishments, my aim was to reach an analytic number for:
GS > WC/OLYMPICS > GS final > WC/OLYMPICS final > Masters series/WCT/grand slam cup > Masters series etc. final > any other tournament win.
On to the values:
1. Any 250-500 tournament win = 1 point
2. Grand Slams: calculated as the total number of 250-500 tournaments won by players who won at least one Grand Slam, divided by the total number of Grand Slams won.
The number is 6.22, which kind of makes sense, seeing that a GS awards between 4 and 8 times more ATP points than a 250-500 tournament.
3. To solve for what a Masters Series or WCT win is worth, I added up the total number of those tournament wins in the Open Era and divided by the total number of Grand Slams won. I then benchmarked against the GS value of 6.22 I got earlier, so:
6.22 / ((MS+WCT WINS) / (GS WINS)) = 2.66
4. For the value of a Grand Slam final, I simply took the ratio between the number of players (count function, not sum) who won a Masters and the number who got to a GS final (105/85), and multiplied it by the value derived for a Masters Series win, 2.66. So: 2.66 * (no. of players to win a Masters / no. of players to reach a GS final) = 3.29
Another reason this makes sense is the resulting ratio between a GS final and a GS win, which in this model is 3.29/6.22 = 0.53. That is a better number for me than the ATP's 0.6 (1200/2000), because I believe that a win is just worth more. In fact, if I could have solved for a number around 0.45 I would have been even happier. But I am trying to be objective, and this is the number.
5. Since I have a final vs. win ratio for a GS, I just used the same 0.53 for a Masters Series final vs. a Masters Series win, so a MS/WCT/GS Cup final is worth 1.41 points.
6. Perhaps the hardest and most subjective was the WC, or the World Tour Finals. Here I actually used the ATP. If, according to the ATP, a WC is worth exactly the average of a GS and a Masters Series tournament, well, that is what I will use too. The reason for this is that there have been too few of these (only one a year) to really play with the numbers.
So the value is the average of the GS and MS values: (6.22+2.66)/2 = 4.44
7. Using the final/win ratio here (0.53) gives a value of 2.34 for a WC or Olympic final.
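To make the arithmetic above easy to re-check, here is a small Python sketch that chains the steps together. It takes 6.22, 2.66 and the 105/85 player counts straight from the text as inputs (the underlying title counts are not given here), and the function and key names are made up for illustration. Because it works from those rounded inputs, the last value comes out at about 2.35 rather than the 2.34 quoted above, presumably because the spreadsheet carries more decimals.

```python
def derive_values(gs_win=6.22, ms_win=2.66, masters_winners=105, gs_finalists=85):
    """Chain the value derivation from the rounded figures quoted in the post."""
    gs_final = ms_win * masters_winners / gs_finalists  # step 4
    final_ratio = gs_final / gs_win                     # final vs. win ratio
    ms_final = ms_win * final_ratio                     # step 5
    wc_win = (gs_win + ms_win) / 2                      # step 6
    wc_final = wc_win * final_ratio                     # step 7
    return {
        "GS win": gs_win,
        "WC/Olympics win": wc_win,
        "GS final": gs_final,
        "MS/WCT/GS Cup win": ms_win,
        "WC/Olympics final": wc_final,
        "MS/WCT/GS Cup final": ms_final,
        "250-500 win": 1.0,
    }


values = derive_values()
# Print the accomplishments from most to least valuable.
for name, value in sorted(values.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:22s} {value:.2f}")
```

Changing any of the four inputs (for example, trying a different GS value) re-derives everything downstream in one call.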
All right then, we have all the values, and now it's just a question of plugging in. The values follow the intended order almost exactly; the one exception is that a Masters Series win (2.66) comes out slightly above a WC/Olympic final (2.34), so the resulting order is:
GS > WC/OLYMPICS > GS final > Masters series/WCT/grand slam cup > WC/OLYMPICS final > Masters series etc. final > any other tournament win.
NOT QUITE!!!!!!!!!!!
You see, this is not enough. There is one more thing I needed to solve for: the strength of competition, or what a GS in the year 1985 is really worth vs. one in, say, 2011. How do you solve that?
Well, the next stage was to actually assign a number, or factor if you will, to each season, and for this, for the first time in this model, I used the world rankings.
Note: I only have the world rankings beginning in 1973. I arbitrarily assigned a number to 1968-1973 just because I didn't have the energy to start digging and thinking about how to tackle this one. After all, I think I did enough so far.
Assigning season factors was done this way:
1. First, I took the first phase of my analysis and plugged in to get numbers for every single player who fit the criteria.
2. Then I took the ATP year-end world rankings, threw away anyone who didn't finish in the top 10, and looked at the top 10 only.
3. Next, I assigned a value percentage to each top 10 position, in the following fashion, which I will illustrate with an example.
Say Federer, for the sake of argument, has 100 points in my model (he actually has more, we'll get there; I am just trying to make my point).
If he ended the given year at No. 1, his season contribution would be the whole 100 points.
However, if he ended the season at No. 2, he would only be assigned 90% of his total points. At No. 3 he would have 80%, etc.
Note that if a player never got to the No. 1 position (say Murray), then 100% of his points are at No. 2 (his highest achieved position), so if he ends the season at No. 3, he gets 90% of the points, etc.
4. Next we add up all the points for the top 10.
5. However, there is one more correction. A player can't get more points than the players above him in any given year. For example, say Federer ends at No. 3 and gets 80% of his total points. These 80% might be higher than the number of points that Djokovic or Murray have, even though they are ranked higher and got 100%, just because Federer accomplished more.
Since that is impossible, a player can never get more points than any player above him in the rankings. He is limited by a ceiling set by those above him, so what he really gets is the MINIMUM of (his % of his total points) and (the points of those above him).
6. And again, we sum all the totals for the top 10, after the corrections.
7. Now we actually have a number, or a value, for every season. Here a new problem arises.
You see, a really good season (e.g. 2009) can be worth up to three times more than a really bad season (e.g. 2000). Since this is unfair, I did the one most controversial and up-for-debate thing in this whole model: I capped the difference, so that even the very worst season can never be worth less than 75% of the very best season.
Why 75%? No reason whatsoever, except that it sounded like a good number. If anyone wants me to change it, it is very easy to do so.
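The season-factor steps above can be sketched in Python as follows. This is one reading of the procedure, not the actual spreadsheet: the names are invented, "highest achieved position" is taken to mean the best year-end rank, the ceiling is applied as a running minimum going down the rankings, and the 75% floor is implemented as a linear rescale between the worst and best season (simply clamping low seasons at 75% of the best would be another valid reading). The sample numbers are made up.

```python
from dataclasses import dataclass


@dataclass
class TopTenEntry:
    name: str
    year_end_rank: int    # 1..10 in the season being valued
    best_rank: int        # assumed: best year-end rank of the player's career
    career_points: float  # player's total from the first phase of the model


def rank_share(year_end_rank: int, best_rank: int) -> float:
    """100% at the player's best rank, minus 10 points per place below it."""
    return max(0, 100 - 10 * (year_end_rank - best_rank)) / 100


def season_value(top_ten: list[TopTenEntry]) -> float:
    """Sum the top-10 contributions, capping each at the one ranked above."""
    total = 0.0
    ceiling = float("inf")
    for entry in sorted(top_ten, key=lambda e: e.year_end_rank):
        share = rank_share(entry.year_end_rank, entry.best_rank)
        contribution = min(share * entry.career_points, ceiling)
        ceiling = contribution  # nobody ranked below may exceed this
        total += contribution
    return total


def season_factors(values: dict[str, float], floor: float = 0.75) -> dict[str, float]:
    """Rescale season values so the best maps to 1.0 and the worst to `floor`."""
    best, worst = max(values.values()), min(values.values())
    if best == worst:
        return {season: 1.0 for season in values}
    return {
        season: floor + (1 - floor) * (v - worst) / (best - worst)
        for season, v in values.items()
    }


# Made-up three-player season: the No. 3 has the biggest career total.
sample = [
    TopTenEntry("No. 1", year_end_rank=1, best_rank=1, career_points=60),
    TopTenEntry("No. 2", year_end_rank=2, best_rank=2, career_points=30),
    TopTenEntry("No. 3", year_end_rank=3, best_rank=1, career_points=100),
]
print(season_value(sample))  # 60 + 30 + min(80, 30) = 120.0
print(season_factors({"season A": 300.0, "season B": 200.0, "season C": 100.0}))
```

The `floor` parameter is the 75% figure, so trying a different cap is a one-argument change.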
__________________
So, we now have two things:
1. The total value for each achievement of each player
2. The assigned value for a season's strength.
So, first things first, the season ranking (rank, then year, strongest season first):
1 1985
2 1982
3 1981
4 1978
5 1984
6 1979
7 1980
8 1990
9 1987
10 1983
11 2009
12 1975
13 2007
14 1994
15 2008
16 1977
17 2010
18 1974
19 2005
20 1976
21 1986
22 1989
23 1995
24 1988
25 2006
26 2011
27 1996
28 2004
29 1973
30 2012
31 1991
32 1992
33 1993
34 1998
35 1999
36 1997
37 2002
38 2001
39 2003
40 2000
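How the two pieces are combined is not spelled out at this point, so the following Python sketch is only an assumption: each achievement's base value is multiplied by the factor of the season it was won in, and the products are summed per player. The function name, the factors and the achievement list are illustrative placeholders, not numbers from the model.

```python
# Base values quoted earlier in the post.
BASE_VALUES = {
    "GS win": 6.22,
    "GS final": 3.29,
    "WC win": 4.44,
    "WC final": 2.34,
    "MS win": 2.66,
    "MS final": 1.41,
    "title": 1.0,
}


def weighted_score(achievements, factors):
    """Sum base value times season factor over (year, kind) achievements.

    Assumption: season factors act as simple multipliers; the post does not
    state the exact combination rule.
    """
    return sum(BASE_VALUES[kind] * factors[year] for year, kind in achievements)


# Illustrative factors only: best season at 100%, worst at the 75% floor.
example_factors = {1985: 1.0, 2000: 0.75}
example_career = [(1985, "GS win"), (2000, "GS win"), (2000, "title")]
print(round(weighted_score(example_career, example_factors), 3))
```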