The dreaded EOY Rating thread

Actually ...

My data is not necessarily perfectly complete in this area, but ...

I show there were 1,948 players who had a 2021 year-end C, appealed up, and subsequently got a 2022 year-end C. Of these, 645 got a 2022 C that was lower than what they appealed up to. So a full third of these appeal ups came back down and could probably be considered inappropriate.

I'm not sure what @Creighton was thinking it would be, but a third is more than I would have thought.

For 2019 to 2021, the numbers are 2,443 and 729.

For 2018 to 2019, they are 2,018 and 630.

Amazing. Really appreciate the value you add to this forum. I never thought the system would be able to accurately project that many people back down.
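
For anyone who wants to check the arithmetic, here is a minimal Python sketch that turns the counts quoted above into shares. The cohort labels and the function name `share_back_down` are just illustrative; the counts are the ones from the post, which the author notes may not be perfectly complete.

```python
# Counts quoted in the post: (players who appealed up and later got a C,
# players whose later C was below the level they appealed up to).
cohorts = {
    "2018 to 2019": (2018, 630),
    "2019 to 2021": (2443, 729),
    "2021 to 2022": (1948, 645),
}


def share_back_down(appealed_up: int, came_back_down: int) -> float:
    """Fraction of appeal-up players whose next year-end C was lower."""
    return came_back_down / appealed_up


for label, (appealed_up, came_back_down) in cohorts.items():
    share = share_back_down(appealed_up, came_back_down)
    print(f"{label}: {share:.1%} came back down")
```

That works out to roughly 31%, 30% and 33% for the three periods, so "about a third" holds up across all of them.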
 
He thought "more often than not" the appeal up is inappropriate. And based on his experience it was inappropriate more often than not. Your data shows that, when it comes to appealing up, people are twice as good as the rating system at estimating their level. I am surprised by that. I would think at least 2/3 of people think they are better at tennis than they really are. Instead your data shows that despite our bias the rating system gets it wrong twice as often as the people questioning it. A good rating system would not have any appeals. And the fact that people are guessing their level better than the rating system just shows the NTRP rating system has issues.
 
And you are leaping to conclusions.

@Creighton noted that it is hard to get bumped down and that is true. When someone appeals up and plays at the new higher level and doesn't play (they can't anymore) at the lower level, just winning a handful of games each match with an occasional set win can be enough to keep someone at the higher level. Given that, you could make the case that you'd expect very few to get bumped back down and a third of players getting bumped down despite that means those appeals were really inappropriate, and the fact that two-thirds didn't get bumped down isn't really a stamp of approval for their appeal, but just an artifact of levels being sticky.
 
Schmke, great job putting numbers to this stuff. We can disagree with what the data shows, but you actually giving the data (or at least some decent data) is the starting point of any decent conversation. Here is a link to your blog:


I wonder what the breakdown of men and women is. I would predict that men tend to stay at the higher level they appeal up to more often than the women. But when they appeal down they tend to get bumped back up more often than the women.
 
Good point! I really should have sliced it by gender. A follow up blog for tomorrow.
 
I think the fact that the ratings are so sticky is proof they are inaccurate. It is an indictment against the whole system. Why should I be considered a 4.0 player just because I self rated as a 4.0 but if I play exactly the same but self rate as a 3.5 I am only a 3.5 player?
 
You make a valid point about the stickiness, although changing the algorithm to be more volatile would have other potentially negative side-effects.

But more importantly, if you are going to posit that the algorithm is no good, you may want to be careful making the point a few posts ago that 2/3 of appeal players remaining at level means they are better at knowing their level than the algorithm is.
 
I am not saying anything about the algorithm itself as I do not know what it is. I assume it is fine. People keep talking about the "algorithm" as if that is the be all and end all. But the algorithms are by and large well established math that people can tweak based on some preferences.

If you played chess at a site for a year and said you wanted to appeal your chess rating up 50 points (which is about a quarter of a level in chess), I think about 95% would be wrong. Why is that? It is not because all these chess sites have some super secret algorithm that no one else can figure out. Not at all. Elo and Glicko systems have been around for decades. The reason is that if you play chess on a site you will have many games that are used to calculate your rating, based on a large number of people all in the same pool. They do not separate juniors and adults or women and men. They also don't refuse to rate official matches that the players want rated. USTA does all of these things wrong, and their explanations sometimes make it seem like they don't want people to care about the ratings, so they are deliberately making it not so accurate.

I don't understand why they don't take this even halfway seriously. I mean at the very least stop having two separate rating systems for men and women. Allow the two halves of the tennis world to play each other. That alone would massively increase the accuracy of the rating system.

These other rating systems like UTR and WTN just do inexplicably stupid things like putting ceilings and floors in ratings. I really don't understand why no one makes a decent rating system. Now it could be with WTN that there is something wrong with the algo. I see a guy where both he and his partner are much better rated than their opponents. Yet they lose more games than they win and he still improves his rating. I find that completely unacceptable. Does no one at high level tennis know anyone that learned high level stats? It's like tennis players are horrible at math or something.
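
To make the Elo point concrete, here is a minimal sketch of the standard Elo expected-score and update formulas. This is textbook Elo only; it is not a description of how NTRP, UTR or WTN actually compute anything, and the K-factor of 20 and the example ratings are assumptions picked for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for player A against player B (0 to 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def updated_rating(rating_a: float, rating_b: float, score_a: float,
                   k: float = 20.0) -> float:
    """New rating for A after one result: 1 = win, 0.5 = draw, 0 = loss.

    k is an assumed K-factor; real systems choose their own.
    """
    return rating_a + k * (score_a - expected_score(rating_a, rating_b))


# Hypothetical example: a 1600 favourite plays a 1500 underdog.
favourite, underdog = 1600.0, 1500.0
print("expected score for favourite:", round(expected_score(favourite, underdog), 2))
print("favourite after a loss:", round(updated_rating(favourite, underdog, 0.0), 1))
print("favourite after a win:", round(updated_rating(favourite, underdog, 1.0), 1))
```

Under plain Elo the favourite is expected to score roughly 0.64 here, so a loss always costs rating points. A system where a clear favourite underperforms and still goes up is doing something beyond this basic update, which is the behavior the post is objecting to.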
 
I think there are probably a good number of players who have a fairly stable ability level from year to year, and that ability happens to fall right near the border of NTRP levels. Even stable-ability players are going to have variance in their dynamic NTRP from match to match. If someone's NTRP bounces around between say 3.46 and 3.54 over the course of the year, it can be a bit of a coinflip on whether they end on 3.5C or 4.0C. If they end on 3.48 (3.5C), appeal to 4.0A, then end on 3.49 (3.5C) the next year, does that mean it was "wrong" for them to be a 4.0A?

If all the successful appeal players were these "coinflip" type of players then you'd expect 50% of them to get a C rating back at their old level and 50% of them to stay. Not so far off from the 30-40% who get rated back at the old level in reality. The difference could be explained by the subset of A-rates who truly improved or declined in ability in the direction of their appeal.
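
The "coinflip" idea can be sketched with a simple noise model. The assumptions here are mine, not anything USTA publishes: the year-end dynamic rating is treated as the player's true level plus normally distributed noise, the 3.5/4.0 boundary sits at 3.50 as in the example above, and the noise size of 0.04 is a made-up value chosen to match the 3.46 to 3.54 bounce described.

```python
from statistics import NormalDist


def prob_rated_back_down(true_rating: float, noise_sd: float,
                         boundary: float = 3.50) -> float:
    """Chance the year-end rating lands below the boundary.

    Assumes year-end rating = true_rating + Normal(0, noise_sd) noise,
    which is an illustrative model, not the NTRP calculation.
    """
    return NormalDist(mu=true_rating, sigma=noise_sd).cdf(boundary)


for true_rating in (3.46, 3.50, 3.54):
    p = prob_rated_back_down(true_rating, noise_sd=0.04)
    print(f"true level {true_rating:.2f}: {p:.0%} chance of a C below 3.50")
```

A player sitting exactly on the boundary comes out at 50%, and one who has genuinely moved a bit above it drops well below that, so a mix of borderline and truly improved players landing in the 30-40% range is plausible under this kind of model.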
 
You make a valid point about the stickiness, although changing the algorithm to be more volatile would have other potentially negative side-effects.

But more importantly, if you are going to posit that the algorithm is no good, you may want to be careful making the point a few posts ago that 2/3 of appeal players remaining at level means they are better at knowing their level than the algorithm is.
@schmke does this mean that if a player self rated lower than their actual level (say 2.5 vs. low end 3.5 or high end 3.0) and only played the minimum 3 matches, they could sneak in with a C rating lower than their actual?
 
I think the fact that the ratings are so sticky is proof they are inaccurate. It is an indictment against the whole system. Why should I be considered a 4.0 player just because I self rated as a 4.0 but if I play exactly the same but self rate as a 3.5 I am only a 3.5 player?
Because it is impossible to create an algorithm that is absolutely perfect to model something as volatile as amateur sports skill, so there is an overlap between all combinations of adjacent levels. If you both stay 3.5 if you self rate at 3.5 and stay 4.0 if you self rate at 4.0, then it's likely you're in the overlap area between 3.5 and 4.0, so the level at which you self-rate (and self-identify) is actually the best information available to differentiate between whether you are most appropriate at 3.5 or 4.0.
 
Either the rating system is sticky or it is not.

And by “rating system” I mean much more than the algorithm. I mean how wide the levels are, the rules about what levels people can play in, the different rating systems for different genders, which matches are ignored by the rating system, etc. Many people here keep talking about the algorithm as if that is the key. As long as it doesn’t have bugs there are plenty of sensible algorithms.

If the rating system is sticky it is not accurate, for the reasons I gave above.

Tom suggests different reasons why people can correct the rating system about their level and stay, 67% of the time, at the level the system declined to put them at. However, when it comes to people thinking they are better than their rating, I do think subjective bias would come into play. So I would think that without stickiness about 67% would get bounced back down. Instead we see the opposite.

We don’t know the thresholds. In the past Creighton had a link to some old thresholds that were pretty narrow. I’m not sure if the appeal up threshold is as narrow as the appeal down threshold.

We know how many appeals are granted, but what percent of the people who could appeal up actually do appeal up? And is there a likely difference between men and women? Do we think about 5 percent of men that could appeal up do appeal up? We can then get an idea of the threshold, but it is hard to say.
 
@schmke does this mean that if a player self rated lower than their actual level (say 2.5 vs. low end 3.5 or high end 3.0) and only played the minimum 3 matches, they could sneak in with a C rating lower than their actual?

I know this was meant for schmke but I think the answer is pretty clear.
For women playing in a 2.5 league the answer is yes, of course. Men often don’t have 2.5 leagues, so it may not be as easy. On the other hand, since men don’t have a 2.5 league, if a 3.5 plays in a 3.0 league they may run into many 2.5 players, and even if they beat them 6-0 6-0 that would be the expected score for someone .49 higher, which means they would still be rated 3.0. If they lose any games, well, that will really lower their rating.

The reverse is also true. If you are a 3.0 player but self rate as a 4.5 (or 5.0 if you have that league) and play a few doubles matches against other 4.5s (5.0s), it is unlikely you will get double bumped down. So you will have your 4.0C (4.5C) rating.
 
Why the obsession with edge cases? I am sure it will happen to a tiny tiny population but as long as there is a healthy league size, everyone will end up playing the correct level eventually. Key word is eventually. Some players will always be out of band no matter how good the rating system is. People just need to play enough matches to be moved to the right level.

Are ratings sticky? It depends on what you mean by sticky. For the most part, the ratings seem to have pretty good groupings. A lot of players I know have bumped up as their performance improves. I think what people struggle with is that improving your tennis game as an adult is a pretty slow and long process, but people want fast/quick results. Like, I am hitting my forehand 5% "better" so I should be the next rating level now. At the end of the day, that doesn't matter at all.

I have gone from a 3.5S who lost almost every match to a 4.5 currently over 8-9 years. I am ending this year with around a 70% win rate. Personally I don't feel like the rating system is sticky.
 
That's pretty good. I picked up tennis later in life (late 40s). I started playing USTA league last year (3.0S) and moved up to 3.5C this month. I hope I will continue to improve.
 
If a 2.5 woman is equal to like a 2.0 man, how come there are 2.5 women's leagues but not 2.5 men's leagues?

There are some men who do poorly enough in 3.0 leagues that they receive a 2.5C rating. I see there are around 25 of them in my area, plus another 10 or so who got 2.5M ratings. Potentially enough to form a small league, and my area is not that big, but I think most places just don't generate enough interested players to form a 2.5 men's league.

Now, is it true that even these 2.5C men could consistently beat any 2.5 woman? Maybe. On the one hand, it seems reasonable to think that any person regardless of gender starts out exactly the same as a true first-time-holding-a-racket beginner. But on the other hand, someone playing league is not really a true beginner - when you reach the point where you actually want to spend money to compete in tennis matches, you've already reached some level of skill. And perhaps at that level there is already a substantial gender gap.
 
If there was a 2.5 men’s league that advanced to nationals, plenty of men would sign up. 3.0 men everywhere would be trying to get bumped down and appeal down so they could make a run at nationals. Every club in America would want to hang a 2.5 state championship banner on their fence as a tribute to their beginners programs and teaching skills. USTA is missing a lot of revenue here.
 
@schmke does this mean that if a player self rated lower than their actual level (say 2.5 vs. low end 3.5 or high end 3.0) and only played the minimum 3 matches, they could sneak in with a C rating lower than their actual?
Absolutely. A strategy some try to employ. Self-rate as low as allowed, play the minimum three matches at that level and get a C rating that is quite possibly artificially low, but now they have a C and are golden.
 
There used to be 2.5 men's nationals. I remember reading about a Pennsylvania team where some like 6'8 dude was playing as a 2.5.

I can't imagine how embarrassing that would be
 
Absolutely. A strategy some try to employ. Self-rate as low as allowed, play the minimum three matches at that level and get a C rating that is quite possibly artificially low, but now they have a C and are golden.
Yes, I thought so, and I see certain areas encouraging players to do it. I had an opponent tell us to use this strategy after beating us at state, telling my self-rated player we were crazy for not implementing it.
 
I've noticed UTR hasn't posted USTA Adult League match results in about a month. They're unable to gain access to the match data. Does anyone know more about this? Has USTA terminated its API?
 
This happens every year about this time.

If UTR could not access USTA match data, it would die. Almost every adult’s UTR is calculated using USTA matches as the primary source.
 
I would be surprised if UTR uses an API; I suspect it scrapes the data the same way that TR, etc. do.
Why scrape if an API is available? Besides, UTR said they're unable to retrieve USTA data right now. If they were scraping TennisLink, they'd have the data since TennisLink is current.
 
Why the obsession with edge cases? I am sure it will happen to a tiny tiny population but as long as there is a healthy league size, everyone will end up playing the correct level eventually. Key word is eventually. Some players will always be out of band no matter how good the rating system is. People just need to play enough matches to be moved to the right level.

Are ratings sticky? It depends on what you mean by sticky. For the most part, the ratings seem to have pretty good groupings. A lot of players I know have bumped up as their performance improves. I think what people struggle with is that improving your tennis game as an adult is a pretty slow and long process, but people want fast/quick results. Like, I am hitting my forehand 5% "better" so I should be the next rating level now. At the end of the day, that doesn't matter at all.

I have gone from a 3.5S who lost almost every match to a 4.5 currently over 8-9 years. I am ending this year with around a 70% win rate. Personally I don't feel like the rating system is sticky.


Were you bumped down to 3.0 after you lost almost every match as a 3.5s?
 
If a 2.5 woman is equal to like a 2.0 man, how come there are 2.5 women's leagues but not 2.5 men's leagues?

Just to be clear, I’m not saying a 2.5 woman is like a 2.0 man.

I’m saying that USTA crams 4 rating levels into two for the men. Women’s 2.5, 3.0, 3.5, and 4.0 are all crammed into 3.0 and 3.5 for men. Purestriker points out that there may be 2.5 leagues in places, but there is no national competition for men's 2.5.

I disagree that 2.5 men have anything to be ashamed of, any more than 3.5 men. Let’s not kid ourselves. The bottom half of adult rec tennis is rarely a thing of beauty. But “rare” does not mean “never.” So it’s fun to show up hoping for a few rare shots that work just right, whether you are 2.5 or 3.5.
 
I think USTA's system is too rickety to provide an API, and it's not really in their interest to do so anyway. It is extremely difficult to prevent anyone from scraping data like this, so if the sites aren't updating because of something USTA did, it likely wasn't a counter-scraping change but just something else that the existing scripts haven't been updated for. Captchas are really the only tool that can prevent scraping, and even those have issues.
 
Here's a link to the USTA API Developer portal (it's located on the main page of TennisLink):

 
Yes, I thought so, and I see certain areas encouraging players to do it. I had an opponent tell us to use this strategy after beating us at state, telling my self-rated player we were crazy for not implementing it.

I've seen people doing it, generally from overseas where the previous history isn't picked up by USTA. You can generally tell because in matches during their self-rated year they generally seem to be playing at half speed. Sometimes they go ahead and dominate some matches (like combo) that don't count for ratings. Always fun to play them the next year and get whipped. Their captains just shrug and say they are computer rated.
 
I don't see why not.
I did a quick review of the API documentation and I did not see any API calls that allowed an external user to pull past USTA match history. This API appears to allow you to create new USTA users and provide current ratings on a particular USTA member.

The USTA API provides a toolkit for accessing and contributing USTA data from properly credentialed partner applications. Client applications can retrieve player information and play history, scouting profiles, World Tennis Numbers and more. Participating vendors can create USTA player accounts on behalf of their customers and contribute play history for incorporation into World Tennis Number. New features and functionality are being constantly added.

@ThinkPad, did you see any API documentation examples of pulling past USTA match history using the API?
 
That is almost certainly not what UTR and TR are using. @schmke would know better than me, so perhaps he can weigh in.
I have no inside knowledge on what UTR and TR use, but my guess would be they've been web-scraping TennisLink. The API dev portal is reasonably new and so I don't believe they could have been using it from the start of what UTR and TR have been doing.
 
If you consider that the top 3.5 men compete with top 4.0 or even low 4.5 females and they all get lumped into two groups, 3.0 and 3.5, lots of people are going to be wasting their time. That’s why I’m always surprised people say they play USTA to have competitive matches.

People act like mixed doubles is an entirely different game. But it actually is not that different from same gender doubles. But USTA ignores that data and so the ratings are not accurate.

Don’t forget matchups aren’t made just by rating, at least in terms of league play. There’s a captain involved, and any halfway decent captain is going to try to build a decent team and put people in the lineups where they make sense. Certainly sandbagging and stacking are issues, especially in playoffs, but it’s not that common to show up to an average league match and have most of the courts be non-competitive because of the broad level definitions.

And mixed is a different game, precisely because there’s such a big difference between men’s and ladies’ levels. Most of the time it’s a game of a strong partner trying to figure out how to get the most utility out of a much weaker partner and take up the rest of the slack. That can also happen in adult leagues, but it’s not the norm like it is in mixed.
 
How about the 7 men and 2 women who have appealed from 4.5 up to 5.0, according to @schmke? Hard to imagine the thinking there. Possibly long-time 4.5's who just really wanted to finally achieve 5.0 as a trophy, or maybe long-time 5.0's who couldn't stand the ego blow of getting bumped down?
I always hit the appeal button every year including this year, even though I knew the odds of an appeal being granted were almost nil. Why not? If there is a rating anomaly that gifts you a free year as the best player in the league, why not take it?
If memory serves, when you hit the appeal button you don't specify whether you are appealing up or down. It just moves you to the level that you are within range of appeal for.
So as a possible answer to what @TennisOTM was wondering about, imagine you are @J_R_B and are in the habit of just hitting the appeal button every year in the hopes of a fluke appeal down, and then oops, you are granted an appeal up!!!
 
No, you can select up or down. I don't know why you would want to appeal up.
 
I appealed up so I can play 4.0 this summer. Just finished a USTA match today, they stacked the two highest rated players on court two, one of which just got bumped to a 4.0. Beat them with a mid level 3.5 on my team. I think they died a little inside when they saw I had just appealed up from a 3.0C. Lol.
 
If memory serves, when you hit the appeal button you don't specify whether you are appealing up or down. It just moves you to the level that you are within range of appeal for.
So as a possible answer to what @TennisOTM was wondering about, imagine you are @J_R_B and are in the habit of just hitting the appeal button every year in the hopes of a fluke appeal down, and then oops, you are granted an appeal up!!!
No, you definitely select up or down. I wouldn't take that chance. I played up a couple matches last year (only won one against a team that also had a 4.0 guy playing up, so not really "bump up" results there...), so there was much more danger that I would be in the up range than the down range, so it would not be worth the risk if I couldn't pick up or down.
 
Thanks, I stand corrected... been a while since I appealed and for whatever reason I don't remember specifying up/down.
 
I appealed up so I can play 4.0 this summer. Just finished a USTA match today, they stacked the two highest rated players on court two, one of which just got bumped to a 4.0. Beat them with a mid level 3.5 on my team. I think they died a little inside when they saw I had just appealed up from a 3.0C. Lol.

What were you doing at 3.0 if you're good enough to beat 4.0s?
 
I’ve been playing for 11 months. Last year I was a 3.0 when I played most of my matches in May/June, and my year end C rating was based heavily on those matches. I no longer play like a 3.0, hence the reason I appealed.
 