Dear UTR: you might want to take a closer look at your algo

schmke

Legend
After you use the appeal option, the options change. But the “Appeal Available” still appears on the TR site next to my rating because TR hasn’t been updated since I tested the appeal on the usta source site.
Apologies, I thought you were referring to something on TennisLink. Anything TR lists isn't rooted in reality; judging by their performance this time around, they have no clue (well, very little) whether you are in range of an appeal.
 

travlerajm

Talk Tennis Guru
Apologies, I thought you were referring to something on TennisLink. Anything TR lists isn't rooted in reality; judging by their performance this time around, they have no clue (well, very little) whether you are in range of an appeal.
It actually did appear on TennisLink also, but only until I used the appeal option. I’m pretty sure TR is taking it directly from the USTA site (i.e., it has nothing to do with the TR algorithm), because I’ve never seen it before on either site. I am guessing it will disappear from TR next week when it next updates. TR is re-calibrated to the USTA computer ratings every year by force-fitting the discrepancies to match the USTA computer.

That said, I’m not sure what USTA’s criteria for the appeal option are. It might be something else about my status (such as my return from a long absence?) that makes me eligible.
 

Moon Shooter

Hall of Fame
I just played my first USTA match in nearly 2 years.
8.0 mixed.
.......

UTR now says I’m a 4. A UTR 4.xx with 2 decimal place accuracy. Same as my female 3.5 mixed partners. Is it the rating decay with inactivity thing?


The other issue is that, as a practical matter, there are only a very small number of doubles matches where a team of a single gender plays against a team containing the other gender. This means that the UTR doubles rating system almost certainly does, in practice, have a separate rating for men and women. I would bet that if you wanted to increase your UTR rating you could play on a team with two men against teams of two women. How much this would increase your UTR doubles rating might vary, because some areas might have more matches like that in the pool than others. But on the whole I bet that would lead to a considerable increase in rating. When I look at the opponents I played, the male player was typically much more of a threat than UTR's rating suggested.
 

travlerajm

Talk Tennis Guru
The other issue is that, as a practical matter, there are only a very small number of doubles matches where a team of a single gender plays against a team containing the other gender. This means that the UTR doubles rating system almost certainly does, in practice, have a separate rating for men and women. I would bet that if you wanted to increase your UTR rating you could play on a team with two men against teams of two women. How much this would increase your UTR doubles rating might vary, because some areas might have more matches like that in the pool than others. But on the whole I bet that would lead to a considerable increase in rating. When I look at the opponents I played, the male player was typically much more of a threat than UTR's rating suggested.
My main point, that many here are missing, is that these discrepancies like the example you cite don’t have to be there.

If UTR would simply fix the algorithm with a few tweaks, it would be much more accurate.

1. Do not adjust ratings based on performances of prior opponents.
2. Do not make any assumptions about rating change during period of inactivity.
3. Don’t throw out match data from more than 12 months, unless the player is a junior.

Make these fixes to repair the self-defeating issues, and the algorithm will no longer require large sample sizes to settle in. Which means UTR won’t have to put the “limited data” disclaimer on the page all the time.
 

J011yroger

Talk Tennis Guru
My main point, that many here are missing, is that these discrepancies like the example you cite don’t have to be there.

If UTR would simply fix the algorithm with a few tweaks, it would be much more accurate.

1. Do not adjust ratings based on performances of prior opponents.
2. Do not make any assumptions about rating change during period of inactivity.
3. Don’t throw out match data from more than 12 months, unless the player is a junior.

Make these fixes to repair the self-defeating issues, and the algorithm will no longer require large sample sizes to settle in. Which means UTR won’t have to put the “limited data” disclaimer on the page all the time.

Don't count doubles matches where there is more than 2.0 point difference between partners.

I am 5-6 points higher than my 9.0 partner.

I would win more points against Nadal than she would win against me.

J
 

Moon Shooter

Hall of Fame
My main point, that many here are missing, is that these discrepancies like the example you cite don’t have to be there.

If UTR would simply fix the algorithm with a few tweaks, it would be much more accurate.

1. Do not adjust ratings based on performances of prior opponents.
2. Do not make any assumptions about rating change during period of inactivity.
3. Don’t throw out match data from more than 12 months, unless the player is a junior.

Make these fixes to repair the self-defeating issues, and the algorithm will no longer require large sample sizes to settle in. Which means UTR won’t have to put the “limited data” disclaimer on the page all the time.

You are correct that they make deliberate decisions about the algorithm that simply make it worse. It is for PR reasons and marketing.

I think 3 is the key. But they keep marketing it as the same system regardless of age, gender, etc. Of course, changing the weight of recent matches for a 13 year old the same way you change the weight of recent matches for a 50 year old makes no sense. But they are marketing this bug as a feature. I think they could keep the same algorithm and just weight past matches much less if you have many recent matches. So a match from 28 months ago would count much more toward your rating if you have only played 2 other matches since then. But if you have played 20 matches in the last 6 months, then the match from 28 months ago should be negligible.
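
The weighting idea in the paragraph above can be sketched roughly as follows. This is only an illustration, assuming an exponential decay whose half-life shrinks as the number of recent matches grows. The function name, the 6-month window, and the half-life constants are all hypothetical and are not anything UTR has published.

```python
def match_weights(ages_in_months, recent_window=6.0,
                  base_half_life=24.0, min_half_life=3.0):
    """Return one weight per match; old matches fade faster for active players.

    All constants are hypothetical, chosen only to illustrate the idea.
    """
    # Count how many matches fall inside the "recent" window.
    recent_count = sum(1 for age in ages_in_months if age <= recent_window)
    # Assumed rule: every recent match shortens the half-life, down to a minimum.
    half_life = max(min_half_life, base_half_life / (1 + recent_count))
    # Standard exponential decay: the weight halves every `half_life` months.
    return [0.5 ** (age / half_life) for age in ages_in_months]


# A player with only 2 other matches since the 28-month-old one.
sparse = match_weights([28, 14, 9])
# A player with 20 matches in the last 6 months plus the same old match.
active = match_weights([28] + [3] * 20)

print(round(sparse[0], 3), round(active[0], 5))
```

With these made-up constants, the 28-month-old match keeps a substantial weight for the inactive player and is close to negligible for the active one, which is the behavior described above.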

Maybe Schmke could verify what I am thinking, but I don't think they can do 1 because they have a floor and a cap in ratings. Consider this: today Djokovic has a 16.26 rating, Nadal has a 16.22, and Medvedev has a 16.21.

Now let's say Djokovic just starts playing so much better than the field that without the cap he would be rated 16.72, because he would beat Nadal and Medvedev as though he were .5 UTR points better than them. (I'm not saying it is likely at his age, but imagine he is younger for purposes of illustration.) Also assume that Nadal and Medvedev keep playing at the same strength, so they don't improve or get worse. Well, Djokovic can't be 16.72 because they made a rule that says you can't be higher than 16.5! So then what happens? If they just cap him at 16.50, then the ratings will not accurately predict outcomes between him and Nadal, or really anyone he plays. So in order for the rating to be a bit better at predicting how he will play against Nadal, Medvedev and the rest of the field, the rating system will have to lower Nadal's and Medvedev's ratings! But remember, they did not get worse; the only change was Djokovic playing even better. So if we lower their ratings even though they can still beat other top players by the same margin, their ratings will seem inaccurate relative to the rest of the field! So that means you have to lower the rest of the field's ratings. But then there is a floor of 1.0. So you just have to constantly shuffle everyone around to fit between 1.0 and 16.5.
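
To put numbers on the capping argument above, here is a minimal sketch. The 16.5 cap and the 16.72 / 16.22 / 16.21 figures come from this thread; the two functions (`clamp_only` and `shift_to_fit`) are made-up illustrations of the two options described, not how UTR actually handles the cap, and the 1.0 floor is ignored.

```python
CAP = 16.5  # rating ceiling discussed in this thread


def clamp_only(ratings):
    """Cap each rating independently; gaps at the top get compressed."""
    return {name: min(value, CAP) for name, value in ratings.items()}


def shift_to_fit(ratings):
    """Slide every rating down by the overflow so the gaps are preserved."""
    overflow = max(0.0, max(ratings.values()) - CAP)
    return {name: value - overflow for name, value in ratings.items()}


# Hypothetical "true" strengths: Djokovic has improved, the others have not.
uncapped = {"Djokovic": 16.72, "Nadal": 16.22, "Medvedev": 16.21}

clamped = clamp_only(uncapped)
shifted = shift_to_fit(uncapped)

# Clamping alone shrinks the Djokovic-Nadal gap below the original 0.50.
print(round(clamped["Djokovic"] - clamped["Nadal"], 2))
# Shifting keeps the gap, but Nadal's number drops without him playing.
print(round(shifted["Djokovic"] - shifted["Nadal"], 2), round(shifted["Nadal"], 2))
```

Either the gap stops matching the results, or players who did nothing see their numbers move; the sketch just makes that trade-off concrete.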

So the question is: why do they have a rule that no one can be rated over 16.5? It just seems like a dumb decision, because it will mean you have to constantly be shuffling everyone's ratings around every day. The other problem is that people will never know what a given rating difference is supposed to predict. I think in the NTRP system someone at USTA could tell me that if you are .28 rating points higher than your opponent, you should win about X% of the games. It may vary based on level, but they could probably tell you something along those lines if they wanted to. (They seem to treat their rating as some top secret formula, so they may not tell you, but they could.) But UTR can't. That is because the differences between players will vary based on how much better the current top player is than the floor. So having a cap means that a UTR rating will never translate to anything very concrete, such as: if you are .5 UTR points better than someone else, then you should lose only 5 games in a match; lose more than 5 and you lose rating points, lose fewer than 5 and you gain rating points. Since they have the cap, they can never tell us something concrete like that, and therefore their rating system cannot be easily tested. It is just a constant shambling hodgepodge of numbers. Why? Because they chose to make a rule that no one can ever have a rating above 16.5!
 

travlerajm

Talk Tennis Guru
Don't count doubles matches where there is more than 2.0 point difference between partners.

I am 5-6 points higher than my 9.0 partner.

I would win more points against Nadal than she would win against me.

J
But being a 5.0 ntrp male in 9.0 mixed is a statistically proven advantage, almost as much edge as being a 4.5 in 8.0. The best player on the court in doubles usually wins, so he should have the chance to prove it. It should still count, but maybe discount the expected win %.
 
Yeah, I think their fundamental issue is that the thing they're trying to do - a single universal tennis rating system - just isn't that great of an idea.

The rating system that works best for serious juniors (learn and grow rapidly, play a lot of matches) is probably not the same as the rating system that works best for rec adults in the US (play only a few rated matches per year, may slowly get better or worse) is probably not the same as the rating system that works best for pros.

And there's the added complication that it's even harder to make a rating system that works both for people who care about it (and thus change their schedule to improve it) AND for people who ignore it.

I doubt the folks at UTR are idiots making up these random rules just for the heck of it. If I had to guess, "drop matches older than 12 months" and "dynamically adjust ratings as your former opponents play more" both help rate juniors more accurately: kids change fast (just imagine a kid who gets a rating when they're 12 years old, then how different they can be after a growth spurt at 14). But both of those make the system entirely unusable for adult rec players, who change skill slowly and who only play a few rated matches per year.
 

schmke

Legend
1. Do not adjust ratings based on performances of prior opponents.

Maybe Schmke could verify what I am thinking, but I don't think they can do 1 because they have a floor and a cap in ratings.
I don't know that adjusting based on future performance of prior opponents has anything to do with the floor and cap.

The reason to do such adjusting is to give consideration to a result against a mis-rated player affecting your rating inappropriately, the idea being that if you lost to them when they were a 5.2 but after a few matches they were a 7.7, that loss was not nearly as bad as it seemed. If you don't do the future/prior adjustment, your rating is unfairly dinged. It can obviously go the other way too.

Of course, you can go too far with this future/prior adjustment, as they very well may have been a 5.2 when you played them and have improved since then to become the 7.7, and giving you credit for them being a 7.7 could be inappropriate too.

There is probably a balance in there somewhere; where exactly UTR is in finding that balance, I don't know.

I do not do any such future/prior adjustments in my ratings, other than what I do for year-end calculations, and I believe the USTA does something similar. But the algorithm I have used for doing ratings for football does have a future/prior factor, although the effect of the result against a prior opponent diminishes over time so if that opponent ends up being a lot better than they were when you played them, there is some effect but not too much if it was the first game of the season and we are now in week 16.
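
As one possible reading of the diminishing future/prior factor described above (a sketch only, not the actual football algorithm or anything UTR does), the opponent rating used for an old result could be a blend of their rating at the time and their current rating, with the share given to the current rating decaying each week. The function name and the `decay` value of 0.8 are assumptions made up for illustration.

```python
def effective_opponent_rating(rating_then, rating_now, weeks_elapsed, decay=0.8):
    """Blend an opponent's old and current rating for a past result.

    `decay` is a hypothetical per-week factor: the longer ago the match was,
    the less the opponent's later rating changes are credited to it.
    """
    credit = decay ** weeks_elapsed  # 1.0 right after the match, near 0 later
    return rating_then + credit * (rating_now - rating_then)


# The 5.2 -> 7.7 example from the post, for a recent match and an old one.
recent = effective_opponent_rating(5.2, 7.7, weeks_elapsed=2)
old = effective_opponent_rating(5.2, 7.7, weeks_elapsed=15)

print(round(recent, 2), round(old, 2))
```

Under these assumptions a loss from two weeks ago is treated mostly as a loss to a much stronger player, while a loss from fifteen weeks ago is treated almost as a loss to the 5.2 they were at the time.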

All of that has nothing to do with caps and floors. I think they have caps and floors just to provide consistency of a sort, e.g. knowing that 15+ means a pro, and 16+ a very top pro, the best recreational players are somewhere 8-11, etc. And you have to have a floor lest things go negative which would look bad.

Now, could some consistency be realized without a cap? Probably, an artificial cap does seem unnecessary, but they seem to think they need it for some reason.
 

Moon Shooter

Hall of Fame
I don't know that adjusting based on future performance of prior opponents has anything to do with the floor and cap.

The reason to do such adjusting is to give consideration to a result against a mis-rated player affecting your rating inappropriately, the idea being that if you lost to them when they were a 5.2 but after a few matches they were a 7.7, that loss was not nearly as bad as it seemed. If you don't do the future/prior adjustment, your rating is unfairly dinged. It can obviously go the other way too.
....

All of that has nothing to do with caps and floors. I think they have caps and floors just to provide consistency of a sort, e.g. knowing that 15+ means a pro, and 16+ a very top pro, the best recreational players are somewhere 8-11, etc. And you have to have a floor lest things go negative which would look bad.

Now, could some consistency be realized without a cap? Probably, an artificial cap does seem unnecessary, but they seem to think they need it for some reason.


Yes, I agree that they are likely making the daily changes in rating primarily to accommodate new data. And they are likely trying to strike a balance between allowing for the possible improvement (or decline) of a player and just having an accurate measure of a person's play.

I think the other reason is because matches are occasionally dropping off of the 12 month cliff so ratings will seem to change for no reason.

But I think the cap and floor are an additional reason that ratings have to change even when players are not playing or changing in strength. Just take the simplified case of Djokovic. Let's say Nadal doesn't play over the next 11 months, but Djokovic plays the players ranked 3-6 several times and posts phenomenal results against them and others. He plays well enough that, if there were no cap, he would go from 16.26 to 16.68 or something. But he is capped out at 16.50. So likely the rating system will only boost him to something like 16.45. (I'm assuming there has to be some sort of diminishing returns as you approach 16.5 so that players are not constantly burying the needle at 16.50.) So sure, the rating system may just move those he played down to compensate for his inability to go past 16.50. But what about Nadal, who didn't play at all? If he stays at his rating, then the comparison between Nadal's and Djokovic's ratings will not fully reflect Djokovic's recent very high performances. So it would seem Nadal's rating would need to shift downward to make it accurate, even though Nadal never played. If they didn't have the cap, then Djokovic would go up to 16.68 and you wouldn't have to mess with Nadal's rating.

Now I suppose they likely "stretch" 16.00-16.50 more than, say, 4.00-4.50 by using diminishing returns or something (like limits in calculus). I think that is likely the case. In other words, Djokovic being .04 higher than Nadal (16.26 versus 16.22) yields a higher predicted winning chance for Djokovic than a 4.67 player has versus a 4.63 player. So it is very hard to do what OP wants because of the cap. But maybe it could be done theoretically if we knew how they "stretched" the top end of 16.XX.
 

jdawgg

Semi-Pro
Maybe UTR doesn’t care, or isn’t looking to address, gradients of play at the ATP level. There’s rankings 1-1000 for that, with, you guessed it, 1,000 different rankings. Actually I’ve noticed even some UTR 12s have a global ranking on UTR. Like 3,000 or something.
 

jmnk

Hall of Fame
That said, the chess system sounds cool, not sure why they didn’t try to replicate that.
In what sense did they _not_ replicate the chess system? It is virtually identical. Obviously in chess they do not really lower the ranking due to non-activity - but that is more because of the nature of chess. In chess your level does not really significantly drop off due to age or inactivity. Plus the main problem with tennis ranking - which is lack of data/matches - does not really exist in chess so it is somewhat naturally more accurate. But it is not because the algorithm is different, it is because chess players actually play.

When you have a player that plays like 5-7 matches over 3 years, and 90% of them are mixed doubles there's really only so much you can do with that data.
 

TennisOTM

Professional
Don't count doubles matches where there is more than 2.0 point difference between partners.

I am 5-6 points higher than my 9.0 partner.

I would win more points against Nadal than she would win against me.

J

I like this idea but I'd put the cutoff at more like 4.0. A 2.0 cutoff would exclude the pro mixed doubles matches at Grand Slams, which are generally UTR 15/12 vs. 15/12.
 

S&V-not_dead_yet

Talk Tennis Guru
In what sense did they _not_ replicate the chess system? It is virtually identical. Obviously in chess they do not really lower the ranking due to non-activity

They used to. And people would sandbag by not playing any officially rated matches and then enter a big tournament. At some point, they changed it to the current method.

- but that is more because of the nature of chess. In chess your level does not really significantly drop off due to age or inactivity.

Yes, it does. Your understanding of the fundamentals is still sound but many things decline with age and inactivity:
- Calculation speed
- Concentration
- Familiarity with many patterns and knowing quickly what to do rather than spending precious time figuring it out
- Opening theory: lines which were sound 20 years ago may have been discovered faulty
- Stamina: putting out tremendous mental effort over a 2 hour game is very draining

My last rated game was > 30 years ago. The young me would crush the present me but my rating has stayed constant. If I enter a tournament, I'm gonna get whupped.
 

Moon Shooter

Hall of Fame
In what sense did they _not_ replicate the chess system? It is virtually identical. Obviously in chess they do not really lower the ranking due to non-activity - but that is more because of the nature of chess. In chess your level does not really significantly drop off due to age or inactivity. Plus the main problem with tennis ranking - which is lack of data/matches - does not really exist in chess so it is somewhat naturally more accurate. But it is not because the algorithm is different, it is because chess players actually play.

When you have a player that plays like 5-7 matches over 3 years, and 90% of them are mixed doubles there's really only so much you can do with that data.

In chess if you achieve a 1400 1600 1800 2000 2200 or 2400 etc rating that is an accomplishment proving you acquired a certain level of competence at the game that you have for the rest of your life. That competence will be recognized by people decades later. If I am 200 points lower than someone else I have an idea that I will lose approximately 75% of the games.
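
For reference, the 200-point figure above can be checked against the standard Elo expected-score formula, E = 1 / (1 + 10^(-d/400)), where d is the rating difference. A minimal sketch follows; the function name is just for illustration, and note that expected score counts a draw as half a win, so it is not strictly a win percentage.

```python
def elo_expected_score(rating_diff):
    """Expected score (win = 1, draw = 0.5) for the player `rating_diff` points ahead."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


higher = elo_expected_score(200)   # player rated 200 points above the opponent
lower = elo_expected_score(-200)   # the same matchup seen from the other side

print(round(higher, 2), round(lower, 2))  # 0.76 0.24
```

So a 200-point gap corresponds to an expected score of about 0.76 for the higher-rated player, which matches the "approximately 75%" in the post.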

Tennis ratings are a disaster. I was a utr 4.4? yesterday today I am a 6.03! great! except I didn't actually play any games to change that rating. Is that any sort of accomplishment? UTR is like a magic 8 ball. UTR does not keep track of your highest rating it often seems like a number pulled out of someone's rear end. No one can tell you what a UTR or USTA difference of .5 means. No one has any clue what that difference means. I might be a 3.5 that can play evenly with a 4.0 or I might be a 3.5 that would typically lose 6-0 6-0 to a higher 3.5. Tennis ratings are a joke compared to chess ratings.

The question is not how tennis ratings are different than chess ratings. The question is how is it even possible that people constantly devise tennis ratings in ways that make them unmeaningful in all the ways chess ratings are meaningful. It is only explained by conscious effort to make them meaningless and indeed that is what USTA pretty much admits. They say they are afraid people will focus on the number too much so they make it vague and practically meaningless.
 

travlerajm

Talk Tennis Guru
In chess if you achieve a 1400 1600 1800 2000 2200 or 2400 etc rating that is an accomplishment proving you acquired a certain level of competence at the game that you have for the rest of your life. That competence will be recognized by people decades later. If I am 200 points lower than someone else I have an idea that I will lose approximately 75% of the games.

Tennis ratings are a disaster. I was a utr 4.4? yesterday today I am a 6.03! great! except I didn't actually play any games to change that rating. Is that any sort of accomplishment? UTR is like a magic 8 ball. UTR does not keep track of your highest rating it often seems like a number pulled out of someone's rear end. No one can tell you what a UTR or USTA difference of .5 means. No one has any clue what that difference means. I might be a 3.5 that can play evenly with a 4.0 or I might be a 3.5 that would typically lose 6-0 6-0 to a higher 3.5. Tennis ratings are a joke compared to chess ratings.

The question is not how tennis ratings are different than chess ratings. The question is how is it even possible that people constantly devise tennis ratings in ways that make them unmeaningful in all the ways chess ratings are meaningful. It is only explained by conscious effort to make them meaningless and indeed that is what USTA pretty much admits. They say they are afraid people will focus on the number too much so they make it vague and practically meaningless.
Good summary of chess ratings vs tennis.
 

S&V-not_dead_yet

Talk Tennis Guru
In chess if you achieve a 1400 1600 1800 2000 2200 or 2400 etc rating that is an accomplishment proving you acquired a certain level of competence at the game that you have for the rest of your life. That competence will be recognized by people decades later. If I am 200 points lower than someone else I have an idea that I will lose approximately 75% of the games.

Tennis ratings are a disaster. I was a utr 4.4? yesterday today I am a 6.03! great! except I didn't actually play any games to change that rating. Is that any sort of accomplishment? UTR is like a magic 8 ball. UTR does not keep track of your highest rating it often seems like a number pulled out of someone's rear end. No one can tell you what a UTR or USTA difference of .5 means. No one has any clue what that difference means. I might be a 3.5 that can play evenly with a 4.0 or I might be a 3.5 that would typically lose 6-0 6-0 to a higher 3.5. Tennis ratings are a joke compared to chess ratings.

The question is not how tennis ratings are different than chess ratings. The question is how is it even possible that people constantly devise tennis ratings in ways that make them unmeaningful in all the ways chess ratings are meaningful. It is only explained by conscious effort to make them meaningless and indeed that is what USTA pretty much admits. They say they are afraid people will focus on the number too much so they make it vague and practically meaningless.

I've had very few blowout matches in USTA and when the score was lopsided, it was usually due to one person/team being at the higher end of a rating group and the other being at the lower end. I completely disagree that NTRP [and by association, UTR] is useless [my UTR fluctuated within roughly a 50 point range and would move to a lesser degree when I was less active, which is to be expected].

NTRP is vague in that I only know my level, not any gradation in between but it still accomplishes the most important thing: giving me competitive matches. The rest is more window dressing.

Looking at UTR, I see players with lower #s that I know are better than I am. Interestingly, I don't see much of the reverse. But it's still within the margin of error.

Do you get competitive matches using either NTRP or UTR?
 

DCNJ

Rookie
Tennis ratings are a disaster. I was a utr 4.4? yesterday today I am a 6.03! great! except I didn't actually play any games to change that rating.

Is that a good thing or a bad thing? I know I've heard a lot of higher-rated people dreading the effect on their ratings of playing younger players who are underrated (happens in normal times, but accentuated with the lack of OTB tournaments recently). With something like the UTR, the actual skill of the players could be better approximated (since the ratings lag development), and so losing to an 1800 who should really be a 2100 doesn't feel as bad.
 

DCNJ

Rookie
I think this lack of clarity as to how UTR is spitting out numbers does hurt its legitimacy. I think NTRP is much better since people can see how it is done and gain a better understanding.

I don't see how you can say that; I don't think either is particularly transparent with their exact algorithm, but UTR gives more information. In fact, in the ET interview with the main USTA person overseeing the ratings process I believe the person said that it was a lot of work to apply the ratings systems to the doubles environment and that they (the USTA) wouldn't explain how they did it (proprietary information or something like that). I don't have a problem with that, but to say that people 'see how it's done' is quite a statement....
 

S&V-not_dead_yet

Talk Tennis Guru
In chess if you achieve a 1400 1600 1800 2000 2200 or 2400 etc rating that is an accomplishment proving you acquired a certain level of competence at the game that you have for the rest of your life. That competence will be recognized by people decades later.

How's that different from tennis? If I achieve a UTR 10, isn't that also an accomplishment, proof of a certain level of competence?

If I am 200 points lower than someone else I have an idea that I will lose approximately 75% of the games.

It would be interesting to see how Elo deals with small sample sizes, like we saw in tennis during the pandemic. As long as the sample size is big enough, I would guess that UTR does pretty well. It doesn't do so well with limited data [during the pandemic, my singles UTR went up 200 points without me playing any matches. According to their stats, at one point I was 23,000th in the world.].
 

Moon Shooter

Hall of Fame
How's that different from tennis? If I achieve a UTR 10, isn't that also an accomplishment, proof of a certain level of competence?

Who knows? I achieved a rating of 6.03. What does that mean? They say that if I am 2 points higher than someone else, I am so much better that they won't even rate the match unless I lose. Well, I was 4.4 yesterday. So am I so much better today that the player I was yesterday could not compete? What does a .5 difference in rating mean in UTR? What does it mean in USTA?

If you don't know what that means then the rating is meaningless. In chess people know what the ratings mean. Therefore they are meaningful.


It would be interesting to see how Elo deals with small sample sizes, like we saw in tennis during the pandemic. As long as the sample size is big enough, I would guess that UTR does pretty well. It doesn't do so well with limited data [during the pandemic, my singles UTR went up 200 points without me playing any matches. According to their stats, at one point I was 23,000th in the world.].


The sample size is small because the rating groups choose to limit the games counted. UTR cuts off at 12 months. USTA won't even rate many of their own matches let alone matches outside of USTA. That means the sample sizes are small by design and therefore the ratings are crap. And because the ratings are crap people don't bother posting games to be rated. Probably 70% of my matches are not rated.
 

jmnk

Hall of Fame
They used to. And people would sandbag by not playing any officially rated matches and then enter a big tournament. At some point, they changed it to the current method.
My understanding is a bit different. I thought the way it worked/works is that no matter what your _current_ ranking might be, you can never enter a tournament for a level lower than your all-time highest level. So if, let's say, you achieved 1850 Elo in 2018 and now you are 1775, you still cannot enter an under-1800 tournament.

Yes, it does. Your understanding of the fundamentals is still sound but many things decline with age and inactivity:
- Calculation speed
- Concentration
- Familiarity with many patterns and knowing quickly what to do rather than spending precious time figuring it out
- Opening theory: lines which were sound 20 years ago may have been discovered faulty
- Stamina: putting out tremendous mental effort over a 2 hour game is very draining

My last rated game was > 30 years ago. The young me would crush the present me but my rating has stayed constant. If I enter a tournament, I'm gonna get whupped.
well, of course your level drops with age and/or inactivity but that drop off is nowhere near close to what happens in tennis.
 

jmnk

Hall of Fame
really :rolleyes:. I'm not sure what your obsession is with proving that chess ranking is so much better than NTRP/UTR, but your arguments are completely wrong.
In chess if you achieve a 1400 1600 1800 2000 2200 or 2400 etc rating that is an accomplishment proving you acquired a certain level of competence at the game that you have for the rest of your life. That competence will be recognized by people decades later.
if I ever find a person that is impressed by someone getting 1400, 1600, or 1800 Elo chess ranking then that would be the first such person I have encountered.

If I am 200 points lower than someone else I have an idea that I will lose approximately 75% of the games.
and if you are 2 UTR points lower you are going to lose ~100% of matches. Satisfied?

Tennis ratings are a disaster. I was a utr 4.4? yesterday today I am a 6.03! great! except I didn't actually play any games to change that rating. Is that any sort of accomplishment?
For the nth time: how many actual matches have you played against opponents with a verified ranking? Go and look up @GSG's UTR profile, or Mark Sansait's. Both have pretty extensive play history; their ranking info is solid, and it fluctuates a bit but within the expected range. Those rankings make sense _because they actually play matches_.

UTR is like a magic 8 ball. UTR does not keep track of your highest rating it often seems like a number pulled out of someone's rear end.
Of course it does. Your profile shows your ranking history, and you can easily find the highest level you ever had.

No one can tell you what a UTR or USTA difference of .5 means. No one has any clue what that difference means. I might be a 3.5 that can play evenly with a 4.0 or I might be a 3.5 that would typically lose 6-0 6-0 to a higher 3.5. Tennis ratings are a joke compared to chess ratings.

The question is not how tennis ratings are different than chess ratings. The question is how is it even possible that people constantly devise tennis ratings in ways that make them unmeaningful in all the ways chess ratings are meaningful. It is only explained by conscious effort to make them meaningless and indeed that is what USTA pretty much admits. They say they are afraid people will focus on the number too much so they make it vague and practically meaningless.

Because it's sports! That's why they play the game - to see what the outcome is. Of course it is possible that on a given day, against a given player, you may play evenly with a higher rated player, and lose badly to a similarly ranked one. Ranking is not meaningless - it tells you with very, very high level of probability how you are going to do against a given player _on average_ . I believe @schmke published some data on that topic - and both UTR and NTRP are on average very accurate.

Same in chess. My profile on chess.com shows I was 2018 in February 2021. And then 1733 in April. And now I'm at 1937. Am I to believe that my actual level varied that wildly? - Of course not, these are just data points from particular dates. It is safe to assume I'm about 1900 Elo. Do I (sometimes) lose to players 200 points lower - yes. Do I (sometimes) beat players 200 points higher - yes. Do I (sometimes) lose badly to other ~1900 players - yes. Do I (sometimes) play even games with players 100 points higher - yes. _There's absolutely no difference between chess players and chess Elo and tennis players and UTR (or NTRP)._

So no, this is a terrible summary of chess ratings vs tennis. Just play matches and your rating will work out fine.
 

time_fly

Hall of Fame
I think UTR needs to fix its pricing scheme more than its algorithm. Outside of college coaching and recruiting I don’t see how it offers nearly the value they try to charge for premium access.
 

travlerajm

Talk Tennis Guru
UTR news of the day:

My 3.5C mixed partner’s UTR from my last match, where we crushed a 4.0C male / 4.5C female team, has just risen higher than the 4.5C female. Meanwhile, my UTR dropped down since yesterday.

The 3.5 is nearly 60 years old and has never been higher than 3.5 in her life.
 

Moon Shooter

Hall of Fame
JMNK

We both agree that a big problem for both UTR and USTA is they don't consider enough data. But you overlook the fact that both UTR and USTA exclude huge amounts of data. It is a choice they are making.

I have a verified doubles utr rating based on 5 matches. It is not provisional. Yesterday my rating was 6.03 today it is 4.43. So I am a fool if I play someone today. I should wait until it hits 6.03 again and then I will have a much better shot.

Do my opponents have many games counting toward their UTR? I don't think so. Why? Because UTR refuses to consider games played over 12 months ago or even consider that doubles skill may have some correlation to singles skill. So again they are to blame for ignoring the data.

I played 5 USTA matches and not one of them counts toward my USTA rating. They are ignoring the data. So they are to blame for the inaccuracies.

Internet chess ratings are pretty good. But you should get a rating based on over the board play. Because over the internet not only can people cheat with computers but they also might play on someone else's account. On more than a few occasions I was beating a much higher player like a drum - and suspected a kid was playing on his dad's account or vice versa. People can also be distracted while playing, play while drunk, and have technical issues. All of this can lead to you having a good or bad run and having rating fluctuations.

In chess many players don't play rated tournaments unless they are pretty good at chess. Like the top 5% of people who know the rules and can play the game. I invite anyone on these forums to go ahead and play some blitz games on chess.com and see how you do. Even more than tennis chess is something that is easier to learn the younger you start. So learning as an adult and reaching a 1600 or 1800 over the board rating is an achievement.


"Ranking is not meaningless - it tells you with very, very high level of probability how you are going to do against a given player _on average_ ."

No it doesn't. NTRP generally separates people so the average 3.5 will lose to the average 4.0. But if you yourself achieve a 3.5 rating that means you are basically somewhere between the top 40% and the top 75% of players. It is also likely that if about half the players in 3.5 played in 4.0 they would remain 4.0 players. And more than half of the 3.0 players could play in 3.5 and would keep the 3.5 rating.


When we combine the 3.0, 3.5, and 4.0 men we are talking about 75% of all players. And these ratings are just a blur when it comes to separating them.
 

S&V-not_dead_yet

Talk Tennis Guru
my understanding is a bit different. I thought the way it worked/works is that no matter what your _current_ ranking might be you can never enter a tournament for a level that is lower than your all-time highest level. So if let's say you achieved 1850 Elo in like 2018, and now you are 1775, you still cannot enter a tournament for under-1800.

30+ years ago, your rating would decline over time; it was built-in to the algorithm. I'm not aware of any rule which required you to use the high watermark rating for tournament entry. Every tournament I've played used my current rating.

These days, ratings don't decline over time.

well, of course your level drops with age and/or inactivity but that drop off is nowhere near close to what happens in tennis.

On average, I'd agree. However, I can think of exceptions where the player kept physically sharp playing other sports and so when he returned, it was like riding a bike, whereas his mental abilities took a larger hit.
 

S&V-not_dead_yet

Talk Tennis Guru
Who knows? I achieved a rating of 6.03. What does that mean? They say if I am 2 points higher then I am so much better than someone else that they won't even rate the match unless I lose. Well I was 4.4 yesterday. So am I so much better today that the player I was yesterday could not compete? What does a .5 difference in rating mean in UTR? What does it mean in USTA?

If you don't know what that means then the rating is meaningless. In chess people know what the ratings mean. Therefore they are meaningful.

Yes, that's a huge change over a short time. As more matches are recorded, the jumps will decrease. You're still in the "settling out" phase.

You want close to 100% accuracy, 100% of the time, irrespective of sample size. That's not realistic.

The sample size is small because the rating groups choose to limit the games counted. UTR cuts off at 12 months. USTA won't even rate many of their own matches let alone matches outside of USTA. That means the sample sizes are small by design and therefore the ratings are crap. And because the ratings are crap people don't bother posting games to be rated. Probably 70% of my matches are not rated.

So record your matches in myutr.com [with your opponent's agreement to do likewise]: there is a section for non-sanctioned matches. That solves the sample size problem.

The ratings work for me as I get competitive matches.
 

S&V-not_dead_yet

Talk Tennis Guru
if I ever find a person that is impressed by someone getting 1400, 1600, or 1800 Elo chess ranking then that would be the first such person I have encountered.

Well, it would be impressive to a 1200. :)

Same in chess. My profile on chess.com shows I was 2018 in February 2021.

Hey, congrats on the Expert rank. I never broke through although there's still hope.
 

S&V-not_dead_yet

Talk Tennis Guru
I think UTR needs to fix its pricing scheme more than its algorithm. Outside of college coaching and recruiting I don’t see how it offers nearly the value they try to charge for premium access.

The target audience [outside juniors and college] is people like MoonShooter who place a great deal of emphasis on rating. Many don't even care enough to check the site let alone pay for premium access.

The only scenario where I'd consider paying is if I was close to getting bumped in NTRP but then I'd probably get a report from @schmke which deals with NTRP rather than getting a UTR report and trying to extrapolate.
 

S&V-not_dead_yet

Talk Tennis Guru
@Moon Shooter,

Your emphasis on rating as a measure of your skill is like someone trying to lose weight who weighs himself every 5 minutes and is depressed when he sees the # rise by a few ounces, not realizing that weight fluctuation during the day is normal. You believe that only a linear progression, even at small time intervals, of lower #s proves you're losing weight. I think most would agree that's the exact wrong approach because you're overwhelmed by randomness. Weigh yourself fewer times and trust the process of increased calorie burning and better diet, etc. That takes the randomness out of the equation.
 

Moon Shooter

Hall of Fame
@Moon Shooter,

Your emphasis on rating as a measure of your skill is like someone trying to lose weight who weighs himself every 5 minutes and is depressed when he sees the # rise by a few ounces, not realizing that weight fluctuation during the day is normal. You believe that only a linear progression, even at small time intervals, of lower #s proves you're losing weight. I think most would agree that's the exact wrong approach because you're overwhelmed by randomness. Weigh yourself fewer times and trust the process of increased calorie burning and better diet, etc. That takes the randomness out of the equation.

It's more like looking at a scale and having it tell you that you weigh somewhere between 180 and 440 pounds.
 

Moon Shooter

Hall of Fame
Yes, that's a huge change over a short time. As more matches are recorded, the jumps will decrease. You're still in the "settling out" phase.

You want close to 100% accuracy, 100% of the time, irrespective of sample size. That's not realistic.

No, I want them to stop ignoring data, so the sample size does not remain too small.


So record your matches in myutr.com [with your opponent's agreement to do likewise]: there is a section for non-sanctioned matches. That solves the sample size problem.

The ratings work for me as I get competitive matches.


No it doesn't solve the problem because they still cut off the data at 12 months. They also refuse to consider singles ratings at all for doubles ratings and vice versa - even when they have very little data to work with. This means most adults will never have an accurate number. And because most adults will never have an accurate number that is why most adults don't care about the number. So most adults have no interest in posting games there. See how this bad decision leads to a snowball effect of problems?
 

TennisOTM

Professional
The target audience [outside juniors and college] is people like MoonShooter who place a great deal of emphasis on rating. Many don't even care enough to check the site let alone pay for premium access.

The target audience also includes people interested in playing UTR events. Premium access also gets you a discount on registration. In my area at least, there are starting to be enough adult tournaments, flex leagues, etc. that the yearly premium fee could end up paying for itself.
 

S&V-not_dead_yet

Talk Tennis Guru
The target audience also includes people interested in playing UTR events. Premium access also gets you a discount on registration. In my area at least, there are starting to be enough adult tournaments, flex leagues, etc. that the yearly premium fee could end up paying for itself.

Aah, I didn't know that. Thanks for the info!
 

S&V-not_dead_yet

Talk Tennis Guru
It's more like looking at a scale and having it tell you that you weigh somewhere between 180 and 440 pounds.

So if you can't get a more accurate scale, stop weighing yourself so often. The scale isn't going to change [at least in the short-term]. The only thing that can change is how you use it.
 

S&V-not_dead_yet

Talk Tennis Guru
No it doesn't solve the problem because they still cut off the data at 12 months.

You'd have to talk to the designers who made that decision. I assume it has something to do with the relevancy of the older data, which makes sense if the target audience is juniors who A) play a ton of matches; and B) improve rapidly.

The designers want UTR to be universal but these assumptions obviously don't hold so well for the typical adult player.

They also refuse to consider singles ratings at all for doubles ratings and vice versa - even when they have very little data to work with.

Again, a design decision with certain tradeoffs. I can see the pros and cons.

This means most adults will never have an accurate number.

Why not? Why would including both improve accuracy? Yes, it increases the sample size but you're potentially mixing apples and oranges.

And because most adults will never have an accurate number that is why most adults don't care about the number. So most adults have no interest in posting games there. See how this bad decision leads to a snowball effect of problems?

No, I don't see. I have a reasonably accurate UTR and I'd be fine using it to enter a tournament. I have no interest posting non-sanctioned matches because I just don't care enough about my UTR. You obviously care a lot more so you should post them. Not an ideal solution but at least it gives you more granularity.
 

Moon Shooter

Hall of Fame
Well, it would be impressive to a 1200. :)

A 1200 player is about average of USCF ratings.


If you consider youth ratings it is much higher. It is true that youth players will include many kids that have not fully developed the mental capacity to play chess. But many have and the importance of indicating the ratings of kids demonstrates that when you are talking about adults that get a rating you are already likely dealing with people that were in the upper 90% of ability. Chess is a game that most people acknowledge is easier to learn as a child than as an adult.
 

Moon Shooter

Hall of Fame
So if you can't get a more accurate scale, stop weighing yourself so often. The scale isn't going to change [at least in the short-term]. The only thing that can change is how you use it.

I'm proposing they make the scale more accurate. What is wrong with that? Checking my rating less frequently will not make it more accurate.
 

Moon Shooter

Hall of Fame
You'd have to talk to the designers who made that decision. I assume it has something to do with the relevancy of the older data, which makes sense if the target audience is juniors who A) play a ton of matches; and B) improve rapidly.

The designers want UTR to be universal but these assumptions obviously don't hold so well for the typical adult player.

Right, not including any games older than 12 months does not work well for the typical adult player. That is my point. It is a design decision; it is not something that is inevitable for any rating system. They seem to have made the decision to use the same algo for adults as kids due to marketing even though it clearly does not work so well for adults.


Again, a design decision with certain tradeoffs. I can see the pros and cons.



Why not? Why would including both improve accuracy? Yes, it increases the sample size but you're potentially mixing apples and oranges.

The rating works well if it predicts future results better. If you have many singles and doubles matches then it makes sense to separate them out. If you have very few games of singles or doubles it does not make sense to completely ignore the other data. Let's say you are playing doubles with someone with a doubles rating of 3.2. But you see they only have 2 doubles matches. But you also see they have 20 singles matches and a singles rating of 8.36. Are you going to completely ignore the 8.36 when you consider how good they may be at doubles?

Likewise when UTR has people that may have played say 4 doubles matches and 3 singles matches it doesn't make sense for them to not at least consider all the data to some extent. Of course when the number of matches in a category gets larger then the importance of the other category can drop off. But until that happens it is silly to think only one category is useful to helping you determine strength of play.
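To make the idea concrete, here is a rough sketch of one way such a blend could work. This is purely hypothetical and is not how UTR does anything: the function name and the `cross_discount` knob (how much one singles match counts toward a doubles estimate) are made up for illustration, and the numbers are the ones from the example above.

```python
def blended_doubles_estimate(doubles_rating: float, doubles_matches: int,
                             singles_rating: float, singles_matches: int,
                             cross_discount: float = 0.25) -> float:
    """Match-count-weighted average of doubles and singles ratings.

    Each singles match counts as `cross_discount` of a doubles match
    (an assumed value), so singles data matters a lot when there are few
    doubles matches and fades as doubles matches pile up.
    """
    w_doubles = float(doubles_matches)
    w_singles = cross_discount * singles_matches
    if w_doubles + w_singles == 0:
        raise ValueError("no matches to estimate from")
    return (w_doubles * doubles_rating + w_singles * singles_rating) / (w_doubles + w_singles)


# 2 doubles matches at 3.2, 20 singles matches at 8.36: singles data dominates.
print(f"{blended_doubles_estimate(3.2, 2, 8.36, 20):.2f}")   # 6.89
# Same player after 40 doubles matches: the doubles rating now dominates.
print(f"{blended_doubles_estimate(3.2, 40, 8.36, 20):.2f}")  # 3.77
```

Whether something like this would actually predict results better is an empirical question; the sketch only shows that "use both, and let the other category drop off" is easy to express.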


No, I don't see. I have a reasonably accurate UTR and I'd be fine using it to enter a tournament. I have no interest posting non-sanctioned matches because I just don't care enough about my UTR. You obviously care a lot more so you should post them. Not an ideal solution but at least it gives you more granularity.

Asking someone I play to include their match won't do much unless they are also going to post other matches they play. And UTR purposely ignores data that would make their ratings more accurate so most people are uninterested in bothering.

You see you think everyone is just uninterested in their rating because only rare people like me would ever be interested in an objective measure of their tennis skill compared to the rest of the country or world. But I think the main reason people are uninterested in these ratings is because they are inaccurate, and they are inaccurate because they intentionally ignore relevant data.
 

S&V-not_dead_yet

Talk Tennis Guru
I'm proposing they make the scale more accurate. What is wrong with that? Checking my rating less frequently will not make it more accurate.

Checking your rating less frequently will not make the instantaneous reading more accurate. But it will shield you from those fluctuations which drive you crazy. It's just a matter of choosing how to use the tool.

As to making the scale more accurate, maybe they have no motivation. Maybe by changing it to suit your concerns will lessen its relevancy to its target audience.
 

S&V-not_dead_yet

Talk Tennis Guru
Right, not including any games older than 12 months does not work well for the typical adult player. That is my point. It is a design decision; it is not something that is inevitable for any rating system. They seem to have made the decision to use the same algo for adults as kids due to marketing even though it clearly does not work so well for adults.

Hence the "Universal" in UTR.

The rating works well if it predicts future results better. If you have many singles and doubles matches then it makes sense to separate them out. If you have very few games of singles or doubles it does not make sense to completely ignore the other data. Let's say you are playing doubles with someone with a doubles rating of 3.2. But you see they only have 2 doubles matches. But you also see they have 20 singles matches and a singles rating of 8.36. Are you going to completely ignore the 8.36 when you consider how good they may be at doubles?

I guess it depends on how much work that change will be vs what payoff I expect to receive from it and what potential negative effects it could have on the target audience. If it's to make rec players happy, that isn't going to generate nearly as much revenue as the target audience of juniors/collegiates/coaches.

Likewise when UTR has people that may have played say 4 doubles matches and 3 singles matches it doesn't make sense for them to not at least consider all the data to some extent. Of course when the number of matches in a category gets larger then the importance of the other category can drop off. But until that happens it is silly to think only one category is useful to helping you determine strength of play.

I see the logic in that. But the bottom line is that your proposed changes aren't going to be implemented any time soon so do you change your behavior to alter the outcome?

Asking someone I play to include their match won't do much unless they are also going to post other matches they play. And UTR purposely ignores data that would make their ratings more accurate so most people are uninterested in bothering.

You see you think everyone is just uninterested in their rating because only rare people like me would ever be interested in an objective measure of their tennis skill compared to the rest of the country or world. But I think the main reason people are uninterested in these ratings is because they are inaccurate, and they are inaccurate because they intentionally ignore relevant data.

I can only go by what I observe: I don't see people obsessing over their UTR and I've never heard someone comment that they would be interested if only the ratings were more accurate. People use UTR, tennisrecord, tennislink, etc as a window into their results. Most don't demand a high level of accuracy. I've seen a high degree of correlation among all of the sources I've checked and that's good enough for me.
 

Moon Shooter

Hall of Fame
Hence the "Universal" in UTR.

The universal rating does not need to mean it uses the same k factor for older games for young people as it does for adults. It can just mean that everyone gets a rating with the same algorithm and that algorithm can account for age if accounting for age makes it more accurate.
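As a sketch of what "same algorithm, but accounting for age" could look like: instead of a hard 12-month cutoff, weight each result by how old it is, with a half-life that is a parameter of the one shared algorithm. Everything here is an assumption for illustration (the function names, the half-life values, and the per-match "performance rating" inputs); it is not UTR's method.

```python
def recency_weight(months_ago: float, half_life_months: float) -> float:
    """Weight that halves every `half_life_months` instead of cutting off."""
    return 0.5 ** (months_ago / half_life_months)


def weighted_rating(results, half_life_months: float) -> float:
    """Recency-weighted average of (months_ago, performance_rating) pairs."""
    if not results:
        raise ValueError("no results to rate")
    weights = [recency_weight(m, half_life_months) for m, _ in results]
    total = sum(weights)
    return sum(w * r for w, (_, r) in zip(weights, results)) / total


# One recent result and one 18-month-old result (made-up numbers).
results = [(1, 4.4), (18, 6.0)]

# Hypothetical half-lives: short for fast-improving juniors, long for adults.
junior = weighted_rating(results, half_life_months=6)
adult = weighted_rating(results, half_life_months=24)
print(f"junior={junior:.2f} adult={adult:.2f}")
```

With the short half-life the old result barely counts; with the long one it still pulls the number noticeably. Same formula for everyone, just a different parameter.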

I guess it depends on how much work that change will be vs what payoff I expect to receive from it and what potential negative effects it could have on the target audience. If it's to make rec players happy, that isn't going to generate nearly as much revenue as the target audience of juniors/collegiates/coaches.

It would only make the algorithm better if it predicts the outcomes better. I am not suggesting they do this if it leads to worse predictions.


I see the logic in that. But the bottom line is that your proposed changes aren't going to be implemented any time soon so do you change your behavior to alter the outcome?

I don't know what you are saying. How do you know what changes may or may not be implemented?

I can only go by what I observe: I don't see people obsessing over their UTR and I've never heard someone comment that they would be interested if only the ratings were more accurate. People use UTR, tennisrecord, tennislink, etc as a window into their results. Most don't demand a high level of accuracy. I've seen a high degree of correlation among all of the sources I've checked and that's good enough for me.

You don't see several threads and discussions like this one, where people are saying how the tennis ratings systems fail in many ways?

I don't think it is correct to say people are "obsessing" over their UTR or NTRP just because they mention problems.

You seem to live in an area where there are lots of USTA and UTR match opportunities. I am not sure you can speak for "most" people. I don't claim to speak for most people either. But your smug comments about how well it works for you in your part of the country, and how therefore everyone else must be just "obsessing" and should change their behavior, are condescending.
 

S&V-not_dead_yet

Talk Tennis Guru
The universal rating does not need to mean it uses the same k factor for older games for young people as it does for adults. It can just mean that everyone gets a rating with the same algorithm and that algorithm can account for age if accounting for age makes it more accurate.

If it's not the same algorithm, they'd have a hard time claiming it was universal. But there isn't much mixing among the groups so who knows how accurate it is in predicting outcomes?

I don't know what you are saying. How do you know what changes may or may not be implemented?

I don't. I'm making a guess. If the prime target group to benefit from the changes you mentioned are adult rec players who play infrequently, I don't think UTR is going to make any changes.

You don't see several threads and discussions like this one, where people are saying how the tennis ratings systems fail in many ways?

Not to the point where UTR would take notice. I'm not arguing that the system is perfect but it works for a large enough population, particularly their target audiences.

I don't think it is correct to say people are "obsessing" over their UTR or NTRP just because they mention problems.

You go well beyond mentioning problems. You spend a lot of time thinking about the issues. You've also stated in the past that UTR is very important to you because it allows you to gauge your progress. I won't use the "O" word then; it doesn't change how things are nor how they are likely to turn out vis a vis UTR's algorithm.

You seem to live in an area where there are lots of USTA and UTR match opportunities. I am not sure you can speak for "most" people. I don't claim to speak for most people either. But your smug comments about how well it works for you in your part of the country, and how therefore everyone else must be just "obsessing" and should change their behavior, are condescending.

It does work well for me but I recognize I live in a tennis hotspot. So I take advantage of it. If I lived in a coldspot, I'd change my behavior to get the outcomes I wanted. I would not spend a lot of time thinking about the algorithm but rather how I could harness it to my benefit or at least to get more benefit than I'm currently getting.

Part of my comments' motivation was to counter your assertion that ratings are crap. Let's agree that they work well for me and not so well for you. The question then becomes will you change your behavior to reflect that?
 
In chess if you achieve a 1400, 1600, 1800, 2000, 2200, or 2400 rating, that is an accomplishment proving you acquired a certain level of competence at the game that you have for the rest of your life. That competence will be recognized by people decades later.

I think the equivalent of this for tennis is ATP/WTA rankings. If you reach a certain ranking (really, any ranking), that's the kind of thing you can brag about for life.
 