Share your WTN - Crowd sourcing the NTRP to WTN mapping

travlerajm

Talk Tennis Guru
Here are some potential off the cuff qualities that people might find valuable in a rating system:

1) Transparency - is it clear how your matches/games count - this leads to the rating gaining legitimacy in the community.
2) Does it apply to all tennis players, or just to certain groups such as age groups, gender groups, nationalities, etc.? If it only applies to small groups, it is not so relevant.
3) Is it accurate. You would test this by seeing if it accurately predicts outcomes.
3a) how many matches are required for it to provide decent accuracy?
4) Does it provide a goal or motivation for players? For example, I think UTR fails in this regard because every day is a completely new day and it doesn't seem to track your highest ratings. USTA and WTN might be better at this.
5) Does it allow people to set up tennis events where players feel like they are playing with and against appropriately skilled players.
6) Is it easy to record match results and participate in the rating system. UTR seems to be one of the best for this, but even UTR could improve.
Some other things:

7. does it provide match ratings? To my knowledge TR does this, but WTN and UTR don’t. This is my favorite thing about TR, because it allows me to compare my level on different days. Maybe I am comparing two racquet setups, and if I play two matches with each, TR will spit out objective opponent-adjusted numbers that let me compare how I did with each setup.

8. does it provide pre-match ratings? TR and WTN both do this. UTR does not. This is really important for accuracy because it allows you to have a snapshot of your level at each point in time. It should not be assumed that everyone stays at the same constant level over time. Some are improving; some are declining.

9. does it provide pre-match opponent ratings? TR and WTN both do this. UTR does not. There is no way to tell on UTR how difficult your past opponent was, especially if more than a year ago, because UTR just assigns past opponents with a current UTR.

10. Is the rating stable? TR and WTN appear to be stable, because the rating does not change when past opponents have new results. This is not true with UTR.

11. Does the rating include all available data? WTN seems to be superior to both TR and UTR in this respect. A big hole in TR ratings is that it draws a blank for players who have only played either same-gender or mixed, then switch over. This makes opponent ratings less reliable and trickles through the system. UTR has the aforementioned problem that it ignores older data beyond a year, compromising the accuracy of the entire player network by limiting rated players to a minority of the total player pool.

12. does it provide separate ratings for doubles and singles? UTR and WTN both do this. TR does not.

14. does it provide separate ratings for mixed and same-gender? TR does this. WTN and UTR do not.

15. Does it update regularly? TR, WTN, and UTR all do this. TLS does not, disqualifying it from further analysis.

16. can it account well for recent upticks or downticks in results? TR and WTN both can do this. UTR appears to be stickier and unable to detect recent trends.
 

Moon Shooter

Hall of Fame
oh, ok, I may have indeed used the wrong term. I'm asking if your opponents and partners (for doubles) have 'reliable' UTR ranking.


I don't know or care. I think I tend to have played slightly more matches in the past 12 months than many of my opponents or partners. That is the problem with UTR in my area. Not everyone plays that many matches every year. I understand that someone can possibly improve or decline considerably in 12 months. But if I know someone was a 9.24 12 months ago (or a 4.89 in TR) that clearly gives me some valuable information about what I might be facing. UTR says no. As soon as the 12-month cliff hits, the algorithm says they don't know if you are playing a 1.00 or a 16.00. It just doesn't work well for my area.

WTN and USTA/TR go back a few years and at least consider a person's rating from back then. They weight recent results more heavily (which I agree with), but they don't pretend that a match 12 months ago gives us absolutely no information about a player's ability.
 

Moon Shooter

Hall of Fame
Some other things:

7. does it provide match ratings? To my knowledge TR does this, but WTN and UTR don’t. This is my favorite thing about TR, because it allows me to compare my level on different days. Maybe I am comparing two racquet setups, and if I play two matches with each, TR will spit out objective opponent-adjusted numbers that let me compare how I did with each setup.

8. does it provide pre-match ratings? TR and WTN both do this. UTR does not. This is really important for accuracy because it allows you to have a snapshot of your level at each point in time. It should not be assumed that everyone stays at the same constant level over time. Some are improving; some are declining.

9. does it provide pre-match opponent ratings? TR and WTN both do this. UTR does not. There is no way to tell on UTR how difficult your past opponent was, especially if more than a year ago, because UTR just assigns past opponents with a current UTR.

10. Is the rating stable? TR and WTN appear to be stable, because the rating does not change when past opponents have new results. This is not true with UTR.

11. Does the rating include all available data? WTN seems to be superior to both TR and UTR in this respect. A big hole in TR ratings is that it draws a blank for players who have only played either same-gender or mixed, then switch over. This makes opponent ratings less reliable and trickles through the system. UTR has the aforementioned problem that it ignores older data beyond a year, compromising the accuracy of the entire player network by limiting rated players to a minority of the total player pool.

12. does it provide separate ratings for doubles and singles? UTR and WTN both do this. TR does not.

14. does it provide separate ratings for mixed and same-gender? TR does this. WTN and UTR do not.

15. Does it update regularly? TR, WTN, and UTR all do this. TLS does not, disqualifying it from further analysis.

16. can it account well for recent upticks or downticks in results? TR and WTN both can do this. UTR appears to be stickier and unable to detect recent trends.

Good points. This is my own view on the points you raise.

7, 8, and 9 are part of what I consider transparency/legitimacy (I list it as 1). Ideally we should be able to calculate (or at least come relatively close to) our performance rating in a match and also calculate our new dynamic rating. When people understand how the ratings work, they understand that it is not just who won the match that matters, but also what the score was and what your partner's and opponents' ratings were.

10 is part of what I had in mind with goal setting (#4).

In 11 you mention reasons why the accuracy of many existing ratings suffers (my list #3). I think reasonable people can disagree about how far back we should go, whether doubles should be completely separate from singles, etc. But the way you test these theories is to see how well they predict results.

12) seems to be a good thing overall. But completely separating them all the time, even when you have limited data, as WTN and UTR do, may not be the best option. I would suggest a more blended approach until you get enough singles and doubles matches to have a solid rating for each independently. But on the whole I think this can be a plus.

14) I would say this ties in with 2 as well as accuracy 3 on my list. I think it is unlikely that a rating system that applied to everyone and allowed people to post results including coed matches would benefit from separating out mixed matches from other doubles matches. But I would be willing to test it and see.

I largely agree with 15 but some might say that conflicts with 5 on my list. If I say all people 5.0 UTR and under can play in this tournament what day do we choose? I think this can be dealt with by saying your rating on X date or whatever. But there are some advantages to having a slower update.

16) I would say this is just a matter of accuracy. I don't want the rating system to overreact or underreact to recent results. The way I would tell is whether the ratings the system delivers accurately predict outcomes.
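
For what it's worth, here is roughly how that accuracy test could be sketched. This is only a minimal illustration, assuming you have each player's pre-match rating and the match winner; the function name, the tuple layout, and the sample results are all made up, and none of the rating systems discussed here publish a test like this.

```python
def prediction_accuracy(matches, lower_is_better=False):
    """Fraction of matches won by the player the rating favored.

    matches: list of (rating_a, rating_b, a_won) tuples using pre-match ratings.
    Matches where both ratings are equal are skipped (no favorite to check).
    Set lower_is_better=True for a scale like WTN, where lower is stronger.
    """
    correct = 0
    total = 0
    for rating_a, rating_b, a_won in matches:
        if rating_a == rating_b:
            continue
        if lower_is_better:
            a_favored = rating_a < rating_b
        else:
            a_favored = rating_a > rating_b
        total += 1
        if a_favored == a_won:
            correct += 1
    return correct / total if total else None


# Made-up results on an NTRP-style scale (higher is stronger).
sample = [
    (3.6, 3.4, True),   # favorite won
    (3.5, 3.7, True),   # underdog won
    (3.9, 3.3, True),   # favorite won
    (3.4, 3.4, False),  # equal ratings, skipped
]
accuracy = prediction_accuracy(sample)

# Made-up result on a WTN-style scale (lower is stronger).
wtn_accuracy = prediction_accuracy([(19.5, 21.0, True)], lower_is_better=True)
```

Running the same match history through two different rating systems and comparing the two fractions would be one simple way to settle arguments like how far back to go or whether to separate mixed from same-gender doubles.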
 

TennisOTM

Professional
Some other things:

10. Is the rating stable? TR and WTN appear to be stable, because the rating does not change when past opponents have new results. This is not true with UTR.

Stability is nice, but the risk of emphasizing this criterion too much is that you'll stabilize on a number that's wrong and ignore subsequent information that could improve it. If your opponent's rating was unreliable when you played them, then their rating changed as they added more data in subsequent days / weeks, doesn't it make sense to update your score for that match using the new information about your opponent's level that was not available at the time?

USTA and also TR do this when they do their year-end rating calculations, which can come out quite differently than what your final dynamic rating was. The difference with UTR is that they don't have a separate year-end calculation moment - they are essentially doing the equivalent of USTA/TR's year-end calculation every single night, so it's not exactly a fair comparison.
 

jmnk

Hall of Fame
Stability is nice, but the risk of emphasizing this criterion too much is that you'll stabilize on a number that's wrong and ignore subsequent information that could improve it. If your opponent's rating was unreliable when you played them, then their rating changed as they added more data in subsequent days / weeks, doesn't it make sense to update your score for that match using the new information about your opponent's level that was not available at the time?

USTA and also TR do this when they do their year-end rating calculations, which can come out quite differently than what your final dynamic rating was. The difference with UTR is that they don't have a separate year-end calculation moment - they are essentially doing the equivalent of USTA/TR's year-end calculation every single night, so it's not exactly a fair comparison.
ahh, you and your logic - who needs it ;)
 

travlerajm

Talk Tennis Guru
TR has been revealing a major wart this year. I’ve played 5 matches, and 4 of those matches show up NC. 80% of my matches somehow leaking through the algorithm’s grasp and not getting calculated match ratings. All of the players for all of the matches have been playing recent matches (not self rate issue), but TR is ignoring these due to gaps in the algo.
 

Moon Shooter

Hall of Fame
TR has been revealing a major wart this year. I’ve played 5 matches, and 4 of those matches show up NC. 80% of my matches somehow leaking through the algorithm’s grasp and not getting calculated match ratings. All of the players for all of the matches have been playing recent matches (not self rate issue), but TR is ignoring these due to gaps in the algo.


They will calculate it in a bit.

The odd thing with their algo, from my perspective, is why I would get a performance rating in a match that is higher than my rating going into the match yet it causes my rating to go down?

WTN doesn't post performance ratings for each match like TR, but it seems to have a similar issue. I saw someone's rating get better when, based on the score and the ratings of the person's partner and opponents, there is no question his rating should have gotten worse.

I am not sure why that would happen in either rating system.
 

travlerajm

Talk Tennis Guru
They will calculate it in a bit.

The odd thing with their algo, from my perspective, is why I would get a performance rating in a match that is higher than my rating going into the match yet it causes my rating to go down?

WTN doesn't post performance ratings for each match like TR, but it seems to have a similar issue. I saw someone's rating get better when, based on the score and the ratings of the person's partner and opponents, there is no question his rating should have gotten worse.

I am not sure why that would happen in either rating system.
I think the case where match rating is higher than pre-match rating, but post-match rating goes down, is explained by having the average rating based on N number of recent matches.

When you post a new match, the match from N+1 matches ago drops out of the average. If the match that dropped off had a higher performance rating than the most recent match, then your average will go down, even if the newest match has a higher rating than the pre-match average.
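
A quick worked example of that effect. This is a minimal sketch that assumes the rating is a plain average of the last N match ratings; TR's real formula is not public, and the window size and match ratings below are made up.

```python
def rolling_rating(match_ratings, window):
    """Plain average of the most recent `window` match ratings (assumed rule)."""
    recent = match_ratings[-window:]
    return sum(recent) / len(recent)


# Made-up match ratings, oldest first, with an assumed window of N = 4.
history = [3.80, 3.50, 3.50, 3.50]
pre_match = rolling_rating(history, 4)

# The new match rating is above the pre-match average...
new_match = 3.65
post_match = rolling_rating(history + [new_match], 4)

# ...but the 3.80 match falls out of the window, so the average still drops.
print(f"pre-match {pre_match:.4f}, new match {new_match:.2f}, post-match {post_match:.4f}")
```

The new 3.65 beats the 3.575 pre-match average, yet the rating falls to about 3.54 because the match it pushed out of the window was a 3.80.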
 

Moon Shooter

Hall of Fame
I think the case where match rating is higher than pre-match rating, but post-match rating goes down, is explained by having the average rating based on N number of recent matches.

When you post a new match, the match from N+1 matches ago drops out of the average. If the match that dropped off had a higher performance rating than the most recent match, then your average will go down, even if the newest match has a higher rating than the pre-match average.

That's a possible explanation, but unless TR starts to drop matches after 6 matches, this doesn't explain my situation.

As for the WTN case, I suppose it is possible, but I tend to doubt it. The player's rating seems way out of whack in every match.
 

travlerajm

Talk Tennis Guru
That's a possible explanation, but unless TR starts to drop matches after 6 matches, this doesn't explain my situation.

As for the WTN case, I suppose it is possible, but I tend to doubt it. The player's rating seems way out of whack in every match.
I haven’t been able to fully reverse engineer TR’s averaging formula. But I’ve come close. It seems that they are taking x number of matches, where x might be variable depending on time window. Then they toss out low-rating outliers.
 

travlerajm

Talk Tennis Guru
That's a possible explanation, but unless TR starts to drop matches after 6 matches, this doesn't explain my situation.

As for the WTN case, I suppose it is possible, but I tend to doubt it. The player's rating seems way out of whack in every match.
I've noticed that WTN pre-match ratings for opponents seem to be adjusting and becoming more accurate over time.

This is in contrast to UTR, where the previous pre-match ratings are not snapshots.
 

S&V-not_dead_yet

Talk Tennis Guru
An adult who plays only USTA and NTRP tournaments with a singles utr of 8 (or 7 or 6) is probably better than a junior with the same ranking, at least in my experience. I agree with you about the doubles, however.

In my extremely limited experience as a 4.5 playing a UTR tournament, I beat a 7 [easily], beat an 8 [barely], and lost to a 9 [not a blowout; 3&3]. At the time, my UTR was in the upper 7s.
 

TennisOTM

Professional
WTN has now been linked to USTA results for 3 months. Does anyone think it has become more aligned with NTRP over that time? From what I'm seeing, there are still quite a lot of players with nonsensically low or high WTN for their NTRP level, especially for doubles. Even for players that are producing lots of new match result data that should be causing their WTN to shift to something more realistic, the numbers are barely moving.

I wonder if USTA / WTN have any idea that the algorithm is producing so many ridiculous estimates, and if they're trying to do something about it?
 

schmke

Legend
WTN has now been linked to USTA results for 3 months. Does anyone think it has become more aligned with NTRP over that time? From what I'm seeing, there are still quite a lot of players with nonsensically low or high WTN for their NTRP level, especially for doubles. Even for players that are producing lots of new match result data that should be causing their WTN to shift to something more realistic, the numbers are barely moving.

I wonder if USTA / WTN have any idea that the algorithm is producing so many ridiculous estimates, and if they're trying to do something about it?
I'll do another analysis at some point, but my guess right now is that it will still show wildly large ranges of WTNs for a given NTRP level.
 

Le Renard

New User
Went 4 weeks on vacation, came back and nothing has moved haha.

UTR: 3.5
USTA: 3.0876 (dynamic calculation from "Tennis Record")

WTN is ... 25.3
And 2 guys that I beat 6/2 6/3 and 6/3 6/1 have ratings of 25.6 and 25.1 weeks after our games in May 2022.... very odd!

I might be the real WTN underdog hahaha... joke aside the WTN is still all over the place in my humble opinion.
 
Also had issues logging in due to server issues on their end, but finally got in:

NTRP: 3.5C (female)
WTNs: 19.5 (GZ 17.7-21.2) - blue checkmark
WTNd: 20.9 - gray checkmark

UTR has me higher for doubles, but I agree with WTN that I'm a bit better at singles.

Since WTN was first published in early June, I've had some movement in my numbers:

WTNs: 17.9
WTNd: 19.1
During the same time TR's dynamic NTRP estimate has gone from 3.43 to 3.71, although they aren't including the tournament data or mixed for doubles. Interestingly, I don't think UTR has changed much - I don't have it noted anywhere but I think doubles was a hair over 4 and is down to 3.99, and singles has gone up from around 3.6 to 3.8. UTR, like WTN, does include tournament and mixed so it makes you wonder what they're doing differently with the same data.

As for the WTN ratings themselves, they seem to somewhat make sense compared to opponents considering the results for women's league and tournaments. However when you look at mixed you see some odd things - for example men who play only mixed have a WTNd too high (meaning unfavorable) and women who play mixed having lower (favorable) WTNd compared to similar women who only play women's league.
 

tpro2000

Rookie
Idk if you're still looking into this, but I just found mine:

NTRP: 4.5C
WTNs: N/A - don't play singles
WTNd: 14
UTR: 7.71 doubles
 

Moon Shooter

Hall of Fame
That seems pretty consistent with what I’m seeing. Upper 3.5 men about 22-24 lower to mid 3.5 men about 25-28. Women do about 3 wtn points better (lower) for the same ntrp level as 3.5 men. So upper 3.5 females are about 19-21 and lower to mid 3.5 females would be about 22-25.
This may seem off if you think a male at a given NTRP level is as strong as or stronger than a female at the same level. I'm not sure how that happened, since it seemed like USTA plugged in 29.9 for self-rated 3.0 males and 32.1 for self-rated 3.0 females.

if usta wtn allowed some matches between men and women we could see if this difference is accurate or not.
 

ChaelAZ

G.O.A.T.
Just peeked at the various guesstimates and such.

So Now:

TLS - 3.4 (TLS to UTR Conversion - 4.93, see below)
TR - 3.58 (up from 3.49, so maybe some SR matches calc'd)
USTA 2021 3.5C
WTNs 19.9
WTNd 21.5
UTRs 5.74
UTRd 5.36


Overall, pretty much on par with past experiences in each. And won't be playing leagues until Spring, so will see if any tourneys get thrown in the mix.
 

amd31321

New User
Does anybody know how to improve your WTN? I am curious just because I am slightly confused about how it works exactly. I'm a 5.0 who is currently a 13.2. I want to get into the single digits and have beaten a few opponents in the single digits. Is there a certain margin you have to win by? Are only matches in the Game Zone counted (so that a match between a UTR 7 and a UTR 10 wouldn't count)?

One other thing that really shocked me is a lot of top juniors in my area are rated really lowly for their skill level (some in the 20s even though they are between 10-11 UTR and 3-4 star recruits).
 

Moon Shooter

Hall of Fame
If you want to improve your rating, then play people that you think are not as good as their rating would suggest. Avoid people that have ratings that are worse than what you would expect.

edit: Partner with people who have a worse rating than you think they should have. Do not partner with people that have a rating better than what you think they should have.

I hate not being able to say “higher rated” or “rated lower” because WTN wanted to prove its uniqueness by making “higher” ratings for worse tennis players.
 
Anyone know how far back in history they scavenge to find your data?
From USTA site, “Results provided by the USTA from as far back as 2016…”
 

TennisOTM

Professional
It's amazing to me how WTN went from being so hyped up from USTA just a few months ago, to seemingly irrelevant now. Are there any USTA players (or anyone else) finding it useful??

I looked at my page today for the first time in awhile, and they don't even have a full record of my USTA league match results. How is it that UTR, completely unaffiliated with USTA, can pull every single one of my league matches just a day or two after they're played, while WTN, supposedly partnered with USTA, is missing several matches played weeks ago?

Not to mention the highly questionable WTN rating calculations documented on this thread, especially for doubles, which appear to have not improved at all. So disappointing.
 

LOBALOT

Legend
It's amazing to me how WTN went from being so hyped up from USTA just a few months ago, to seemingly irrelevant now. Are there any USTA players (or anyone else) finding it useful??

I looked at my page today for the first time in awhile, and they don't even have a full record of my USTA league match results. How is it that UTR, completely unaffiliated with USTA, can pull every single one of my league matches just a day or two after they're played, while WTN, supposedly partnered with USTA, is missing several matches played weeks ago?

Not to mention the highly questionable WTN rating calculations documented on this thread, especially for doubles, which appear to have not improved at all. So disappointing.

The USTA has worse ADHD than me!!!
 

schmke

Legend
It's amazing to me how WTN went from being so hyped up from USTA just a few months ago, to seemingly irrelevant now. Are there any USTA players (or anyone else) finding it useful??

I looked at my page today for the first time in awhile, and they don't even have a full record of my USTA league match results. How is it that UTR, completely unaffiliated with USTA, can pull every single one of my league matches just a day or two after they're played, while WTN, supposedly partnered with USTA, is missing several matches played weeks ago?

Not to mention the highly questionable WTN rating calculations documented on this thread, especially for doubles, which appear to have not improved at all. So disappointing.
I probably contributed to exposing its irrelevance by writing a fair amount about it and showing how it was all over the place when mapped to NTRP, but I only did that because they were hyping it so much with repeated e-mails promoting it and its publication. I haven't written on it for months so as not to pile on, but also because it didn't seem to be changing or adapting. It simply is what it is, and for right now it doesn't appear relevant for league players, and perhaps not even tournament players, at least adults. Perhaps it is better for juniors; I haven't analyzed that.
 

Chalkdust

Professional
Gosh I had already forgotten that it even exists.
About 5 months ago I was getting constantly bombarded with emails from USTA hyping up the *exciting news!* that WTN is about to be released.
This despite the fact that my USTA membership lapsed about 3 years ago.
When it came out I looked mine up purely out of curiosity...
And have not heard a peep about it from USTA or anyone else pretty much since then.
Hopefully it did not consume too much of y'all's membership dollars!
 

LOBALOT

Legend
I probably contributed to exposing its irrelevance by writing a fair amount about it and showing how it was all over the place when mapped to NTRP, but I only did that because they were hyping it so much with repeated e-mails promoting it and its publication. I haven't written on it for months so as not to pile on, but also because it didn't seem to be changing or adapting. It simply is what it is, and for right now it doesn't appear relevant for league players, and perhaps not even tournament players, at least adults. Perhaps it is better for juniors; I haven't analyzed that.

Right, you may remember I thought a high number was good (and I am not good), and then you pointed out that a high number was bad, which confirmed that I had some work to do on my game.
 

Moon Shooter

Hall of Fame
I probably contributed to exposing its irrelevance by writing a fair amount about it and showing how it was all over the place when mapped to NTRP, but I only did that because they were hyping it so much with repeated e-mails promoting it and its publication. I haven't written on it for months so as not to pile on, but also because it didn't seem to be changing or adapting. It simply is what it is, and for right now it doesn't appear relevant for league players, and perhaps not even tournament players, at least adults. Perhaps it is better for juniors; I haven't analyzed that.

WTN is much more relevant for me, since the people around me and I play more mixed USTA than same-gender USTA.

There is no question that WTN had some sort of issue with a few outliers. But other than that, it is a better indicator of strength for the typical player than NTRP, because NTRP ignores too much data. WTN is the only rating system that I am interested in.

Usta has not explained who can run wtn tournaments. It may be that they will severely limit who can run a tournament so only people that literally own a tennis facility can run a tournament. In that case wtn will have very limited value. But that is due to how usta is allowing the rating to be used. It is not a knock against the rating system itself.
 

Moon Shooter

Hall of Fame
It's amazing to me how WTN went from being so hyped up from USTA just a few months ago, to seemingly irrelevant now. Are there any USTA players (or anyone else) finding it useful??

I looked at my page today for the first time in awhile, and they don't even have a full record of my USTA league match results. How is it that UTR, completely unaffiliated with USTA, can pull every single one of my league matches just a day or two after they're played, while WTN, supposedly partnered with USTA, is missing several matches played weeks ago?

Not to mention the highly questionable WTN rating calculations documented on this thread, especially for doubles, which appear to have not improved at all. So disappointing.

This is not a problem with the rating system; it is a problem with USTA incompetence. USTA's own TennisLink does not even have all my USTA matches. If a five-year-old drives a Porsche into a house, don’t blame the engineers of the Porsche.
 

schmke

Legend
WTN is much more relevant for me, since the people around me and I play more mixed USTA than same-gender USTA.

There is no question that WTN had some sort of issue with a few outliers. But other than that, it is a better indicator of strength for the typical player than NTRP, because NTRP ignores too much data. WTN is the only rating system that I am interested in.

Usta has not explained who can run wtn tournaments. It may be that they will severely limit who can run a tournament so only people that literally own a tennis facility can run a tournament. In that case wtn will have very limited value. But that is due to how usta is allowing the rating to be used. It is not a knock against the rating system itself.
I don't think the question is who can "run" a tournament, anyone can. Perhaps you are really asking what tournament results the USTA will allow to be fed into the system, and that likely requires the tournament be sanctioned or affiliated in some way with the USTA or a member organization or facility.
 

schmke

Legend
This is not a problem with the rating system; it is a problem with USTA incompetence. USTA's own TennisLink does not even have all my USTA matches. If a five-year-old drives a Porsche into a house, don’t blame the engineers of the Porsche.
But for players where there were a lot of results available to the WTN algorithm, the range of WTNs for a given NTRP was simply way too large and included large numbers of clear misses, let alone not passing the gender-neutral test at all. Some of this is how the algorithm is being used, but some has to be the algorithm itself.
 

J011yroger

Talk Tennis Guru
But for players where there were a lot of results available to the WTN algorithm, the range of WTNs for a given NTRP was simply way too large and included large numbers of clear misses, let alone not passing the gender-neutral test at all. Some of this is how the algorithm is being used, but some has to be the algorithm itself.

I wonder what they thought would happen?

J
 

OnTheLine

Hall of Fame
I just looked for the first time in a while .... yeah, I don't think they know what they are doing.

We just completed our 2023 ESL 40+ leagues. Not one match is listed. League ran Sept 10 - Nov 12.

Not.One.Match.Listed

The men have a few matches listed for their 40+ leagues .... the women? None.

Mixed league: from the summer they had been listed with little numbers by them. Now, they are still listed and all listed as N/A

Pathetic.
Useless.
 

Moon Shooter

Hall of Fame
But for players where there were a lot of results available to the WTN algorithm, the range of WTNs for a given NTRP was simply way too large and included large numbers of clear misses, let alone not passing the gender-neutral test at all. Some of this is how the algorithm is being used, but some has to be the algorithm itself.

Yes maybe 1 out of 100 had some odd ratings that were hard to explain. And process of elimination may suggest the algorithm. It is stupid to have a floor and a cap in any rating system and that might explain some of it. But for 99% of players wtn is the best rating system we have.

edit: I mean for 99% of adult rec players - It seems a mess for pros and I don’t know about juniors or college players.
 

schmke

Legend
Yes maybe 1 out of 100 had some odd ratings that were hard to explain. And process of elimination may suggest the algorithm. It is stupid to have a floor and a cap in any rating system and that might explain some of it. But for 99% of players wtn is the best rating system we have.

edit: I mean for 99% of adult rec players - It seems a mess for pros and I don’t know about juniors or college players.
Ummm, 1 out of 100 had odd ratings? Did you even read my blog and what I wrote?

This is the chart that maps NTRP levels to WTN levels for male singles players with a high confidence WTN.

20220612-wtnsm70.png


Take the 3.5 level for example. It shows there are a noticeable number of 3.5s with WTNs as low/good as 14/15 and a significant number at 16, but also 3.5s with WTNs as high as 26/27 and a significant number at 24/25. This is a huge range and not 1 out of 100. If there were a direct mapping from NTRP to WTN you'd expect one NTRP level to map to about 4-5 WTN levels, and even if you say the algorithms are different and you wouldn't expect a direct mapping, a reasonable person would probably still expect one NTRP level to map to no more than 6-8 WTN levels, but here we have one NTRP level mapping to a full 14+ levels.

The WTN even says the Game Zone for players is generally +/-2 to 2.5 which is that 4-5 WTN level range, so WTN is saying that the good number of NTRP 3.5 players that are 16 or lower/better should not have competitive matches with the large number of NTRP 3.5 players that are at 21 or higher/worse WTNs.

Or looking at it another way, a WTN 20 has players that are NTRP 3.0 thru 4.5 and should have competitive matches. And if you use a Game Zone range of 2.5 the number of 3.0s and 4.5s that should be competitive grows. Again, this is not 1 in 100 that are odd.

I suppose you can choose to believe these WTNs are accurate, but that requires saying NTRP is utter garbage and completely inaccurate. With all its faults, I certainly don't believe that is true, and just looking at specific players you know reveals a much higher rate of "odd" matches than 1 in 100.
 

nyta2

Legend
low to middling ntrp4.5 (according to tennisrecord ~4.2) here...

my WTN:
singles: 11.0
singles(GZ): 9.1-12.8
doubles: 16.6

~14-17 seems about right?

for completeness:

my UTR
singles: 8.09
doubles: 7.53
update:

wtn:
singles: 11.6 (13.4-9.8)
doubles: 15.5

utr:
singles: 7.99 (no new matches since last update)
doubles: 8.45 (playing lots of usta dubs, mx,... went to nationals and played dubs)

side note, big advantage that utr has... is that they pull usta data, but usta doesn't pull utr data...
 

Moon Shooter

Hall of Fame
Yes, I read your blog. I think there are many problems with drawing the conclusion that WTN is off.

1) When you posted that, not all the data had been entered into the WTN number.

2) You seem to assume that the USTA system will accurately match someone's strength at singles, so if WTN singles differs from their USTA year-end rating then WTN must be to blame. There are numerous problems with this assumption:
A) I have found USTA singles ratings to be completely out of whack. I know 3.0 players who are good at singles and can reliably beat 4.0 doubles-focused players at singles. Many USTA players I know don't even play singles.

B) Are you only including players who earned a C rating in the prior year and played singles that year? If not, there are just too many ways the USTA rating will be way off. If a guy got a high 4.0 rating back in 2017 from playing singles, didn't play again for 4 years, entered as a 3.5 self-rate, and played a few mixed doubles matches, those doubles matches may not affect his self-rated 3.5 rating at all. So this person, who may have been close to a 4.5 singles player, will show up as a year-end 3.5 player based on a form he filled out. He may actually be better than he was in 2017! It's just that USTA doesn't rate all the matches. There are many other situations like this, and in between, where the USTA rating is way off for singles.

C) This isn't the first time the USTA rating system has been exposed as a mess. UTR also showed that USTA ratings have a huge spread. The levels in USTA are very sticky. Any time you have a system that can predict a 6-0 6-0 match within the exact same level, like USTA does, you are going to have a morass. NTRP may be pretty tight in certain areas with many USTA teams and matches. But for much of the USA the NTRP levels are just going to be very loose approximations of skill. That's why you will have teams blow out their flight and then get blown out at districts. And then the team that won districts will get blown out at state. And then the team that won state will get blown out at sectionals. And then the team that won sectionals will get blown out at nationals. All in the same USTA level. Does that graph still seem surprising to you?

I'm not saying UTR or WTN are perfect. UTR, like NTRP, ignores far too much data and, like NTRP, is often worthless. WTN does not have enough cross play between men and women. It also won't be able to go all the way up to pro levels because it has caps and floors. WTN also did have some strange ratings for a few people. But by and large WTN is pretty accurate for my area, and the USTA rating is pretty much irrelevant. In my area a 3.5 player could have a singles WTN anywhere from 16 to 29. If they don't have a 24 or lower, they probably won't play singles for a team at 3.5. But if you forced some of those 3.5 players to play singles, yes, you would see a range. If you go by the WTN you will get a much better match.
 

schmke

Legend
Yes, I read your blog. I think there are many problems with drawing the conclusion that WTN is off.

1) When you posted that, not all the data had been entered into the WTN number.
WTN has 5+ years of data. If anything, missing some matches from 2022 would make one expect it to be closer to the most recent year-end NTRP levels, as it gives less opportunity for players to have changed from what they were at the end of 2021.

2) You seem to assume that the USTA system will accurately match someone's strength at singles, so if WTN singles differs from their USTA year-end rating then WTN must be to blame. There are numerous problems with this assumption:
A) I have found USTA singles ratings to be completely out of whack. I know 3.0 players who are good at singles and can reliably beat 4.0 doubles-focused players at singles. Many USTA players I know don't even play singles.
I'm not saying WTN is solely to blame, I'm just saying there is a wide disparity of WTNs within a given NTRP level, more than you'd expect. Is NTRP perfect? Of course not, so I'm sure some of its flaws contribute to the variance. But WTN is the newcomer, so where it is wildly different it makes you go hmm, and without a clear explanation of why the difference is accurate, it has to accept at least part of the blame too. And we've had plenty of people share ratings and experiences that support questioning how accurate WTN really is.

Additionally, the chart I shared was for WTN singles ratings with a high confidence. In order to have high confidence they had to have played quite a few singles matches, so yes, while NTRP mixes singles and doubles, the population used for this analysis clearly plays a lot of singles so it is, IMHO, still a somewhat fair analysis.

B) Are you only including players who earned a C rating in the prior year and played singles that year? If not, there are just too many ways the USTA rating will be way off. If a guy got a high 4.0 rating back in 2017 from playing singles, didn't play again for 4 years, entered as a 3.5 self-rate, and played a few mixed doubles matches, those doubles matches may not affect his self-rated 3.5 rating at all. So this person, who may have been close to a 4.5 singles player, will show up as a year-end 3.5 player based on a form he filled out. He may actually be better than he was in 2017! It's just that USTA doesn't rate all the matches. There are many other situations like this, and in between, where the USTA rating is way off for singles.
Yes, I only used C ratings from 2021, so your speculation about the flaws if I hadn't done so doesn't really apply.

C) This isn't the first time the USTA rating system has been exposed as a mess. UTR also showed that USTA ratings have a huge spread. The levels in USTA are very sticky. Any time you have a system that can predict a 6-0 6-0 match within the exact same level, like USTA does, you are going to have a morass. NTRP may be pretty tight in certain areas with many USTA teams and matches. But for much of the USA the NTRP levels are just going to be very loose approximations of skill. That's why you will have teams blow out their flight and then get blown out at districts. And then the team that won districts will get blown out at state. And then the team that won state will get blown out at sectionals. And then the team that won sectionals will get blown out at nationals. All in the same USTA level. Does that graph still seem surprising to you?
Yes, there are UTR differences when the same analysis is done. But using the range we see for WTN, the ~14+ levels we see for a given NTRP level would equate to 5.6 levels of UTR, and I think we generally see a given NTRP level have a spread of 4 UTR levels, so the range is not as large. Also, UTR uses results from its own events, which can be a reason for differences, while WTN and NTRP should be based on largely the same results since WTN comes from the USTA's collaboration with the ITF.
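The WTN-to-UTR comparison above is a linear rescaling, which can be sketched as follows. The only ratio used is the one implied by the post's own numbers (14 WTN levels equating to 5.6 UTR levels). Treating the two scales as linearly related is an assumption of this sketch, and `wtn_spread_to_utr` is a hypothetical helper name.

```python
# Ratio implied by the post: 14 WTN levels ~ 5.6 UTR levels.
# Assumption: the two scales are linearly related across the range.
UTR_PER_WTN = 5.6 / 14


def wtn_spread_to_utr(wtn_levels: float, utr_per_wtn: float = UTR_PER_WTN) -> float:
    """Convert a spread measured in WTN levels into the equivalent UTR spread."""
    return wtn_levels * utr_per_wtn


observed = wtn_spread_to_utr(14)   # WTN spread seen within one NTRP level
typical_utr_spread = 4.0           # UTR spread the post cites for one NTRP level
print(round(observed, 1), round(observed / typical_utr_spread, 2))
```

On these numbers the WTN spread within one NTRP level is about 1.4 times the UTR spread for the same level.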
 

jmnk

Hall of Fame
Oh, the monthly 'USTA ranking is useless' rant by @Moon Shooter . I was kind of missing that.... :)
@Moon Shooter seems to be the only person that can, in the same post, say with a straight face both:
" I know 3.0 players that are good at singles that can reliably beat 4.0 doubles focused players at singles. " - which simply implies that a system (i.e the NTRP one) where _both_ singles and doubles are combined is bad.
AND
" NTRP ignores far too much data. " - which simply implies that a system that does _not_ combine results from all matches: singles, doubles, mixed is bad.
 

TennisOTM

Professional
I've been crunching some numbers to analyze the ratings of 4.0 men in my area. I took the group of guys who got 4.0C at year-end 2021 and who played enough league matches this year to get a projected year-end 2022 rating on Tennisrecord. There are 175 guys in the dataset. I pulled the UTR and WTN for all 175 guys as well. For guys who played both singles and doubles, I calculated a single UTR and WTN for each player as a weighted average, weighted by how many singles and doubles matches they played this year.

On Tennisrecord, the full range of projected year-end ratings is 3.37 to 4.07. Seems like a fairly typical range for guys a year after getting a 4.0C, perhaps skewing a bit low as TR tends to do (their median is 3.64, so perhaps about 0.1 too low on average across the board).

How does this compare to UTR and WTN ranges for the same players?

The full UTR range is 4.56 to 7.38. That range of 2.82 units is much smaller than I was expecting. If I remove the single lowest and single highest UTR, the range is 4.97 to 7.10, only 2.13 units. That's pretty remarkable given that many of these players played only a few matches, played mixed doubles and non-USTA matches, etc. - all things you might expect to produce a bunch of outliers.

The WTN range, though, is pretty ridiculous: 9.1 to 30.2, a range of 21.1 units. I think that's equivalent to more than 8 units of UTR, and more than 2 whole units of NTRP (i.e. the range from a 3.0 to a 5.0). And the wide range is not just because of single outliers - there are a bunch of guys leading up/down to those tails.
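For anyone who wants to repeat this on their own area, the two calculations described above (a match-count-weighted blend of singles and doubles ratings, and the range with and without the single lowest and highest values) can be sketched like this. The function names and the sample list are hypothetical. Only the endpoint values 4.56, 4.97, 7.10 and 7.38 come from the post, and this is not the exact spreadsheet used.

```python
def blended_rating(singles, doubles, n_singles, n_doubles):
    """Weighted average of singles and doubles ratings, weighted by match counts.

    A rating of None (or a match count of 0) is left out of the blend.
    """
    parts = [(rating, n) for rating, n in ((singles, n_singles), (doubles, n_doubles))
             if rating is not None and n > 0]
    if not parts:
        raise ValueError("player has no rated matches")
    total_matches = sum(n for _, n in parts)
    return sum(rating * n for rating, n in parts) / total_matches


def spread(values, trim=0):
    """Max minus min after dropping `trim` values from each end."""
    ordered = sorted(values)
    if trim:
        ordered = ordered[trim:-trim]
    if len(ordered) < 2:
        raise ValueError("need at least two values after trimming")
    return ordered[-1] - ordered[0]


# Hypothetical sample: the 6.0 in the middle is made up for illustration.
utrs = [4.56, 4.97, 6.0, 7.10, 7.38]
print(round(spread(utrs), 2))          # full range
print(round(spread(utrs, trim=1), 2))  # range without the lowest and highest
print(round(blended_rating(11.0, 16.6, 3, 1), 2))  # hypothetical player
```

Comparing `spread(values)` with `spread(values, trim=1)` is a quick way to see whether a wide range comes from one or two outliers or from the bulk of the players.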
 

travlerajm

Talk Tennis Guru
Now that you’ve identified a discrepancy, in the engineering world, the next step is a root cause investigation.

Do the players with WTN numbers at the extremes have any sort of profile similarity?

E.g., in my case I found that WTN, TLS, and TR were matching up about the same, but UTR was giving a much lower rating, about 2.5 UTR points lower than the UTR corresponding to the other ratings. Looking into it, it was obvious that UTR doesn’t properly adjust for partner strength when it calculates match rating, so it depresses the mixed match rating when a player is the stronger partner, while inflating it for the weaker partner.

My WTN match ratings and drifts of the avg for the year seemed logical, but one of my friends, who is a 5.0C but only played ITF pro matches this year and got crushed, has a WTN of 29 that doesn’t make any sense, so it’s obvious the algo couldn’t handle his situation for some reason.
 

TennisOTM

Professional
I can't make much sense of it. The guys at the WTN extremes are an assorted mix of singles players, doubles players, younger guys, older guys, guys who play mixed and guys who don't, etc.

Generally the rank order seems to be reasonably OK, i.e. the high-end WTN guys are mostly the high-end 4.0 players pushing the 4.5 border, and the low-end WTN guys are mostly the low-end 4.0 players pushing the 3.5 border. It's just that the magnitude of the spread between them is so out of whack.
 

travlerajm

Talk Tennis Guru
It seems like the algo hasn’t been robustness tested at the extremes. So if someone dominates at their level, there is nothing to rein in the rating from exceeding a logical upper bound. And the same goes for someone who loses badly at their level. That explains why my 5.0C friend got a singles WTN rating of 29 when he got crushed by 5.5-6.0 NTRP guys, even though he was a solid 5.0 NTRP.
 

Moon Shooter

Hall of Fame
WTN has 5+ years of data. If anything, missing some matches from 2022 would make one expect it to be closer to the most recent year-end NTRP levels, as it gives less opportunity for players to have changed from what they were at the end of 2021.

Again, you are assuming that NTRP is accurate. The problem is NTRP often has insufficient data to match up with WTN. My 2021 year-end rating completely ignored all five USTA matches I played that year. I should get a year-end C rating this year, but it will ignore the majority of my doubles matches. I think you would actually find that the WTN certified doubles ratings are more in line with NTRP.


I also think part of the problem is cases like this: if you played several matches against a 12-year-old and were competitive, but now that he is 17 he is playing like a college player, that might be a problem. It will get sorted out once more recent matches from both players are included, as long as both players have more matches.

I'm not saying WTN is solely to blame, I'm just saying there is a wide disparity of WTNs within a given NTRP level, more than you'd expect. Is NTRP perfect? Of course not, so I'm sure some of its flaws contribute to the variance. But WTN is the newcomer, so where it is wildly different it makes you go hmm, and without a clear explanation of why the difference is accurate, it has to accept at least part of the blame too. And we've had plenty of people share ratings and experiences that support questioning how accurate WTN really is.

If you want to find out which rating system is better, you can run simulations that predict results based on both and see which system predicts results better. This is done to test chess rating algorithms all the time, although I do not know how to do the technical part of it. Jeff Sonas did this for chess; his website is Chessmetrics. The problem is that the published NTRP categories are so wide that it is hard to really get good data. You might be able to run a meaningful test based on your estimates of the dynamic rating.
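A minimal sketch of the kind of test described above: for each match, check whether the player each system rated as the favorite actually won, then compare hit rates. Everything here is hypothetical (player IDs, ratings, results, and the name `prediction_accuracy`). A real test would need each player's rating as it stood before the match, and far more matches. This is not Jeff Sonas's method, just the simplest version of the idea.

```python
from typing import Dict, List, Tuple

Match = Tuple[str, str]  # (winner, loser)


def prediction_accuracy(matches: List[Match], ratings: Dict[str, float],
                        lower_is_better: bool = False) -> float:
    """Fraction of matches won by the player the rating system favored.

    Matches with an unrated player or equal ratings are skipped.
    Assumption: `ratings` holds pre-match ratings, so results do not leak in.
    """
    correct = 0
    counted = 0
    for winner, loser in matches:
        if winner not in ratings or loser not in ratings:
            continue
        w, l = ratings[winner], ratings[loser]
        if w == l:
            continue  # no favorite to score
        counted += 1
        favorite_won = (w < l) if lower_is_better else (w > l)
        correct += int(favorite_won)
    return correct / counted if counted else float("nan")


# Hypothetical data for four players.
matches = [("A", "B"), ("B", "C"), ("A", "C"), ("D", "A")]
utr = {"A": 7.0, "B": 6.5, "C": 6.0, "D": 6.8}       # higher is better
wtn = {"A": 12.0, "B": 13.5, "C": 14.0, "D": 11.0}   # lower is better

print(prediction_accuracy(matches, utr))
print(prediction_accuracy(matches, wtn, lower_is_better=True))
```

With made-up data the two numbers prove nothing about either system. The point is only that the comparison is mechanical once you have match results and pre-match ratings for both systems.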



Additionally, the chart I shared was for WTN singles ratings with a high confidence. In order to have high confidence they had to have played quite a few singles matches, so yes, while NTRP mixes singles and doubles, the population used for this analysis clearly plays a lot of singles so it is, IMHO, still a somewhat fair analysis.

It is fair if they played USTA singles recently. If not, their singles play may not have had any impact on their USTA rating at all. So in those cases you are looking at an NTRP number that does not include any singles performances at all and saying WTN is off because its singles number does not match USTA's.

Yes, I only used C ratings from 2021, so your speculation about the flaws if I hadn't done so doesn't really apply.

That's good, and I think it makes your analysis much better. But even so, NTRP still ignores far too much data in coming to its C rating.

Yes, there are UTR differences when the same analysis is done. But using the range we see for WTN, the ~14+ levels we see for a given NTRP level would equate to 5.6 levels of UTR, and I think we generally see a given NTRP level have a spread of 4 UTR levels, so the range is not as large.

A 4.50 UTR can be a male NTRP 3.0, 3.5 or 4.0.

https://cdn.universaltennis.com/public/media/UTR_Player_Range.pdf

Now, I think there are all sorts of problems with singles UTR due to the 12-month cutoff and the fact that most of UTR's data is from USTA and USTA rarely has doubles games. But if someone played at least 8 doubles games for their 4.50 UTR, they will likely play like a 4.50 UTR player. Trying to say whether they are a 3.0, a 3.5, or a 4.0 is irrelevant because those categories are a mess.

Also, UTR uses results from its own events, which can be a reason for differences, while WTN and NTRP should be based on largely the same results since WTN comes from the USTA's collaboration with the ITF.

The more matches the better. The problem with USTA data is they ignore too much data. WTN will continue to have problems unless USTA broadens the matches that can count for WTN. One big way to have much more data is to get true coed matches. If USTA allows WTN events (and makes it easy to do as a practical matter) where people can play regardless of gender then WTN will become much better than USTA.
 

Moon Shooter

Hall of Fame
Oh, the monthly 'USTA ranking is useless' rant by @Moon Shooter . I was kind of missing that.... :)
@Moon Shooter seems to be the only person that can, in the same post, say with a straight face both:
" I know 3.0 players that are good at singles that can reliably beat 4.0 doubles focused players at singles. " - which simply implies that a system (i.e the NTRP one) where _both_ singles and doubles are combined is bad.

No it doesn't imply that.


AND
" NTRP ignores far too much data. " - which simply implies that a system that does _not_ combine results from all matches: singles, doubles, mixed is bad.

Again that is not implied.

Whether to include doubles results in singles ratings and vice versa depends on how many matches you have. You seem to think only in basic terms, where combining results must be either good or bad, when in fact it is a bit more nuanced. It is probably good to use doubles results when you (and the people you played) have very few singles matches, but if someone has over 30 singles matches in the year against other players with established ratings, it would make little sense to add in doubles results.

Not combining men and women into one system, and not allowing them to play each other in matches where the ratings can be properly assessed, is bad from a ratings accuracy perspective.
 