NTRP and Mixed League

Moon Shooter

Hall of Fame
I've seen UTR have NTRP 4.5 men rated less than 1.0 higher than their NTRP 3.5 women partners,

Sure NTRP ignores quite a bit of data. I wouldn't assume that it is UTR that is inaccurate unless there is very little data. NTRP does work better if you have very little data.


and have seen UTR ratings vary wildly (@travlerajm is an example) with no new matches played by the player (and likely none by anyone a few degrees of separation away), so I wouldn't go so far as to say that UTR is the gold standard and its prediction for mixed matches is correct. Data is part of it, but a stable algorithm is a huge part of it too.

Once you get at least 8 doubles matches in, then I think UTR is the gold standard. You can say NTRP is so stable because it doesn't change. But it doesn't change because they just leave the number there! UTR would not be volatile either if they just spit out a number based in large part on your say so and refused to change it except once per year.

Plus, NTRP categories are so huge that they seem more stable because they are not even trying to be very precise. A 0.5 NTRP band can cover 6 UTR half points. So if your UTR is varying between 2.5 and 5.5, that is no different than if a male player changes from 3.0 to 3.5. It's like complaining that UTR keeps changing its mind whether the guy is in Illinois, Michigan or Missouri, and saying NTRP is better because it consistently says the guy is in North America.
 

travlerajm

Talk Tennis Guru
confused again. If that was the case, and, as you said, " Black (shown next to singles or doubles match) is your men’s rating " then the 'black match rating of the last match' should match the 'main' rating. well, it does not for you......
My "main" rating at the top of my page was identical to my men's averaged rating following my most recent men's match (and stable for 24 months), until a few months ago, when my most recent men's match became >24 months old. It got downgraded by exactly 0.02 at that point, maybe because the algorithm adds in an "assumed rustiness" factor?
 

travlerajm

Talk Tennis Guru
The main guy from NorCal who originally inspired this thread is an interesting case for testing some of these theories. As a 4.0 he has played many matches at 7.0, 8.0, and 9.0 mixed: close to 40 matches in EACH of those league types since 2019. It seems that some of you would predict that his numerical match ratings would be quite different in the different league types because of the very different levels of his partners.

But the data from Tennisrecord show remarkable consistency. I took all his matches from 2019-2022 that produced a match rating on TR, and grouped them by the three leagues:

7.0: average match rating of 3.99 (36 matches)
8.0: average match rating of 4.02 (34 matches)
9.0: average match rating of 4.04 (36 matches)

Statistically those differences in the averages are basically nothing, which suggests to me that the match rating algorithm, which compares the straight sum of male rating plus female rating, actually does quite well regardless of whether the male or the female is the higher NTRP.
My conclusion from this is that the guy probably isn't playing with enough stagger in 7.0, where he could potentially have an advantage.

My 8.0 mixed partner who I've played the most matches with (I believe with 10-0 record over the years) has played a ton of matches in 3.5 ladies dubs, 7.0 mixed, and 8.0 mixed.
I looked up her records over the 5-year period where she played with me (100+ total matches).

Her average TR match ratings:
3.5 ladies dubs: 3.14 (60 matches)
All mixed: 3.18 (46 matches)
7.0 mixed: 3.19 (20 matches)
8.0 mixed: 3.18 (26 matches)
8.0 mixed with me: 3.24 (5 matches showing a TR rating)

I'm very strict about enforcing her net positioning (having to remind her a few times a match not to back up), so that would explain why she outperforms her mixed average when she plays with me.
Ladies' dubs doesn't have an unbalanced pairing, so it's not surprising that she performs worse than in mixed (statistically significant enough that std error bars don't overlap).
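The averages quoted above can be sanity-checked with a few lines of Python. This sketch recomputes the "all mixed" average from the 7.0 and 8.0 subgroups and then expresses the gap to ladies' dubs in units of standard error. The per-match standard deviation (`ASSUMED_SD`) is a hypothetical placeholder, since the per-match Tennisrecord values are not listed here, so the z-score is illustrative only.

```python
import math

# (average TR match rating, number of matches) as quoted in the post above
ladies_35 = (3.14, 60)
mixed_70 = (3.19, 20)
mixed_80 = (3.18, 26)

def pooled_mean(groups):
    """Match-count-weighted mean of several (mean, n) groups."""
    total_matches = sum(n for _, n in groups)
    return sum(mean * n for mean, n in groups) / total_matches

def standard_error(sd, n):
    """Standard error of a mean, given a per-match standard deviation."""
    return sd / math.sqrt(n)

def z_score(mean_a, n_a, mean_b, n_b, sd):
    """Difference of two means in units of its combined standard error."""
    se = math.hypot(standard_error(sd, n_a), standard_error(sd, n_b))
    return (mean_a - mean_b) / se

all_mixed = pooled_mean([mixed_70, mixed_80])
print(f"pooled mixed average: {all_mixed:.2f}")  # agrees with the 3.18 quoted above

ASSUMED_SD = 0.15  # hypothetical per-match spread, NOT taken from Tennisrecord
z = z_score(all_mixed, 46, ladies_35[0], ladies_35[1], ASSUMED_SD)
print(f"z-score vs ladies' dubs (assumed sd={ASSUMED_SD}): {z:.2f}")
```

Whether the error bars actually overlap depends entirely on the real per-match spread, which would have to come from the individual match ratings.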
 

travlerajm

Talk Tennis Guru
confused again. If that was the case, and, as you said, " Black (shown next to singles or doubles match) is your men’s rating " then the 'black match rating of the last match' should match the 'main' rating. well, it does not for you......
I just looked at my record. My "main" rating is exactly the mean of my last 7 men's matches. The individual match rating is still calculated using a different rolling average (but I can't tell easily what the formula is).
 

schmke

Legend
Remember UTR does not use self-rates.
Sure they do, it is more or less the same as NTRP.

With NTRP, all the self-rating does is determine what level you begin playing at. You have no actual rating, that is determined by your match results as you begin playing.

With UTR, you implicitly self-rate by choosing what tournament/draw to enter. You similarly have no actual rating until you have match results.

So they are really no different. NTRP just formalizes the self-rating to facilitate establishing the level you begin playing at. UTR is less formal and leaves it up to the player or tournament director on where they begin playing.

And for UTRs calculated for USTA League players, there really is no difference, as both UTR and NTRP have no rating for the player when they start and are looking at the same matches.

Once people have about 8 fairly close matches in UTR against opponents that also have at least 8 games in UTR, the ratings start to stabilize, and I think they are more accurate than the stale NTRP ratings that ignore huge amounts of mixed doubles data.
I admittedly don't follow detailed UTRs more than myself and my wife, and by proxy since he has posted his wild swings, @travlerajm. It is true he has limited matches and that could contribute to instability, but I believe his swings have been from 4 to 7 without posting any new results. That screams instability of some kind.

My UTR has been somewhat stable, for the most part staying in a range of 0.3-0.4 while I've posted results, of which I've had about 18-20 in the previous 12 months while my rating has been varying a bit.

My wife's UTR has not been stable, varying a full 1.0 or so at times, often a change of 0.4-0.5 with no new results posted by her or any opponents. These changes will happen mid-week one way and then the other, while all our play is on weekends and her results and any prior opponents have been posted and not changing. And she has had ~12 results in her sliding 12 month window so well more than the 8 you are suggesting. While this is not as bad as @travlerajm , it is still indicative of an algorithm that is not stable.

UTR is an algorithm that goes back and recalculates things retroactively and/or with iterations, and a sign of an unstable algorithm is one that fails to converge with iteration, and instead oscillates back and forth between different values with each iteration. It would appear that when my wife's UTR jumps up or down 0.4 from a Wednesday to Thursday with no new results posted, that the algorithm is having trouble converging and is in fact oscillating to some degree.
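To illustrate what "fails to converge and oscillates" can look like, here is a toy sketch. It is not UTR's algorithm, which is not public. It assumes a made-up setup: two partners tied a match against opponents with a combined rating of 10, and each partner's rating is recalculated as "opponents' combined rating minus the partner's current rating". The function names and the `damping` parameter are hypothetical.

```python
def sweep(ratings, opponents_total, damping):
    """One simultaneous recalculation of both partners' ratings.

    Each partner's target is the rating that would balance the tied match:
    opponents' combined rating minus the partner's current rating.
    damping=1.0 jumps straight to the target; smaller values move part-way.
    """
    a, b = ratings
    target_a = opponents_total - b
    target_b = opponents_total - a
    return (a + damping * (target_a - a), b + damping * (target_b - b))

def iterate(start, opponents_total, damping, sweeps):
    """Apply `sweeps` recalculations and return every intermediate state."""
    history = [start]
    for _ in range(sweeps):
        history.append(sweep(history[-1], opponents_total, damping))
    return history

# Same match data, no new results, only repeated recalculation.
undamped = iterate((4.0, 7.0), 10.0, damping=1.0, sweeps=4)
damped = iterate((4.0, 7.0), 10.0, damping=0.5, sweeps=4)
print(undamped)  # flips between (4.0, 7.0) and (3.0, 6.0) on every sweep
print(damped)    # settles at (3.5, 6.5) and stays there
```

In this toy case the full-step update never settles even though nothing new was posted, while the half-step update converges. A real system has far more players and matches, so this only shows the kind of behavior being described, not its cause in UTR.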

This leads me to conclude that the algorithm is inherently unstable, at least for certain situations. But my wife has had a sufficient number of matches so the fact that it is unstable for her situation is concerning given that many USTA players have a similar number of matches recorded.
 

Moon Shooter

Hall of Fame
Sure they do, it is more or less the same as NTRP.

With NTRP, all the self-rating does is determine what level you begin playing at. You have no actual rating, that is determined by your match results as you begin playing.

With UTR, you implicitly self-rate by choosing what tournament/draw to enter. You similarly have no actual rating until you have match results.

The big difference is UTR will rate matches not in UTR or USTA. I can play a match anywhere in the world with a friend and get at least an unverified utr rating out of it. USTA only rates USTA matches. UTR is geared for a much more wide open field.
So they are really no different. NTRP just formalizes the self-rating to facilitate establishing the level you begin playing at. UTR is less formal and leaves it up to the player or tournament director on where they begin playing.

I agree with the substance but think this is a pretty big difference while you think it is minor. The fact that people are posting results all over the world with other players and the system is not pegged to a certain USA level means UTR needs to account for the possibility the players may be 3.0 players or 6.5 players. USTA pretty much knows if someone self rates at 3.0 they are not a pro.

And for UTRs calculated for USTA League players, there really is no difference, as both UTR and NTRP have no rating for the player when they start and are looking at the same matches.

Again, if I self-rated as a 5.5 and got blown out for 3 matches, I could get a USTA 5.0C rating. And then I could post about what a horrible system USTA is. They are basically taking my word for it, while UTR, which is an international system, is not. So UTR is going to vary for the first few matches. The USTA confidence is based solely on the player's say-so and understanding of the USTA system. UTR can't rely on that because they are new and they are international.

I admittedly don't follow detailed UTRs more than myself and my wife, and by proxy since he has posted his wild swings, @travlerajm. It is true he has limited matches and that could contribute to instability, but I believe his swings have been from 4 to 7 without posting any new results. That screams instability of some kind.

4-7 in UTR could pretty much be a 4.0 player in NTRP. So NTRP would say you are a "4.0 player" and seem stable, but really that is just because the 4.0 men's level is the size of North America and UTR is trying to pinpoint what state you are in. After about 8 matches in the relevant game, singles or doubles, I think UTR is better.

My UTR has been somewhat stable, for the most part staying in a range of 0.3-0.4 while I've posted results, of which I've had about 18-20 in the previous 12 months while my rating has been varying a bit.

A 0.3-0.4 swing in UTR is about a 0.1-0.2 difference in NTRP. So it has been staying fairly stable.

My wife's UTR has not been stable, varying a full 1.0 or so at times, often a change of 0.4-0.5 with no new results posted by her or any opponents. These changes will happen mid-week one way and then the other, while all our play is on weekends and her results and any prior opponents have been posted and not changing. And she has had ~12 results in her sliding 12 month window so well more than the 8 you are suggesting. While this is not as bad as @travlerajm , it is still indicative of an algorithm that is not stable.

UTR is an algorithm that goes back and recalculates things retroactively and/or with iterations, and a sign of an unstable algorithm is one that fails to converge with iteration, and instead oscillates back and forth between different values with each iteration. It would appear that when my wife's UTR jumps up or down 0.4 from a Wednesday to Thursday with no new results posted, that the algorithm is having trouble converging and is in fact oscillating to some degree.

This leads me to conclude that the algorithm is inherently unstable, at least for certain situations. But my wife has had a sufficient number of matches so the fact that it is unstable for her situation is concerning given that many USTA players have a similar number of matches recorded.

I fully agree that these ratings should not be so volatile. I don't understand why tennis rating systems act like they have some super secret formula when Elo systems have been used by US chess since about 1960. But Oracle, which runs and likely sponsors UTR, probably wants to make it seem "magical" or something. There is no reason for all the recalculating and volatility, and it cheapens the product. Chael has said he would like to achieve a 6.0 UTR rating. And I think that is a great goal for an amateur tennis player, one that I would work toward myself. The problem is I am a much worse player than he is, and I have already gotten over a 6.0 UTR rating some Tuesday in November. These tennis rating systems put such an emphasis on secrecy and therefore lack transparency and stability. Like someone is going to try to steal their rating algorithm! My god, grow up! The secrecy means no one understands what these numbers mean, and therefore they lose significance. If I am 1.00 UTR point higher than another player, what is my predicted score against them? If you don't know, then the numbers have less meaning. NTRP is better at this in some ways, but they are so vague, splitting about 50% of all male tennis players into 3.0 and 3.5, that it is hard to really care. Again, what does it mean if you are a 3.0 or a 3.5 or a 4.0? Well, all of these players could be about a 4.5 UTR or they could be drastically different.

When will tennis finally get a decent rating system? I tried contacting Jeff Sonas who had done some decent chess rating systems but he did not respond. Oh well.
 

dsp9753

Semi-Pro
The big difference is UTR will rate matches not in UTR or USTA. I can play a match anywhere in the world with a friend and get at least an unverified utr rating out of it. USTA only rates USTA matches. UTR is geared for a much more wide open field.


I agree with the substance but think this is a pretty big difference while you think it is minor. The fact that people are posting results all over the world with other players and the system is not pegged to a certain USA level means UTR needs to account for the possibility the players may be 3.0 players or 6.5 players. USTA pretty much knows if someone self rates at 3.0 they are not a pro.



Again, if I self-rated as a 5.5 and got blown out for 3 matches, I could get a USTA 5.0C rating. And then I could post about what a horrible system USTA is. They are basically taking my word for it, while UTR, which is an international system, is not. So UTR is going to vary for the first few matches. The USTA confidence is based solely on the player's say-so and understanding of the USTA system. UTR can't rely on that because they are new and they are international.



4-7 in UTR could pretty much be a 4.0 player in NTRP. So NTRP would say you are a "4.0 player" and seem stable, but really that is just because the 4.0 men's level is the size of North America and UTR is trying to pinpoint what state you are in. After about 8 matches in the relevant game, singles or doubles, I think UTR is better.



A 0.3-0.4 swing in UTR is about a 0.1-0.2 difference in NTRP. So it has been staying fairly stable.



I fully agree that these ratings should not be so volatile. I don't understand why tennis rating systems act like they have some super secret formula when Elo systems have been used by US chess since about 1960. But Oracle, which runs and likely sponsors UTR, probably wants to make it seem "magical" or something. There is no reason for all the recalculating and volatility, and it cheapens the product. Chael has said he would like to achieve a 6.0 UTR rating. And I think that is a great goal for an amateur tennis player, one that I would work toward myself. The problem is I am a much worse player than he is, and I have already gotten over a 6.0 UTR rating some Tuesday in November. These tennis rating systems put such an emphasis on secrecy and therefore lack transparency and stability. Like someone is going to try to steal their rating algorithm! My god, grow up! The secrecy means no one understands what these numbers mean, and therefore they lose significance. If I am 1.00 UTR point higher than another player, what is my predicted score against them? If you don't know, then the numbers have less meaning. NTRP is better at this in some ways, but they are so vague, splitting about 50% of all male tennis players into 3.0 and 3.5, that it is hard to really care. Again, what does it mean if you are a 3.0 or a 3.5 or a 4.0? Well, all of these players could be about a 4.5 UTR or they could be drastically different.

When will tennis finally get a decent rating system? I tried contacting Jeff Sonas who had done some decent chess rating systems but he did not respond. Oh well.

Why does it matter? If you want to have competitive matches, you need some type of system where players are divided into groups. Since USTA is nationwide and caters to all communities, it has vague groups of 3.0, 3.5, 4.0, etc., where the matches should be somewhat competitive. If you make the groups too small like UTR, then you won't have enough players in each group to have competitive leagues. If you make the ratings dynamically updating like UTR, then it makes it too much of a hassle to captain and manage leagues.

How does knowing exactly how the algorithm works help you as a player? What benefit is there? Knowing people, it would only cause more drama, gossiping, and grievances, with people complaining about how someone's rating is too low, too high, etc.

At the end of the day, go out play and try to get better. As you get better, you will win more matches.
 

Moon Shooter

Hall of Fame
Why does it matter?

Why does what matter?
If you want to have competitive matches, you need some type of system where players are divided into groups. Since USTA is nationwide and caters to all communities, it has vague groups of 3.0, 3.5, 4.0, etc., where the matches should be somewhat competitive.

Sure but USTA ignores so much data that they don't even claim people in the exact same level are competitive. My point is they could use much more data and make the leagues more competitive.


If you make the groups too small like UTR, then you won't have enough players in each group to have competitive leagues. If you make the ratings dynamically updating like UTR, then it makes it too much of a hassle to captain and manage leagues.

Publishing the full ratings does not mean you need to create new categories for each hundredth of a point.
How does knowing exactly how the algorithm works help you as a player?

It makes the rating meaningful. If I know that being 1 utr point higher than someone else means I should win 3 out of five games that helps me know what the number is supposed to mean. If I don't know that then who knows? I mean if I say "you have a 12 on the moonshooter rating and Vox has an 8" what is that even supposed to mean? Who knows and therefore who cares? But if I say each point difference equates to one less game the lower rated player should win in a match then it has some meaning people can understand.
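For comparison, this is what a published, interpretable mapping looks like in the standard Elo system used in chess: the expected score depends only on the rating difference. The sketch below uses the usual Elo logistic formula with its 400-point scale. It says nothing about how UTR or NTRP map rating gaps to results, and the helper `gap_for_expected_score` is a name made up for this example.

```python
import math

def elo_expected_score(rating_a, rating_b, scale=400.0):
    """Expected score (share of points) for player A under standard Elo."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))

def gap_for_expected_score(p, scale=400.0):
    """Rating gap at which the higher-rated player is expected to score p."""
    return scale * math.log10(p / (1.0 - p))

# A 100-point edge gives an expected score of about 0.64.
print(round(elo_expected_score(1600, 1500), 3))

# Gap at which the favorite is expected to take 3 out of 5 (0.6).
gap = gap_for_expected_score(0.6)
print(round(gap, 1))  # roughly 70 Elo points
```

The point is only that anyone can read an Elo gap as an expected result. Whether a tennis rating should be calibrated on games, sets, or matches is a separate design choice.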

What benefit is there?

Having a meaningful rating system that people understand would mean people could set goals and see if their practice routines are actually improving their game. When people play against the same people that are also playing and improving they may not realize this. Am I losing to this guy that I used to be able to beat because he is improving faster than me or am I getting worse? I would consider that valuable information.

Knowing people, it would only cause more drama, gossiping, and grievances, with people complaining about how someone's rating is too low, too high, etc.

At the end of the day, go out play and try to get better. As you get better, you will win more matches.

As I got better I lost more matches because I played better players. But as for the past 6 months, it is hard to say if I have gotten better or worse because there is no good rating system in tennis. I would still like to know.
 

schmke

Legend
The big difference is UTR will rate matches not in UTR or USTA. I can play a match anywhere in the world with a friend and get at least an unverified utr rating out of it. USTA only rates USTA matches. UTR is geared for a much more wide open field.
Sure, but I only play USTA (virtually no UTR events in my area) so the matches used for NTRP and UTR are exactly the same so this is a moot point.

And it is in this head-to-head comparison that the unexplained volatility, with no new matches added to the system, is the negative for UTR that I'm pointing out. May UTR work well in other cases? Sure, but it clearly works very poorly in some too. And if it has wild instability/volatility for players with 10+ matches due solely to periodic/iterative recalculations with no new matches added, that is an indication there may be a problem in the algorithm.
 

schmke

Legend
Remember UTR does not use self-rates.
Oh the irony ...

No more than an hour after I read this and posted the earlier reply that UTR does effectively have self-rates because someone has to elect what tournament/draw to enter, I check my inbox and guess what, UTR is introducing self-rating!

In an e-mail I (and I assume others) received today, UTR is launching "Estimated UTR Ratings" where "The ratings are generated after a player takes a brief four-step questionnaire". This sounds an awful lot like the USTA's self-rate questionnaire that spits out an "estimated level" the player should play at. UTR apparently will give an estimated range, something like 5.75-7.75 is an example they give. This is pretty much identical to a USTA self-rate level.
 

travlerajm

Talk Tennis Guru
My current UTR page (as of today) has more evidence that something is amiss in the UTR algorithm.

My UTR is based on two mixed matches. In the first match, my partner and I won exactly the same number of games as our opponents, who had a combined UTR of 11.2. My partner in that match has a UTR of 3.4. So my expected match rating for that match would be about 7.8 to balance the combined ratings on each side.

My second match, my opponents have a combined UTR of 9.8, and my partner 4.1. We won twice as many games as our opponents in a lop-sided win, so my expected match rating would have to be significantly higher than the 5.7 difference, probably in the 8 range.

Yet UTR says I’m 5.2 based on these matches. Does not compute. Of course, I won’t be a 5 for long. I could be an 8 tomorrow.
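The arithmetic behind those two numbers can be written out as a short sketch. It assumes the simple "sums balance" model described in the post (my rating plus my partner's equals the opponents' combined rating when games are split evenly). That is an assumption for illustration, not UTR's published method, and `balancing_rating` is a made-up helper name.

```python
def balancing_rating(opponents_combined, partner_rating):
    """Rating the player would need for both sides' sums to be equal."""
    return opponents_combined - partner_rating

matches = [
    {"opponents_combined": 11.2, "partner": 3.4, "note": "games split evenly"},
    {"opponents_combined": 9.8, "partner": 4.1, "note": "won twice as many games"},
]

for m in matches:
    r = balancing_rating(m["opponents_combined"], m["partner"])
    print(f"{m['note']}: balancing rating {r:.1f}")
```

For the even match the balancing rating (7.8) is the implied match rating. For the lopsided win, 5.7 is only a floor, since how much a 2:1 game margin should add is not something either system publishes.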
 

Moon Shooter

Hall of Fame
Sure, but I only play USTA (virtually no UTR events in my area) so the matches used for NTRP and UTR are exactly the same so this is a moot point.

It is a moot point for you. But it is not a moot point for the rest of the world that UTR is trying to operate in.

I intend to post match results that are neither USTA nor UTR. They will not be "verified" ratings but the algorithm works the same.

Again, if some actual 2.0 strength player self-rated as a 5.5, played 3 matches, and lost them all 6-0 to mid 5.5 players, what would his USTA rating be? Would it not likely be a 5.0C or at least a 4.5C? So yes, USTA is anchoring to your self-rate. If you self-rate fairly correctly, this anchoring is good. But when you have a system that you are going to throw out to the world and have colleges give scholarships worth hundreds of thousands of dollars on, anchoring on say-so, as USTA does, may not be the best option.

In the above case UTR would likely have that person's provisional rating bouncing all over from a 5.0 equivalent to a 2.0 equivalent. But that is actually more accurate than USTA.

UTR can use self rates as well, but I don't think that will make the ratings more accurate. I think you will still see the same volatility if you only play 3 matches. That is because if you only have 3 match results the range someone's rating could be is very large.

And it is in this head-to-head comparison that the unexplained volatility, with no new matches added to the system, is the negative for UTR that I'm pointing out. May UTR work well in other cases? Sure, but it clearly works very poorly in some too. And if it has wild instability/volatility for players with 10+ matches due solely to periodic/iterative recalculations with no new matches added, that is an indication there may be a problem in the algorithm.

I agree that the volatility is a huge negative to the system. But new matches are always being added to the system. Remember UTR stupidly has a floor and a ceiling. So every match means everyone has to be at least slightly jostled around within that ceiling and floor. The ceiling and floor also means they can never say for example that if you are 1.00 utr point higher than the other player we predict you would win 6-3 6-3. So the ceiling and floor make it impossible for UTR to have real tangible meaning for players.

I am not a big fan of UTR. But if you play mixed doubles and you see someone's NTRP based on 3 matches they played a year ago that ignores dozens of mixed matches the person has played more recently UTR is going to be a better measure. The difference in ratings between men and women is well documented by your own data which shows having a higher male rated player gives more bang for the buck. If they were in fact equivalent and canceled each other out that wouldn't happen.

Also, I have explained the problems with NTRP for mixed. UTR has published a chart about the rating conversions that fits pretty well with what I see. And it shows that the male categories are much larger than the female categories. I may not have a full handle on how that came to be, but it is pretty obvious to me that the skill gap between a lower male 3.0C and a top 3.0C is larger than for women. The same is true for 3.5 and 4.0.


The sum of UTR ratings, which go to the hundredth, is a much better predictor of mixed doubles outcomes than the sum of USTA ratings, which only go to 0.5 (which covers between 4-6 UTR half points).


Another reason for the difference between men and women's ntrp categories may be the fact that you are in fact a 3.5c rated player even if you only played 3 USTA adult rated matches 18 months ago. So UTR is keeping track of all those USTA mixed matches you have been playing and adjusting your rating but USTA ignores that. Since women have more same gender leagues they will have narrower levels because their USTA rating is properly keeping up with their actual strength at a given time.
 

Moon Shooter

Hall of Fame
My current UTR page (as of today) has more evidence that something is amiss in the UTR algorithm.

My UTR is based on two mixed matches. In the first match, my partner and I won exactly the same number of games as our opponents, who had a combined UTR of 11.2. My partner in that match has a UTR of 3.4. So my expected match rating for that match would be about 7.8 to balance the combined ratings on each side.

My second match, my opponents have a combined UTR of 9.8, and my partner 4.1. We won twice as many games as our opponents in a lop-sided win, so my expected match rating would have to be significantly higher than the 5.7 difference, probably in the 8 range.

Yet UTR says I’m 5.2 based on these matches. Does not compute. Of course, I won’t be a 5 for long. I could be an 8 tomorrow.



Depending on how well established your opponents' and your partners' ratings are, you could be anywhere from a 3.0 to a 9.0 UTR. If they only have 2 matches as well, then sure, it is a complete crapshoot. USTA would just sit there saying you are whatever your self-rate is and give no more information, but the NTRP algorithm wouldn't actually know better based on so few results.

NTRP *is* better if it included more results such as games played before 12 months ago. Both rating systems have big problems because they both ignore far too much relevant data.

It seems like UTR is not picking up your mixed matches, and perhaps not picking up mixed matches in your area. I am not sure why that is, but it is a problem for UTR, especially when they have the arbitrary 12 month cut off. My most recent mixed matches have not been recorded in UTR either. I am not sure why. But I think it is because they were at sectionals and TennisLink does not acknowledge that a sectionals event happened. TennisLink says no one qualified for postseason play even though 4 teams did in our area. So UTR may not be counting them as verified play because TennisLink isn't recognizing the event.

In any event, talking about what UTR does with 2 matches is just silly. Really, anything fewer than 8 matches is, and even then, if your partner or opponents have very few matches, it can be volatile as well. I have 8 mixed matches and it has me hovering about 4.75-5.25. I know in a few of those matches I got lucky winning many deuce games. But that luck ran the other way in two of my more recent games, so I anticipate I will settle in at about 4.5-4.75.

For whatever reason, I have been able to predict how my rating would be affected by the results of my partners and opponents. When my partners get big wins with another teammate my rating goes down; if they get big losses my rating goes up. When my opponents get big wins my rating goes up; when they get big losses it goes down.
 

travlerajm

Talk Tennis Guru
Depending on how well established your opponents' and your partners' ratings are, you could be anywhere from a 3.0 to a 9.0 UTR. If they only have 2 matches as well, then sure, it is a complete crapshoot. USTA would just sit there saying you are whatever your self-rate is and give no more information, but the NTRP algorithm wouldn't actually know better based on so few results.

NTRP *is* better if it included more results such as games played before 12 months ago. Both rating systems have big problems because they both ignore far too much relevant data.

It seems like UTR is not picking up your mixed matches, and perhaps not picking up mixed matches in your area. I am not sure why that is, but it is a problem for UTR, especially when they have the arbitrary 12 month cut off. My most recent mixed matches have not been recorded in UTR either. I am not sure why. But I think it is because they were at sectionals and TennisLink does not acknowledge that a sectionals event happened. TennisLink says no one qualified for postseason play even though 4 teams did in our area. So UTR may not be counting them as verified play because TennisLink isn't recognizing the event.

In any event, talking about what UTR does with 2 matches is just silly. Really, anything fewer than 8 matches is, and even then, if your partner or opponents have very few matches, it can be volatile as well. I have 8 mixed matches and it has me hovering about 4.75-5.25. I know in a few of those matches I got lucky winning many deuce games. But that luck ran the other way in two of my more recent games, so I anticipate I will settle in at about 4.5-4.75.

For whatever reason, I have been able to predict how my rating would be affected by the results of my partners and opponents. When my partners get big wins with another teammate my rating goes down; if they get big losses my rating goes up. When my opponents get big wins my rating goes up; when they get big losses it goes down.
Any rating algorithm should be able to calculate a match rating for any given match where the other players on court are already rated. Your rating can then be determined from averaging your match ratings.

You missed my point, which is UTR obviously has a serious bug.
 
My current UTR page (as of today) has more evidence that something is amiss in the UTR algorithm.

My UTR is based on two mixed matches. In the first match, my partner and I won exactly the same number of games as our opponents, who had a combined UTR of 11.2. My partner in that match has a UTR of 3.4. So my expected match rating for that match would be about 7.8 to balance the combined ratings on each side.

My second match, my opponents have a combined UTR of 9.8, and my partner 4.1. We won twice as many games as our opponents in a lop-sided win, so my expected match rating would have to be significantly higher than the 5.7 difference, probably in the 8 range.

Yet UTR says I’m 5.2 based on these matches. Does not compute. Of course, I won’t be a 5 for long. I could be an 8 tomorrow.
UTR is not for you Tjam, I suggest you never look at it again, will free up a lot of time.
 

Moveforwardalways

Hall of Fame
For a UTR mixed doubles league then, would each male and female pairing be of the same UTR? That would certainly be a different dynamic to go on court with the females being just as good as the males. It would also virtually eliminate mixed doubles league for many of us since most areas would not have enough women in the UTR 7-8 or higher range to form a league. I think USTA does it this way for a reason, which is to get enough people of both genders playing in order to have a league.
 

Moon Shooter

Hall of Fame
Any rating algorithm should be able to calculate a match rating for any given match where the other players on court are already rated. Your rating can then be determined from averaging your match ratings.

You missed my point, which is UTR obviously has a serious bug.

No, just because it gives you a range of ratings after two matches (with people who may not have established ratings either) does not mean there is a bug. It means that when you only play two matches with people who also may not have established ratings, there is no way to tell for sure where you fit between 1 and 16.5 on their scale. If they assigned you a number and stuck with that number no matter what, it wouldn't make the number more accurate or better. It would just be hiding the variability that exists. USTA hides the huge variability that exists in each level by sticking with my 3.0S rating since June of 2021 and ignoring my mixed doubles matches. But I can tell you my UTR is much more accurate than my USTA rating.

When you have a floor and ceiling and you modify past ratings based on future matches, every result reshuffles everyone else's. It's how the UTR system works. I really dislike that they designed it that way, and I agree that system leaves much to be desired when it comes to adult rec tennis. But it is not a bug, and it seems to work pretty well for pro tennis and college/high school tennis. I just think they could have a system that works just as well for those players without the drawbacks their current system has for adult players.

And if you think looking at USTA ratings is a better predictor for mixed doubles matches even when people have established UTRs, ok. I'm tired of arguing. But UTR's data takes much of the mystery out of Schmke's data on mixed doubles if you understand the rating systems.
 

Moon Shooter

Hall of Fame
For a UTR mixed doubles league then, would each male and female pairing be of the same UTR? That would certainly be a different dynamic to go on court with the females being just as good as the males. It would also virtually eliminate mixed doubles league for many of us since most areas would not have enough women in the UTR 7-8 or higher range to form a league. I think USTA does it this way for a reason, which is to get enough people of both genders playing in order to have a league.


I think UTR could and should just say the sum total of the doubles team should not exceed X.XX. It should not have any restrictions on gender. This would greatly increase the pool of available players and help eliminate cross gender imbalances.

The problem is their ratings jump around so much, so what day do you use? I mean, if I am a 4.75 today and sign up to play a match, and then we find out on the first day of the tournament I am suddenly a 5.25, what rating should count? I talked to a guy at UTR and they said they are just sort of leaving that to the tournament directors and also trying to give a rating predictor that tries to give more consistency. I think they are also introducing self rates due to this issue.

There are a few potential solutions to this issue (for example I think UTR could give you a quarterly rating that is the average of your rating the prior quarter, and the tournament director could say which quarter they are using)

But I think most UTR people tend to want to increase their rating, not sandbag, so they don't mind and might even be happy to play people a bit higher rated than they are. I mean, the reason I would play in a UTR event is to try to get good results and gain rating points. But if they start going down the same road as USTA they will have the same problems as USTA. I just wish they would go down a different road.
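The quarterly-average idea from this post could be sketched roughly like this. The snapshot dates and ratings are hypothetical, and nothing here reflects how UTR actually stores or publishes ratings; it only shows that averaging by quarter gives a director one fixed number to point at.

```python
from collections import defaultdict
from datetime import date


def quarterly_averages(history):
    """Average a list of (date, rating) snapshots by calendar quarter.

    Returns {(year, quarter): mean rating}. The snapshot data and the
    calendar-quarter convention are assumptions for illustration only.
    """
    buckets = defaultdict(list)
    for day, rating in history:
        quarter = (day.month - 1) // 3 + 1
        buckets[(day.year, quarter)].append(rating)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


# Hypothetical snapshots: a rating drifting between 4.75 and 5.25.
history = [
    (date(2022, 1, 15), 4.75),
    (date(2022, 2, 15), 5.25),
    (date(2022, 3, 15), 5.00),
    (date(2022, 4, 15), 5.25),
]

averages = quarterly_averages(history)
# A tournament director would announce which quarter counts, e.g. Q1 2022.
print(averages[(2022, 1)])
```

With these made-up numbers the player enters as a 5.0 for any event keyed to Q1, no matter where the live rating sits on the first day of the tournament.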
 

TennisOTM

Professional
My conclusion from this is that the guy probably isn't playing with enough stagger in 7.0, where he could potentially have an advantage.

My 8.0 mixed partner who I've played the most matches with (I believe with 10-0 record over the years) has played a ton of matches in 3.5 ladies dubs, 7.0 mixed, and 8.0 mixed.
I looked up her records over the 5-year period where she played with me (100+ total matches).

Her average TR match ratings:
3.5 ladies dubs: 3.14 (60 matches)
All mixed: 3.18 (46 matches)
7.0 mixed: 3.19 (20 matches)
8.0 mixed: 3.18 (26 matches)
8.0 mixed with me: 3.24 (5 matches showing a TR rating)

I'm very strict about enforcing her net positioning (having to remind her a few times a match not to back up), so that would explain why she outperforms her mixed average when she plays with me.
Ladies' dubs doesn't have an unbalanced pairing, so it's not surprising that she performs worse than in mixed (statistically significant enough that std error bars don't overlap).

Interesting data, but those differences are pretty tiny. Even if the difference you noted between 3.14 ladies and 3.18 mixed is statistically significant, that's a pretty small magnitude of difference. The difference between 3.18 in all mixed matches vs. 3.24 with you is also pretty small.

I'm willing to admit there could be something there, but I'm guessing you'd find similar differences with any partner pairings who mesh well. For example the Bryan brothers probably would've had higher individual ratings from matches they played together vs. ones they played with other random partners. In other words, this result is probably not unique to the scenario of extreme skill-gap partners with a good take-over-the-court player.
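For anyone who wants to check the size of those gaps, here is a quick sketch using only the averages and match counts listed in the quoted post. Standard errors can't be reproduced from these numbers because the per-match ratings aren't given, so this only looks at magnitudes and at whether the two mixed levels are consistent with the "All mixed" line.

```python
# Averages and match counts as listed in the quoted post.
mixed_70 = (3.19, 20)
mixed_80 = (3.18, 26)
ladies = 3.14
all_mixed = 3.18
with_partner = 3.24

# Match-weighted mean of the two mixed levels; should land near "All mixed".
total_matches = mixed_70[1] + mixed_80[1]
weighted_mixed = (mixed_70[0] * mixed_70[1] + mixed_80[0] * mixed_80[1]) / total_matches

# Size of the two differences being discussed.
gap_ladies_vs_mixed = all_mixed - ladies
gap_mixed_vs_partner = with_partner - all_mixed

print(total_matches, round(weighted_mixed, 2))
print(round(gap_ladies_vs_mixed, 2), round(gap_mixed_vs_partner, 2))
```

The weighted mean rounds to the 3.18 listed for all mixed, and the two gaps are about 0.04 and 0.06 rating points, the second one resting on only 5 matches.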
 

travlerajm

Talk Tennis Guru
Interesting data, but those differences are pretty tiny. Even if the difference you noted between 3.14 ladies and 3.18 mixed is statistically significant, that's a pretty small magnitude of difference. The difference between 3.18 in all mixed matches vs. 3.24 with you is also pretty small.

I'm willing to admit there could be something there, but I'm guessing you'd find similar differences with any partner pairings who mesh well. For example the Bryan brothers probably would've had higher individual ratings from matches they played together vs. ones they played with other random partners. In other words, this result is probably not unique to the scenario of extreme skill-gap partners with a good take-over-the-court player.
I will try to find other examples randomly sampled.

My hypothesis is that 3.5 women who play many usta matches in both ladies’ and mixed 8.0 will have higher rating in mixed. The same will be true of 4.5 men who play many matches in both men’s doubles and mixed.
 

Moon Shooter

Hall of Fame
I will try to find other examples randomly sampled.

My hypothesis is that 3.5 women who play many usta matches in both ladies’ and mixed 8.0 will have higher rating in mixed. The same will be true of 4.5 men who play many matches in both men’s doubles and mixed.

Travlerajm you have only played 5 matches with her in the past 5 years and you are trying to draw generalized conclusions from that?

This combined with your view that UTR has a bug because your rating bounces around after you played two matches, makes me think you are not understanding how important data is to the accuracy of these rating systems.

There are thousands of reasons that could explain what you see with these tiny sample sizes. You are trying to catch a fart in a whirlwind.

Schmke likely tried to find a much larger sample size or used some sort of data dump to see that unbalanced pairs with higher rated men do better at the 7.0 and 8.0 level. I think the results are adequately explained by:

1) it is easier to find a single good 4.0 or 4.5 male and pair them with a single 3.0 or 3.5 female than it is to find 4-6 3.5 (or 4.0) players that are all at the top of their level.

and

2) Each ntrp male rating point up to the top of 4.5 covers more range of skill than each female ntrp rating point. This is demonstrated by the UTR chart. So you are getting more skill per hundredth of an ntrp point when you spend them on the male than you would when you spend them on the female. This changes when you hit 9.0 mixed where I think the 5.0 female with a 4.0 male seems at least as good if not slightly better.
 

Max G.

Legend
There are a few potential solutions to this issue (for example I think UTR could give you a quarterly rating that is the average of your rating the prior quarter, and the tournament director could say which quarter they are using)

You're slowly reinventing all the things you don't like about USTA NTRP :)

It's a bad thing for planning when ratings can change real fast, because then the day before (or the week before) a tournament or league, you have no idea if you're eligible. So you can take the ratings and average them over a long period and then use them for a while. You suggested quarters, USTA does 1 year, but the idea's the same.
 
For UTR, changing ratings don't matter much, for one thing people aren't trying to sandbag, event organizers can do what they want based on their knowledge of players so no one will be "disqualified" if something extreme happens (good luck finding a huge group of players whose UTR changes more than .5 points in a week). Whatever major problems UTR has, they don't include sandbagging and fluctuating ratings for putting on events. The UTR events are running very smoothly.
 

travlerajm

Talk Tennis Guru
My current UTR page (as of today) has more evidence that something is amiss in the UTR algorithm.

My UTR is based on two mixed matches. In the first match, my partner and I won exactly the same number of games as our opponents, who had a combined UTR of 11.2. My partner that match has a UTR of 3.4. So my expected match rating for that match would be about 7.8 to balance the combine ratings on each side.

My second match, my opponents have a combined UTR of 9.8, and my partner 4.1. We won twice as many games as our opponents in a lop-sided win, so my expected match rating would have to be significantly higher than the 5.7 difference, probably in the 8 range.

Yet UTR says I’m 5.2 based on these matches. Does not compute. Of course, I won’t be a 5 for long. I could be an 8 tomorrow.
My incongruent rating from second match has now been downgraded even further, to a UTR 4. But to keep things in line when my partner was a 3.5C and my female opponent was a 4.5C, UTR had to knock the male opponent all the way down to a UTR 3.0. Poor guy. He’s a strong 4.0 ntrp. Maybe his UTR luck will be better tomorrow.
 
My incongruent rating from second match has now been downgraded even further, to a UTR 4. But to keep things in line when my partner was a 3.5C and my female opponent was a 4.5C, UTR had to knock the male opponent all the way down to a UTR 3.0. Poor guy. He’s a strong 4.0 ntrp. Maybe his UTR luck will be better tomorrow.
My advice is still to ignore UTR, I don't think it applies to you TJ. Unless you've been trying to enter UTR singles events every weekend and getting denied the level you want.
 

travlerajm

Talk Tennis Guru
My advice is still to ignore UTR, I don't think it applies to you TJ. Unless you've been trying to enter UTR singles events every weekend and getting denied the level you want.
I like the concept of UTR, but wish they would fix the bugs! My pro client also has UTR that is jumping around like a Mexican jumping bean because the algorithm can’t seem to handle outlier results, despite him being “100% verified.”
 
I like the concept of UTR, but wish they would fix the bugs! My pro client also has UTR that is jumping around like a Mexican jumping bean because the algorithm can’t seem to handle outlier results, despite him being “100% verified.”
I understand, I guess in your world it isn't working, but like I said, what's it stopping you from doing in tennis? For all the thousands of people entering and playing UTR matches every week, it's working to perfection.
 

Max G.

Legend
For UTR, changing ratings don't matter much, for one thing people aren't trying to sandbag, event organizers can do what they want based on their knowledge of players so no one will be "disqualified" if something extreme happens (good luck finding a huge group of players whose UTR changes more than .5 points in a week). Whatever major problems UTR has, they don't include sandbagging and fluctuating ratings for putting on events. The UTR events are running very smoothly.

My experience with UTR events was that they just handwave the ratings away when they're not there. I played in a few UTR leagues (one team league, a few flex singles leagues) and in both, the listed level seemed like a suggestion at best - there were people in the league outside of it, the coordinator assigned unranked people to levels just by... who knows how...

Which works out fine, actually!

The rating system is there to support some particular use case. It's not there in a vacuum - it's part of the whole organizing system that's trying to set people up with matches in particular ways.
 

Moon Shooter

Hall of Fame
You're slowly reinventing all the things you don't like about USTA NTRP :)

It's a bad thing for planning when ratings can change real fast, because then the day before (or the week before) a tournament or league, you have no idea if you're eligible. So you can take the ratings and average them over a long period and then use them for a while. You suggested quarters, USTA does 1 year, but the idea's the same.


The tournament can pick a rating date that is well in advance of their event. There is no reason they have to go by the rating that was just published yesterday. Chess tournaments simply say which published rating date they use. It is not hard.

USTA ignores so much data you could have played 50 matches in the past 3 years and USTA will ignore them all except 3 matches you played 3 years ago. So to say USTA does this every year is only part of the story.
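The "pick a rating date" approach is easy to sketch: keep a dated list of published ratings and look up the last one on or before the date the event announces. The history below is hypothetical, and `rating_as_of` is an illustrative helper, not anything UTR or a chess federation provides.

```python
from bisect import bisect_right
from datetime import date


def rating_as_of(published, cutoff):
    """Return the most recent published rating on or before `cutoff`.

    `published` is a list of (date, rating) pairs sorted by date.
    Returns None if nothing was published by the cutoff.
    """
    dates = [day for day, _ in published]
    index = bisect_right(dates, cutoff)
    return published[index - 1][1] if index else None


# Hypothetical rating history for one player.
published = [
    (date(2022, 3, 1), 4.75),
    (date(2022, 4, 1), 5.00),
    (date(2022, 5, 1), 5.25),
]

# The event announces April 15 as its rating date, so the May jump is ignored.
print(rating_as_of(published, date(2022, 4, 15)))
```

Whatever the rating does after the announced date simply doesn't matter for eligibility, which is the whole point of fixing the date in advance.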
 

Moon Shooter

Hall of Fame
I like the concept of UTR, but wish they would fix the bugs! My pro client also has UTR that is jumping around like a Mexican jumping bean because the algorithm can’t seem to handle outlier results, despite him being “100% verified.”

No rating system can handle outlier events well. You need more data. I know you want to insist there are bugs based on 3 matches or something but that is just not how it works.


My incongruent rating from second match has now been downgraded even further, to a UTR 4. But to keep things in line when my partner was a 3.5C and my female opponent was a 4.5C, UTR had to knock the male opponent all the way down to a UTR 3.0. Poor guy. He’s a strong 4.0 ntrp. Maybe his UTR luck will be better tomorrow.

Choosing a NTRP 4.5 female in an 8.0 mixed match is throwing away rating points that would be better spent on the male part of the team.

As far as a "strong 4.0 ntrp" having a "UTR 3.0", I think you are likely just rounding down 3.9X or something, and/or it is likely he has not played many UTR matches in that category (singles or doubles). Unless you paid for the UTR service they just give you the first number, but there is a big difference between a 3.0X UTR and a 3.99 UTR player.

Because if he is actually a 3.0X UTR after more than say 7 or 8 matches then he is not a strong 4.0 ntrp. Maybe he used to be but he is not playing like one. No way.
 

travlerajm

Talk Tennis Guru
No rating system can handle outlier events well. You need more data. I know you want to insist there are bugs based on 3 matches or something but that is just not how it works.




Choosing a NTRP 4.5 female in an 8.0 mixed match is throwing away rating points that would be better spent on the male part of the team.

As far as a "strong 4.0 ntrp" having a "UTR 3.0", I think you are likely just rounding down 3.9X or something, and/or it is likely he has not played many UTR matches in that category (singles or doubles). Unless you paid for the UTR service they just give you the first number, but there is a big difference between a 3.0X UTR and a 3.99 UTR player.

Because if he is actually a 3.0X UTR after more than say 7 or 8 matches then he is not a strong 4.0 ntrp. Maybe he used to be but he is not playing like one. No way.
The 3.0 UTR guy is back up to a UTR 7.1 in singles today. And my UTR is back up to UTR 8 today, after its usual swing through the ladies 3.5 ntrp rating range.

I really do not understand why some people turn a blind eye to obvious bugs in the UTR algorithm and blame it on small sample size. A properly designed algorithm only needs 2-3 matches (including the score) to give a stable and reasonably accurate rating. UTR obviously can’t do that. And since the majority of league players only play a few matches per year, and UTR only uses matches going back 12 months, UTR is only able to provide stable ratings for a minority of league players.
 

Moon Shooter

Hall of Fame
The 3.0 UTR guy is back up to a UTR 7.1 in singles today. And my UTR is back up to UTR 8 today, after its usual swing through the ladies 3.5 ntrp rating range.

I really do not understand why some people turn a blind eye to obvious bugs in the UTR algorithm and blame it on small sample size. A properly designed algorithm only needs 2-3 matches (including the score) to give a stable and reasonably accurate rating. UTR obviously can’t do that. And since the majority of league players only play a few matches per year, and UTR only uses matches going back 12 months, UTR is only able to provide stable ratings for a minority of league players.

Ok what should my NTRP rating be after I played 3 matches and lost them all 0-6 0-6?

Can you give me a ball park or do you need more information? Like do you need to have a good idea what my opponents ratings were and what my partners ratings were? Of course you do. Even then you might be way off because of the scores of the matches.

I agree that UTR is foolish for drawing a hard line at 12 months.

Stable ratings are not necessarily accurate ratings. I just played someone whose rating has remained stable at 3.0S for years. But he has recently been playing mixed doubles and has a UTR of a bit over 5.0, and I think that is much more accurate than his stable rating of 3.0S.
 

travlerajm

Talk Tennis Guru
Ok what should my NTRP rating be after I played 3 matches and lost them all 0-6 0-6?

Can you give me a ball park or do you need more information? Like do you need to have a good idea what my opponents ratings were and what my partners ratings were? Of course you do. Even then you might be way off because of the scores of the matches.

I agree that UTR is foolish for drawing a hard line at 12 months.

Stable ratings are not necessarily accurate ratings. I just played someone whose rating has remained stable at 3.0S for years. But he has recently been playing mixed doubles and has a UTR of a bit over 5.0, and I think that is much more accurate than his stable rating of 3.0S.
If a player plays 3 matches in a 4.0 league and loses all 3 6-0, 6-0, I can tell you definitely that the player is neither a 4.0 nor a 3.5 player.
 

silentkman

Hall of Fame
Ok what should my NTRP rating be after I played 3 matches and lost them all 0-6 0-6?

Can you give me a ball park or do you need more information? Like do you need to have a good idea what my opponents ratings were and what my partners ratings were? Of course you do. Even then you might be way off because of the scores of the matches.

I agree that UTR is foolish for drawing a hard line at 12 months.

Stable ratings are not necessarily accurate ratings. I just played someone whose rating has remained stable at 3.0S for years. But he has recently been playing mixed doubles and has a UTR of a bit over 5.0, and I think that is much more accurate than his stable rating of 3.0S.

Has anyone contacted USTA about this alleged issue and offered a viable solution? I'm not going back to read everything from the past.
 

Moon Shooter

Hall of Fame
If a player plays 3 matches in a 4.0 league and loses all 3 6-0, 6-0, I can tell you definitely that the player is neither a 4.0 nor a 3.5 player.

First of all I didn't say it was a 4.0 league. UTR is an international system that doesn't take national groupings into account. Plus we are talking about mixed doubles so the ratings of others may be wildly inconsistent. But even if I grant it happened in a 4.0 league with the same gender your partner could be a low level 3.5. He could even be a self rate 3.5 that is really a 3.0. So you need more information.

But let's eliminate the partner and say someone got this score playing singles in a 5.5 league. Their rating could be anything from a 1.0 to a 5.0. Just because NTRP doesn't fluctuate between NTRP 1.0 and 5.0 does not mean the algorithm is better or that the algorithm spits out a more accurate number.
 

J011yroger

Talk Tennis Guru
I've seen UTR have NTRP 4.5 men rated less than 1.0 higher than their NTRP 3.5 women partners, and have seen UTR ratings vary wildly (@travlerajm is an example) with no new matches played by the player (and likely none by anyone a few degrees of separation away), so I wouldn't go so far as to say that UTR is the gold standard and its prediction for mixed matches is correct. Data is part of it, but a stable algorithm is a huge part of it too.

It's looking like UTR does not count the vast majority of my mixed matches towards my rating.

J
 

Moon Shooter

Hall of Fame
No like when I filter by matches that count towards my rating the 40+ mixed ones disappear.

J

Utr did not count any of my mixed matches at sectionals. But neither did tennis link. Tennis link said my team didn’t qualify for post season play. But utr did count my other usta matches including my 40 and over mixed matches.
 

Moon Shooter

Hall of Fame
For that there’s always trusty TR.

TR apes USTA so it will not give a match or mixed rating for any mixed combo matches (leagues where each pair's combined rating has to end in .5). UTR will include these combo matches if they show up in Tennis Link, and most do. But Tennis Link has no record of our mixed combo sectionals championship. I think it is pathetic that USTA doesn't even record results for mixed doubles sectionals. It just shows how little they care about adult rec tennis.

I am not sure what issue J011yroger is having.
 

travlerajm

Talk Tennis Guru
TR apes USTA so it will not give a match or mixed rating for any mixed combo matches (leagues where each pair's combined rating has to end in .5).
TR gives separate mixed and same-gender match ratings and averaged ratings.
 

TennisOTM

Professional
TR apes USTA so it will not give a match or mixed rating for any mixed combo matches (leagues where each pair's combined rating has to end in .5). UTR will include these combo matches if they show up in Tennis Link, and most do. But Tennis Link has no record of our mixed combo sectionals championship. I think it is pathetic that USTA doesn't even record results for mixed doubles sectionals. It just shows how little they care about adult rec tennis.

I am not sure what issue J011yroger is having.

Intermountain has a Mixed X.5 league, and all the match results, including sectional championships, are on Tennislink, Tennisrecord, and UTR.
 

Moon Shooter

Hall of Fame
Intermountain has a Mixed X.5 league, and all the match results, including sectional championships, are on Tennislink, Tennisrecord, and UTR.

I don't doubt it. But my 6.5 ******* sectionals that was played in Milwaukee on April 23rd and 24th is completely MIA. Tennis link just says we did not qualify for post season play.

The people that ran sectionals seemed not to follow the rules and just said "whatever" to my captain's objections on other issues as well, including allowing certain second place teams to qualify even though their state was already represented by a first place team. You have to win state to play in sectionals - unless we choose to let you play in sectionals anyway!
 

schmke

Legend
I don't doubt it. But my 6.5 ******* sectionals that was played in Milwaukee on April 23rd and 24th is completely MIA. Tennis link just says we did not qualify for post season play.

The people that ran sectionals seemed not to follow the rules and just said "whatever" to my captain's objections on other issues as well, including allowing certain second place teams to qualify even though their state was already represented by a first place team. You have to win state to play in sectionals - unless we choose to let you play in sectionals anyway!
I don't know what the qualification rules are in your section for this league, but some areas/districts will advance the winner and a wildcard to the next phase of playoffs, so it isn't always just the flight/local-playoff winner that advances. This is often done to get more teams at Sectionals to fill out the flights and give more matches or even out the flights. Sometimes, where wildcards will be given is known in advance and documented, other times it is (or appears to be) more ad hoc and reacting to how many teams sign up to go to Sectionals and if there is an odd number.

For example, in PNW, our Sectionals are generally two flights of four teams playing round-robin with the flight winners playing a final to advance to Nationals. We don't have eight districts to fill all those slots, so wildcards are given out to second place teams from the districts with the most teams. This is generally known in advance so at local playoffs all the teams know finishing second there will advance to Sectionals.
 

Moon Shooter

Hall of Fame
I don't know what the qualification rules are in your section for this league, but some areas/districts will advance the winner and a wildcard to the next phase of playoffs, so it isn't always just the flight/local-playoff winner that advances. This is often done to get more teams at Sectionals to fill out the flights and give more matches or even out the flights. Sometimes, where wildcards will be given is known in advance and documented, other times it is (or appears to be) more ad hoc and reacting to how many teams sign up to go to Sectionals and if there is an odd number.

Yes we had an "ad hoc" variety of this very thing. The rules did not seem to allow it but they did it anyway. Sectionals was going to be Illinois and Wisconsin only because the top Indiana team said they couldn't make it. It is unclear if they even asked the second place Indiana team but instead just announced the second place Illinois team would play as well. And then they added the second place Wisconsin team.

Our captain objected and we were almost not even going to go.

I would have much preferred just playing 3 matches against Wisconsin's top team or played Indiana's second place team which seemed to be what the rules said could happen. No one ever explained why we couldn't do that.

We also didn't even know the date we were playing until about 5 weeks before, and the block of rooms was already filled up. The ad hoc wild card team bought all the rooms up (for $120/night) and offered to sell them back to us for $450 per room per night!

For example, in PNW, our Sectionals are generally two flights of four teams playing round-robin with the flight winners playing a final to advance to Nationals. We don't have eight districts to fill all those slots, so wildcards are given out to second place teams from the districts with the most teams. This is generally known in advance so at local playoffs all the teams know finishing second there will advance to Sectionals.

There were no nationals for this and none of this was in any sort of rules and the rules that existed seem to imply this wouldn't happen. It was all based on the tournament director's say so.
 

Moon Shooter

Hall of Fame
TR gives separate mixed and same-gender match ratings and averaged ratings.

Do you see what match ratings people get when they play mixed doubles that end in a .5? So for example mixed 6.5, 7.5, or 8.5?

It never listed it when I looked. I thought it was because USTA does not include any of those leagues even in your mixed exclusive rating, so TR also doesn't use those matches in their mixed exclusive ratings calculations.
 