2023 Match Result Prediction Competition

TennisOTM

Professional
Competition update 4:

12 more doubles matches in the books, and we have a new leader!

Here are the standings and "W-L" records for each rating system so far:

UTR: 29-10 (74%)
WTN: 32-12 (73%)
TLS: 16-6 (73%)
TR: 24-12 (67%)

WTN blew its early lead by going just 7-5 in predicting matches this week, while UTR went 8-3 and took over first place. Meanwhile TLS and TR improved their win percentages and closed the gap. We have a tight battle - it's anyone's game.
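For anyone who wants to check the percentages, here is a minimal sketch that turns the W-L records above into win percentages and sorts them. The names `records` and `win_pct` are made up for illustration; the numbers are just the update 4 standings.

```python
# W-L records copied from competition update 4.
records = {"UTR": (29, 10), "WTN": (32, 12), "TLS": (16, 6), "TR": (24, 12)}


def win_pct(wins, losses):
    """Fraction of predicted matches that were predicted correctly."""
    return wins / (wins + losses)


# Sort from best to worst win percentage (ties keep their original order).
standings = sorted(records.items(), key=lambda kv: win_pct(*kv[1]), reverse=True)

for name, (wins, losses) in standings:
    print(f"{name}: {wins}-{losses} ({win_pct(wins, losses):.1%})")
```

Printing one decimal place shows how small the gap at the top is: UTR leads WTN and TLS by less than two percentage points.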
 

schmke

Legend
And my ratings went 8-2 so now 27-10 (73%).

It is interesting that UTR and WTN have predicted a few more matches than I have, TR is just one behind me, but TLS has predicted far fewer. The best so far being right at 73%/74% is very in line with my earlier post on how often generally the higher rated pair wins.
 

TennisOTM

Professional
I think TLS failed to record a chunk of 2022 matches in this area, so they have more unrated players than they should.

WTN has the most predictions b/c there are a bunch of players with years-old match data, and WTN does not let their rating expire like the others. Interesting that this does not seem to have hurt them much so far - the predictions in matches involving those "old" ratings are doing about as well as the others.
 

penpal

Semi-Pro
Good stuff. I can imagine team captains using this sort of information as they think through setting their lineups and trying to figure out how the opposing captain will set his lineup. Could add excitement to a match when a team knows the statistical likelihood of outcomes (e.g., 'Hey guys, we were only favored to win 1 of the 4 matches, but our underdogs on Court 10 just pulled an upset.')

Then again, I haven't been involved in USTA leagues in a while, so maybe this has already been happening.
 

Moon Shooter

Hall of Fame
I would not say that the WTN is necessarily "worse" when it is willing to make predictions on more matches with only a 1% decrease in accuracy. UTR is basically saying it has "absolutely no idea" on matches where WTN is predicting matches with a decent accuracy.

Personally, the problem I have with WTN is that I had a match where the name of one of my opponents obviously did not register, so it listed him with something like a ridiculous 31.9. So the issue had nothing to do with the algorithm but rather with how USTA is adding USTA players to the WTN pool. But this USTA-to-WTN issue meant my opponent was off by about 5 points, and that really threw my rating into the basement.

Edit: it is interesting how far off TR is when it seems to try to mimic USTA.
 

TennisOTM

Professional
Yeah I expected WTN to be worse because of some of their outlier nonsense ratings and not because of their ratings based on older data. I do think UTR would benefit by including match results from more than one year ago for those who play infrequently.

TR is in last place but not far off - could be a random-noise difference at this point. But one disadvantage they (and USTA) have is combining singles and doubles into one rating - I think they've had one or two wrong predictions so far in doubles matches involving players who are usually singles specialists.
 

Chalkdust

Professional
Considering that:
1. Rec players often have significant variation in level of play from day to day, and that
2. Styles make fights, and that
3. In doubles, partner fit is an important factor,
I would expect there to be a significant number of upsets.
Maybe 30% or so overall, perhaps more so when the gap in average level (i.e. over multiple matches) between the opponents is relatively close, much less so when the gap is large.
So any rating system that correctly predicts more than around 70% of results over time between opponents within, say, 1/2 a SD of each other is, IMO, also indicative of inaccurate ratings.
 

schmke

Legend
Yeah I expected WTN to be worse because of some of their outlier nonsense ratings and not because of their ratings based on older data. I do think UTR would benefit by including match results from more than one year ago for those who play infrequently.

TR is in last place but not far off - could be a random-noise difference at this point. But one disadvantage they (and USTA) have is combining singles and doubles into one rating - I think they've had one or two wrong predictions so far in doubles matches involving players who are usually singles specialists.
My ratings suffer from the same limitation TR does in that I'm using my ratings that mimic the USTA's dynamic rating and have still managed to more or less match UTR and WTN.

Although this discussion may spur me to brush off my singles/doubles specific ratings to see if they predict a higher percentage of matches correctly.
 

TennisOTM

Professional
Competition update 5:

12 more doubles matches this week. Here are the standings and "W-L" records for each rating system so far:

UTR: 34-12 (74%)
TLS: 18-7 (72%)
TR: 28-13 (68%)
WTN: 36-18 (67%)

WTN has fallen from first to last place over the last two updates. They went a disastrous 4-6 in predictions this week, while no one else had more than 2 losses.

UTR went 6-2 and held on to first place - I have a feeling they'll be holding the top spot for quite a while.
 

schmke

Legend
And my ratings went 6-1 this week, so now 33-11 (75%).
 

Moon Shooter

Hall of Fame
My ratings suffer from the same limitation TR does in that I'm using my ratings that mimic the USTA's dynamic rating and have still managed to more or less match UTR and WTN.

Although this discussion may spur me to brush off my singles/doubles specific ratings to see if they predict a higher percentage of matches correctly.

Separating singles and doubles is only an advantage if the player has enough matches in each type of play. If someone has a well-established UTR of 9.53 in either doubles or singles after, say, 40 matches, that is valuable predictive information on how they will do in the other type of game. Of course there can be some exceptions, but generally it is not like you have no idea if the player is a 1 UTR or a 16.5. USTA rates so few matches, and has so few singles matches generally, that they need to use both. I think a graduated approach would be best: as a player gets enough matches in both types of play, their rating in the other type of play will have less influence.
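A rough sketch of what such a graduated approach could look like is below. To be clear, this is not how UTR, WTN, or USTA actually works. The function name `blended_rating` and the constant `k` are hypothetical, and the weighting formula is just one simple choice.

```python
def blended_rating(own_rating, own_matches, other_rating, other_matches, k=10.0):
    """Estimate a rating for one discipline (say singles) by blending it with
    the player's rating from the other discipline (say doubles).

    k is a hypothetical tuning constant: the larger it is, the more matches a
    player needs in a discipline before that discipline's own rating dominates.
    """
    if own_matches == 0 and other_matches == 0:
        return None  # nothing to go on
    if other_matches == 0:
        return own_rating
    if own_matches == 0:
        return other_rating
    # Weight on the discipline-specific rating grows toward 1 with more matches.
    w_own = own_matches / (own_matches + k)
    return w_own * own_rating + (1.0 - w_own) * other_rating


# Made-up player: 40 doubles matches at 9.53, only 2 singles matches at 8.0.
singles_estimate = blended_rating(8.0, 2, 9.53, 40)
print(round(singles_estimate, 3))
```

With only 2 singles matches, the estimate stays close to the doubles rating; after dozens of singles matches it would sit close to the singles rating instead.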
 

Moon Shooter

Hall of Fame
What happens when someone has or had a “player” in their results? Are you still counting that for wtn?
 

TennisOTM

Professional
I don't think I found any of those within this particular group of guys. I have seen "player" in match results before but none of them was in this group of a few hundred players I'm following.
 

Moon Shooter

Hall of Fame
Even if there is a “player” in their results it can throw their rating off considerably.
 

TennisOTM

Professional
It could be one of the reasons for some of their questionable ratings. Though could it be possible that each person with a "player" label actually does have a correct USTA ID# associated with it behind the scenes in the results tracking? In that case, the problem is only that they failed to link an ID to a human name for the website display, but the ratings would not be affected.
 

Moon Shooter

Hall of Fame
At least in some cases that is clearly not the case. I checked one 3.5 player and his rating remained so far out of the norm and insensitive to the matches played. But it might not always be the case. WTN does not allow you to see any past results for “players”. So it is hard to know if all the matches are included.
 
Also beware of a TR issue where sometimes they get initial ratings for Appeal players wrong. For example, a guy might appeal down from 4.5 to 4.0, so he should start at 4.00, but TR will start him at 3.51 as if he appealed up to 4.0 from 3.5. If that happens, it cocks up their ratings for everyone who plays that guy.
 

TennisOTM

Professional
Whoa yeah that could be a big problem for them. I think there's (at least) one guy like that on my tracking list. He was a 4.5 playing men's league in 2021, played only a couple of mixed doubles matches last year (as a 4.5), and now he is a 4.0A, but TR has him rated at 3.50. Maybe he's a typical case of the ones TR errs on: guys who appeal a computer rating that is more than a year old. For most appeal players they seem to get it right.

He has not played yet this year, but I'll keep an eye on him to see what happens with his TR match ratings if he starts playing 4.0.
 

Moon Shooter

Hall of Fame
It may just be a placeholder in that case, since the person has no actual dynamic rating yet.
 

TennisOTM

Professional
Competition update 6:

11 more doubles matches this week, and we have a new leader! Here are the standings and "W-L" records for each rating system so far:

TLS: 22-8 (73%)
UTR: 39-16 (71%)
TR: 34-15 (69%)
WTN: 41-24 (63%)

Tennis League Stats (TLS) holds the top spot for the first time, and UTR falls to second after having a rough week going just 5-4.

Some may want to put an asterisk by TLS because they have predicted so few matches compared to the others. However, if I look only at the 40 matches that were predicted by all four systems, TLS has the best record for that subset as well. It's unfortunate that they are missing so many prior matches in their data, as so far it seems they have a pretty good rating algorithm.
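The common-subset comparison can be sketched like this. The per-match data below is a made-up toy example, not the real competition data, and `tally_records` is a hypothetical helper name.

```python
# Toy data, invented for illustration only.
# True = predicted correctly, False = predicted wrong, None = no prediction.
results = [
    {"UTR": True, "WTN": True, "TLS": True, "TR": True},
    {"UTR": True, "WTN": False, "TLS": None, "TR": True},
    {"UTR": False, "WTN": True, "TLS": False, "TR": False},
    {"UTR": None, "WTN": True, "TLS": None, "TR": True},
]


def tally_records(matches):
    """Return {system: (wins, losses)}, skipping matches a system did not predict."""
    tally = {}
    for match in matches:
        for system, outcome in match.items():
            if outcome is None:
                continue
            wins, losses = tally.get(system, (0, 0))
            tally[system] = (wins + int(outcome), losses + int(not outcome))
    return tally


# Keep only the matches that every system offered a prediction for.
common = [m for m in results if all(v is not None for v in m.values())]

print("all matches:  ", tally_records(results))
print("common subset:", tally_records(common))
```

Restricting to the common subset puts every system on the same set of matches, so a system cannot look better just by abstaining from the hard ones.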

WTN continues to plummet, having a losing record for the second week in a row. Since going 11-0 in week 1, they've been not much better than a coinflip.
 

TennisOTM

Professional
Competition update 7:

11 more doubles matches this week, and we have yet another lead change! Here are the standings and "W-L" records for each rating system so far:

UTR: 44-18 (71.0%)
TLS: 24-10 (70.6%)
TR: 36-18 (67%)
WTN: 47-28 (63%)

UTR crept back into the top spot, going 5-2 in a week with some hard-to-predict match results, including another upset win of two 3.5's against two 4.0's. Strangely, WTN was the only contender to predict that result correctly, but still went just 6-4 overall. Sorta seems like they're just throwing darts with a blindfold on.

Two more weeks left of this winter doubles-only league. Then we'll head into 18+ season and finally get some singles matches.
 

schmke

Legend
For this weekend's matches, my ratings went 3-3, so now 42-17 (71.2%).

 

penpal

Semi-Pro
The biggest weakness of your rating system is that it does not have a handy dandy three-letter acronym ;).

I humbly suggest you use your blog title to create an acronym, which would give you a SCR rating. When pronouncing it verbally, I would recommend you call it a score, as in player's score - better than ess-see-are.

You're welcome for my unsolicited and probably unwanted advice :D
 

TennisOTM

Professional
Competition update 8:

12 more doubles matches this week, and we have a lead change for the third week in a row. Here are the standings and "W-L" records for each rating system so far:

TLS: 27-10 (73%)
UTR: 50-20 (71%)
TR: 42-19 (69%)
WTN: 55-31 (64%)

TLS moved back into first place - they offered predictions for only 3 of the 12 matches, which happened to be easy ones that every system predicted correctly, and they abstained from the harder ones. TR tightened the gap between 2nd and 3rd place with a pretty good week going 6-1.

If I look only at the matches that all four systems predicted (which are the same 37 matches that TLS predicted), UTR is tied with TLS at 27-10, while TR is 26-11 and WTN is 23-14.

One more week left of this winter doubles-only league. Then we'll head into our 4.0 18+ league, which has 14 teams this year, so there will be up to 35 matches per week to predict from there. I'll also use match results from 8.0 55+ league, which starts in about a month.
 

Moon Shooter

Hall of Fame
So WTN has a slightly lower percentage at the matches the others predicted than its overall prediction level.

The other systems only predict on average 56.75 matches and don't even touch the additional 29.25 matches WTN rates. In other words they are only rating about 66% of the matches WTN rates. And this improves their accuracy from WTN's 64% to their average accuracy of 71%.

In my opinion (and this is clearly just opinion) that is not clear proof they are better rating systems. Especially since the pool you are choosing is 4.0 men players which I think favors TR and UTR as well as USTA - which I believe Schmke closely mirrors.

If most of the matches are played in men's same gender league play then I would predict that would favor those systems as well. Do you have any sense of accuracy when we look at mixed doubles results or are you not using mixed doubles results?
 

schmke

Legend
I'm woefully late this week, got busy.

My ratings went 8-2 so now 50-19 (72%)
 

TennisOTM

Professional
Competition update 9:

11 more doubles matches this week to round out our first complete league of the year. We have yet another lead change for the fourth week in a row.

Here are the standings and "W-L" records for each rating system so far:

UTR: 55-26 (68%)
TLS: 29-14 (67%)
TR: 47-24 (66%)
WTN: 60-37 (62%)

UTR and TLS keep leapfrogging each other - this week UTR has the lead despite going just 5-6 on the week. It was a tough week for everyone as there were multiple upsets of 3.5 players winning matches against 4.0 opponents. TR had the best week of anyone, going 5-5, and they are threatening to break into the top two.

WTN is hanging in there. At the current sample size we cannot conclude that they are clearly worse than any of the others with statistical confidence.
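As a rough illustration of that point, here is a sketch of a two-proportion z-test comparing UTR (55-26) with WTN (60-37). One big assumption: this test treats the two records as independent samples, which they are not, since the systems are mostly predicting the same matches. A paired comparison on per-match results would be more appropriate, so take this only as a ballpark check. The function name `two_proportion_z` is made up.

```python
from math import erfc, sqrt


def two_proportion_z(w1, l1, w2, l2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value).

    Assumes independent samples, which is only an approximation here.
    """
    n1, n2 = w1 + l1, w2 + l2
    p1, p2 = w1 / n1, w2 / n2
    pooled = (w1 + w2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided normal tail probability
    return z, p_value


# UTR vs. WTN records from competition update 9.
z, p_value = two_proportion_z(55, 26, 60, 37)
print(f"z = {z:.2f}, two-sided p = {p_value:.2f}")
```

The z value comes out below 1 and the p-value is far above the usual 0.05 cutoff, so a gap of this size at this sample size could easily be noise.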

The sample size will start getting much larger on April 13 when 18+ league begins. Will anyone be able to hold on to 1st place for more than a week? Will WTN make a run back into contention or fall into statistical irrelevance? Stay tuned...
 

schmke

Legend
It was a tough week with some close upsets by my ratings too. 6-5 week and now 56-24 (70%)
 

TennisOTM

Professional
Competition update 10:

18+ league has started in this competition cohort, and 6 team matches are in the books. So 30 more matches (18 doubles and 12 singles) were played, although several of those involved brand new players with no prior rating.

Here are the updated standings and "W-L" records for each rating system so far:

TLS: 38-16 (70%)
UTR: 63-33 (66%)
TR: 55-34 (62%)
WTN: 68-51 (57%)

TLS surges back into first place as the only competitor to improve on winning percentage this week, going 9-2 over the 11 matches they predicted, including 4-1 in singles matches. UTR fell from the top spot by going just 8-7 on the week (4-2 in singles).

TR and WTN each had losing records on the week, with WTN the worst going an abysmal 8-14 (3-7 in singles). Ouch. Is this the beginning of the end for WTN? More next week!
 

schmke

Legend
My ratings predicted 18 matches going 5-4 in singles and 6-3 in doubles for 11-7 overall. Interesting that TLS is predicting so few matches and even UTR predicted fewer than my ratings.

For the season I'm at 67-31 (68%).
 

TennisOTM

Professional
There were some singles players who only had a UTR doubles rating - I chose not to apply their doubles rating to singles match prediction. Because your system would use doubles-based ratings to predict a singles match, I think that explains why you predicted more matches than UTR.
 

schmke

Legend
Good point. You'd also expect UTR and WTN to do better at this point since they have discipline specific ratings to accurately reflect when players have different abilities in each.
 

Moon Shooter

Hall of Fame
There were some singles players who only had a UTR doubles rating - I chose not to apply their doubles rating to singles match prediction. Because your system would use doubles-based ratings to predict a singles match, I think that explains why you predicted more matches than UTR.
What reliability rating do they need to have before you consider them to have a UTR or WTN rating?
Good point. You'd also expect UTR and WTN to do better at this point since they have discipline specific ratings to accurately reflect when players have different abilities in each.

It would depend on how many singles matches they and their opponents have. If they have fewer than 7 singles matches but have 25+ doubles matches, then the USTA approach is likely as good or better.
 

TennisOTM

Professional
Competition update 11:

I've entered results from seven more 18+ 4.0 team league matches into the competition. This is the first time we've had the same leader (TLS) for two consecutive weeks since updates 4 & 5.

Here are the updated standings and "W-L" records for each rating system so far (singles and doubles combined):

TLS: 48-18 (73%)
UTR: 80-34 (70%)
TR: 71-37 (66%)
WTN: 83-60 (58%)

Every competitor improved on winning percentage this week. UTR had an especially strong week, going 17-1. They are right there with TLS, especially considering that UTR is actually slightly ahead of TLS in common matches predicted.

UTR has the early lead in singles prediction at 10-2, followed by TLS (7-2), TR (11-6), and WTN (8-12).

WTN failed to take much advantage of a relatively easy week for predictions, going just 15-9. I think we are pretty close to concluding that WTN is the worst match prediction system with statistical confidence. Will they make another run to save themselves? Next update will include more 18+ 4.0 league matches and also the first 55+ 8.0 league results.
 

schmke

Legend
This week my ratings went 17-5 and are now 84-36 (70%).

I had a very good week in singles going 9-1, doubles was 8-4.

It is interesting that UTR predicted four fewer matches than I did, perhaps those are four of the five I lost :)

I like that I've predicted the most matches and still doing very well including having the most correct (84 versus 83 for WTN and 80 for UTR). TLS does lead on percentage, but they've predicted far fewer matches.
 

TennisOTM

Professional
TLS does lead on percentage, but they've predicted far fewer matches.
At this point it does look like the matches TLS predicted were an easier subset. TR and UTR both improve by 3-4 percentage points when predicting those matches only, so you'd probably be right there with them. WTN strangely does even worse predicting that subset.
 

TennisOTM

Professional
Competition update 12:

Another set of 18+ 4.0 league matches are in the books. Things seem to be stabilizing as we have the same standings for the third consecutive update.

Here are the updated standings and "W-L" records for each rating system so far (singles and doubles combined):

TLS: 62-23 (73%)
UTR: 95-41 (70%)
TR: 87-46 (65%)
WTN: 100-73 (58%)

TLS had the best week at 14-5. They continue to lag way behind in number of matches predicted, but they now also have a slim lead in the common matches predicted by all four systems. TLS also took over the overall lead in singles prediction at 13-5 (72%) on the year, so they are doing quite well by all metrics other than total matches predicted.

WTN had yet another rough week, going 17-13. They still have a losing record on the year in singles at 13-20. A coin flipper would have ~85% chance of doing better than that.
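The coin-flip figure can be checked with an exact binomial calculation. This is a minimal sketch; `prob_at_least` is a made-up helper name, and "doing better" is taken to mean getting at least 14 of the 33 singles predictions right.

```python
from math import comb


def prob_at_least(k, n, p=0.5):
    """Exact probability of at least k successes in n independent trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


wtn_wins, wtn_losses = 13, 20  # WTN's singles record on the year
n = wtn_wins + wtn_losses

# Chance that a fair coin flipper gets more than 13 of the 33 matches right.
p_better = prob_at_least(wtn_wins + 1, n)
print(f"{p_better:.3f}")
```

The result lands at roughly 0.85, in line with the ~85% quoted above.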

Have we settled into a stable 1 through 4 ranking, or will someone make a move? More next week!
 

schmke

Legend
My ratings went 20-6 this week, 9-4 in singles and 11-2 in doubles.

That puts my ratings at 104-42 (71%)
 

Moon Shooter

Hall of Fame
WTN weirdly seems to give 3.5 self-rated players about the same rating they give self-rated 3.0 players - around a 30. This wouldn't be such a big deal if they did not weight the initial rating much. But they do! As an upper 3.0 / lower 3.5 player, if I lose to one of the 3.5 self-rated guys, my rating tanks and the self-rated player's rating doesn't seem to move much.
 

TennisOTM

Professional
Competition update 13:

It has been a busy month and I've finally found time for an update. Our 18+ 4.0 league regular season is finished, and this update also includes several matches from 55+ 8.0 league.

Here are the updated standings and "W-L" records for each rating system so far in 2023 (singles and doubles combined):

UTR: 164-65 (72%)
TLS: 104-44 (70%)
TR: 156-86 (64%)
WTN: 181-122 (60%)

UTR has surged into the lead, taking over from TLS. UTR's lead over TLS is largely because of its superior performance in singles. TLS has a slim lead in doubles, but UTR is 37-13 (74%) in singles prediction vs. 26-13 (67%) for TLS.

TR and WTN have been stuck in 3rd and 4th place, respectively, since early March. They are in those spots for both doubles and singles.

To recap, I'm still using player ratings from January to generate the predictions (too much work to track all four systems dynamically). Interesting that the performances do not seem to be getting much worse even though the ratings are becoming more and more outdated every month. I think 72% for UTR is pretty impressive without the benefit of using updated ratings. Perhaps 4.0 men are just pretty stable in ability over this time frame.

Lots more to come - we'll have several 18+ district playoff matches soon and a full season of a healthy 40+ 4.0 league.
 

schmke

Legend
Since last update, my ratings went 39-12 in singles (76%) and 50-20 in doubles (71%) or 74% overall, and now for the year are 193-74 (72%).
 

Moon Shooter

Hall of Fame
Thanks for doing this, TennisOTM and schmke. It was a fun exercise and a decent sample size for 4.0 league players so I’m not sure it would be much different.

Even considering the larger number of games, WTN really did underperform my expectations. I think a rating system should be able to hit about 85 percent at the 4.0 level. I do still believe WTN benefits by including the most data, and does that better than other rating systems. However, it has some glaring errors when dealing with older players (those in their system for a long time) and newer players. Some older players have ridiculously low (good) ratings, and many new players are apparently assigned a rating at the start. As a 3.0 self-rate I was assigned a 29, but other players from this year who are self-rated 3.5s are assigned a 32. Assigning a rating is not ideal but it can work, as long as it isn’t weighted like an established rating. Unfortunately they do weight these initial ratings, which they apparently pull out of the sky, pretty heavily.

The other issue is that women and men pretty much make up two completely different pools, since USTA does not allow coed play that would actually correct for this.

I would think these two issues are what pulled WTN into the mud. But I would be interested in what others think.
 

Moon Shooter

Hall of Fame
Also, you did not include any mixed matches in this, correct? I would think that schmke's rating system, which mirrors USTA's, would do the best in 4.0 single-gender play.

I think UTR would do even better in mixed.

schmke, did you keep the ratings the same as they were in January, or did you use current dynamic ratings each week?
 

Chalkdust

Professional
I think a rating system should be able to hit about 85 percent at the 4.0 level.
Curious as to how you're determining 85% as a target.

I mean, let's imagine there is a perfect rating system. Perfect meaning it is 100% accurate in representing a player's results-based level over a time window.

What kind of predictive accuracy would be expected from such a system?

If you have two singles opponents with very close dynamic ratings (say 0.01-0.05 apart), how often might you expect the higher rated player to win? With players so close in average level, the outcome is going to be based on whoever is playing better on the day, with game styles also being a factor.

I would be surprised if even our perfect rating system is more than 60% accurate when players' ratings are that close.

On the other hand if you have singles opponents who are > 0.4 apart in dynamic rating, you would expect a much higher predictive accuracy.

So if you're going to set a target for predictive accuracy, you'd need to have targets in mind for the scenarios above, everything in between, and then also have a model for how frequent each of these scenarios is.

And that's for singles - doubles introduces more variables.

And this is all assuming we have a 'perfect' system, which we do not...
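
To make that concrete, here is a minimal sketch of the calculation: a win-probability curve as a function of rating gap, averaged over how often each gap occurs. Everything numeric in it is an assumption for illustration: the logistic curve, the `scale` value, and the `gap_mix` frequencies are made up, not fitted to any league data.

```python
import math

def win_prob(gap, scale=0.15):
    """Assumed chance the higher-rated player wins at a given rating gap.

    A logistic curve with a made-up scale; gap 0 gives a 50/50 match.
    """
    return 1.0 / (1.0 + math.exp(-gap / scale))

# Hypothetical mix of matchups: (rating gap, share of all matches)
gap_mix = [(0.03, 0.30), (0.10, 0.30), (0.20, 0.25), (0.40, 0.15)]

def expected_accuracy(mix, scale=0.15):
    """Overall hit rate of always picking the higher-rated player."""
    total_share = sum(share for _, share in mix)
    hits = sum(share * win_prob(gap, scale) for gap, share in mix)
    return hits / total_share

for gap, share in gap_mix:
    print(f"gap {gap:.2f}: {win_prob(gap):.0%} expected, {share:.0%} of matches")
print(f"overall: {expected_accuracy(gap_mix):.0%}")
```

The overall number is capped by the close matchups: the more of the schedule that sits at small gaps, the lower the ceiling, no matter how accurate the ratings are.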
 

TennisOTM

Professional
Yes I think expecting 85% W/L prediction accuracy is incredibly optimistic. If you have a prediction system that good, you can quit your job and make a living and then some from sports betting.

72% accuracy seems to be about as good as anyone can do for tennis prediction in the long term. That's the number that @schmke keeps getting, regardless of NTRP level (it was the same for 4.0 and 3.0, see his posts above in this thread). 72% is also about the best accuracy anyone has had predicting professional ATP singles matches, according to the research I've seen.
 

TennisOTM

Professional
I would think these two issues are what pulled wtn into the mud. But I would be interested in what others think.

I haven't been able to make much sense of why WTN's ratings do so badly. Every time I think I have a theory, it does not hold up to scrutiny. There seem to be many exceptions to every simple explanation like the ones you propose. It really seems like the whole thing is a mess and they'll need a lot of work to improve it.
 

Moon Shooter

Hall of Fame
Yes I think expecting 85% W/L prediction accuracy is incredibly optimistic. If you have a prediction system that good, you can quit your job and make a living and then some from sports betting.

72% accuracy seems to be about as good as anyone can do for tennis prediction in the long term. That's the number that @schmke keeps getting, regardless of NTRP level (it was the same for 4.0 and 3.0, see his posts above in this thread). 72% is also about the best accuracy anyone has had predicting professional ATP singles matches, according to the research I've seen.

OK, I think we need to be clear. schmke and NTRP are not predicting 72% of the USTA matches. They are correctly predicting 72% of the fraction of matches that they even attempt to predict.

I think some amount of build-up of information is required. And it seems WTN has no build-up at all: it is making predictions for people who have never played a rated match in their system. They just assign them a number based on who knows what. But USTA ignores a ton of matches and only even tries to make predictions on a select number of matches. If you selected like they do, you should be able to hit 80; maybe 85 is a stretch.
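
One way to separate those two things is to report coverage alongside accuracy, and to compare systems only on matches that all of them predicted. The sketch below shows the idea. The `matches` list is made-up data for illustration only, and the function names are hypothetical.

```python
from typing import Optional

# Made-up results for illustration only: True = correct pick,
# False = wrong pick, None = no prediction (a player was unrated).
matches: list[dict[str, Optional[bool]]] = [
    {"UTR": True,  "WTN": True},
    {"UTR": None,  "WTN": False},
    {"UTR": True,  "WTN": False},
    {"UTR": None,  "WTN": True},
    {"UTR": False, "WTN": True},
    {"UTR": True,  "WTN": True},
]

def coverage_and_accuracy(rows, system):
    """Share of matches the system predicted, and hit rate on those picks."""
    picks = [row[system] for row in rows if row[system] is not None]
    coverage = len(picks) / len(rows)
    accuracy = sum(picks) / len(picks) if picks else float("nan")
    return coverage, accuracy

def common_subset(rows, systems):
    """Keep only matches where every listed system made a prediction."""
    return [row for row in rows if all(row[s] is not None for s in systems)]

shared = common_subset(matches, ["UTR", "WTN"])
for name in ("UTR", "WTN"):
    print(name, "all:", coverage_and_accuracy(matches, name),
          "shared:", coverage_and_accuracy(shared, name))
```

If a system's accuracy rises once it is restricted to the shared matches, the extra predictions it attempts are the ones dragging it down.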
 