About the New Rating System

nobodyhome · #1 04-10-2011, 01:54 AM

The new rating system has undergone some modifications from season 1. The base formula is still the same, but there are modifications on top of it. As a reminder, the base formula is:

Code:

New Rating = Old Rating + [ 50 * ( S - E ) ]

Where S is 1 if you won and 0 if you lost, and the value of E is calculated for each team as follows.

Code:

E = 1 / [1 + 10^ ([(Avg rating of your opponents)-(Avg rating of you and your teammates)] / 400)]
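As a rough illustration, the two formulas above can be sketched in Python. The function names and the sample ratings are made up for this example; only the constants 50 and 400 come from the formulas. This covers the base formula only, not the modifications described below.

```python
def expected_score(own_avg, opp_avg):
    """E from the base formula: the estimated chance that your side wins."""
    return 1.0 / (1.0 + 10 ** ((opp_avg - own_avg) / 400.0))


def new_rating(old_rating, own_avg, opp_avg, won):
    """Base update only: Old Rating + 50 * (S - E)."""
    s = 1.0 if won else 0.0
    return old_rating + 50.0 * (s - expected_score(own_avg, opp_avg))


# Hypothetical example: two evenly matched teams, both averaging 1500.
print(new_rating(1500, 1500, 1500, won=True))   # 1525.0
print(new_rating(1500, 1500, 1500, won=False))  # 1475.0
```

With even teams E is 0.5, so a win or a loss moves you 25 points. The more favored your team is, the less a win gains and the more a loss costs.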

----------------------

However, the change to your rating is now multiplied by a variable that we will call "uncertainty". This variable is influenced by two factors:

Inactivity/newness: If you have played fewer than 16 games in the past two weeks, then the fewer games you have played, the higher the multiplier is. This is to allow new or rusty players to settle to their correct rating faster.

Streaking/trending: If in the past 16 games you have won more than you lost (or lost more than you won) then your multiplier will be higher. This is to settle people to correct ratings faster if they are somehow very incorrectly rated (either they are a new player to ladder and 1500 is too high or too low, or they somehow became drastically better or drastically worse, for example by starting to play a new plane).

----------------------

Secondly, your rating change is divided by a variable that is a composite of the uncertainties of everybody else in the game. The higher everybody else's uncertainty is, the greater the divisor is. In essence, what this does is to make a game "worth" less in terms of your possible loss or gain if there happens to be someone in the game that is incorrectly rated (in this case, whether you win or lose is more determined by which way these incorrectly rated people are rated, as opposed to your own ability).

----------------------

Finally, you are given a few extra bonus points for each game you win. This serves as an inflationary factor for ladder, making ratings trend upward over time, so that new players start near the bottom of the ladder rather than near the middle as previously, which more accurately reflects reality.

Karl · #2 04-10-2011, 07:54 AM

I'm not sure if you're aware but there's some guy who implemented TrueSkill in C# and PHP http://www.moserware.com/

CCN · #3 04-10-2011, 11:26 AM

Quote:

Originally Posted by Karl

I'm not sure if you're aware but there's some guy who implemented TrueSkill in C# and PHP http://www.moserware.com/

after months of pain over the new rating system, after implementation Karl says this....

I don't know whether to laugh or cry :/

nobodyhome · #4 04-10-2011, 11:33 AM

I actually saw that already (well, not the PHP implementation, but the article and the C# implementation) and felt that TrueSkill had a lot of properties we didn't need and was lacking in 1-2 properties we did need. I also didn't quite understand all of it so I was uncomfortable with implementing something I couldn't get a full grasp of.

Although I suppose if I had seen the PHP implementation I could've just plugged it in and it would've been 10x faster and had come out nearly as good anyways. Oh well.

Ribilla · #5 04-10-2011, 12:02 PM

Quote:

Originally Posted by nobodyhome

I actually saw that already (well, not the PHP implementation, but the article and the C# implementation) and felt that TrueSkill had a lot of properties we didn't need and was lacking in 1-2 properties we did need. I also didn't quite understand all of it so I was uncomfortable with implementing something I couldn't get a full grasp of.

Although I suppose if I had seen the PHP implementation I could've just plugged it in and it would've been 10x faster and had come out nearly as good anyways. Oh well.

Well these changes look good anyway.

Save trueskill for season 3!

Karl · #6 04-10-2011, 07:19 PM

Do you save the match data? If so I would be curious to see a comparison of your rating system vs TrueSkill.

I would be willing to run the comparison if you could get me the data.

edit: removed my derailing parts of post

elxir · #7 04-10-2011, 07:38 PM

Quote:

Originally Posted by Karl

Do you save the match data? If so I would be curious to see a comparison of your rating system vs TrueSkill.

I would be willing to run the comparison if you could get me the data.

edit: removed my derailing parts of post

Do you mean this data? (scroll down) http://altitudeladder.com/match.php?id=219&mode=tbd_5v5

Or like, the actual logs for his code...

Karl · #8 04-10-2011, 07:46 PM

Quote:

Originally Posted by elxir

Do you mean this data? (scroll down) http://altitudeladder.com/match.php?id=219&mode=tbd_5v5

Or like, the actual logs for his code...

Indeed, all you need is the players on each team and which team won. So yea we could run a comparison of NoboSkill vs TrueSkill

nobodyhome · #9 04-10-2011, 10:49 PM

I can give you access to the database and you can grab the match data off of there. The only thing that might not make this 100% accurate is that the balancing mechanism is based off of NoboSkill and not TrueSkill, so the games that are played if ladder was truly using TrueSkill would be different.

andy · #10 04-11-2011, 02:37 AM

Quote:

Originally Posted by nobodyhome

Finally, you are given a few extra bonus points for each game you win. This serves as an inflationary factor for ladder, making ratings trend upward over time, so that new players start near the bottom of the ladder rather than near the middle as previously, which more accurately reflects reality.

Does this mean that mass gaming will provide a huge benefit to your rating?

shrode · #11 04-11-2011, 04:56 AM

Quote:

Originally Posted by andy

Does this mean that mass gaming will provide a huge benefit to your rating?

Yeah I'm pretty turned off by the idea because of this reason. Doesn't the 'first-game multipliers' and stuff already aim to solve that same problem? And if that isn't sufficient, wouldn't a smarter way to solve the problem be to gradually decrease the starting rating for players? I do not want to see major rating inflation by those dominating in the 'games-played' category 6 months from now.

elxir · #12 04-11-2011, 05:02 AM

Quote:

Originally Posted by shrode

Yeah I'm pretty turned off by the idea because of this reason. Doesn't the 'first-game multipliers' and stuff already aim to solve that same problem? And if that isn't sufficient, wouldn't a smarter way to solve the problem be to gradually decrease the starting rating for players? I do not want to see major rating inflation by those dominating in the 'games-played' category 6 months from now.

i think this is counter-balanced by people who play fewer games but are more efficient, due to the multipliers involved in the lower rank/streaking combo

nobodyhome · #13 04-11-2011, 05:51 AM

The inflationary value is negligible. The scenario you are worried about will most likely not happen because your expected value per gain is still governed by the base equation. As soon as you get too many points from the inflationary value, you quickly become "overrated" and end up simply giving those points away.
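A small sketch of why the bonus tends to cancel itself out, using only the base formula. The bonus size (1 point per win) is an assumption, since the thread does not give the real value, and the uncertainty multipliers are ignored. The player here is assumed to win exactly half of their games when rated correctly.

```python
import math

K = 50.0      # from the base formula
BONUS = 1.0   # ASSUMED bonus per win; the real value is not given in the thread


def expected_change(overrated_by):
    """Average rating change per game for a player who truly wins 50% of
    their games but is listed `overrated_by` points above their correct rating."""
    # E as the base formula computes it: the player's side looks stronger than it is.
    e = 1.0 / (1.0 + 10 ** (-overrated_by / 400.0))
    gain = K * (1.0 - e) + BONUS   # points for a win, including the bonus
    loss = K * e                   # points lost on a defeat
    return 0.5 * gain - 0.5 * loss


# Rating surplus at which the bonus is exactly offset by the base formula:
# solve 0.5 * (K * (1 - E) + BONUS) = 0.5 * K * E for E, then invert E.
e_break_even = 0.5 + BONUS / (2.0 * K)
break_even = 400.0 * math.log10(e_break_even / (1.0 - e_break_even))

print(expected_change(0.0))         # small positive drift when correctly rated
print(round(break_even, 2))         # surplus at which the drift reaches zero
print(expected_change(break_even))  # approximately zero
```

Under these assumptions the drift stops once the player is only a handful of points above their correct rating. Past that point the base formula takes back more than the bonus gives.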

andy · #14 04-11-2011, 10:44 AM

Quote:

Originally Posted by nobodyhome

The inflationary value is negligible. The scenario you are worried about will most likely not happen because your expected value per gain is still governed by the base equation. As soon as you get too many points from the inflationary value, you quickly become "overrated" and end up simply giving those points away.

Makes sense. Thanks.

Tekn0 · #15 04-11-2011, 10:56 AM

How many matches under the new ladder scheme does one have to play to be ranked somewhat accurately??

Ribilla · #16 04-11-2011, 12:35 PM

Quote:

Originally Posted by Tekn0

How many matches under the new ladder scheme does one have to play to be ranked somewhat accurately??

Well that depends on how accurately everyone else is ranked, I lost 5/6 games on the trot last night because everyone is around 1500 and I kept getting crappy teams. Until it's balanced out you won't trend as fast as you should.

Tekn0 · #17 04-11-2011, 02:39 PM

Quote:

Originally Posted by Ribilla

Well that depends on how accurately everyone else is ranked, I lost 5/6 games on the trot last night because everyone is around 1500 and I kept getting crappy teams. Until it's balanced out you won't trend as fast as you should.

Sigh... same here. I'm not sure if I should play ladder when it's still so unbalanced.

But what I wanted to know was, in the previous season they said you need 100 games once things are balanced to reach accurate rating.

With this new algorithm one should reach their "true" rating much quicker (again once others are mostly balanced). So roughly how much would it be once many of them are rated properly?

ryebone · #18 04-11-2011, 07:42 PM

Quote:

Originally Posted by Tekn0

Sigh... same here. I'm not sure if I should play ladder when it's still so unbalanced.

But what I wanted to know was, in the previous season they said you need 100 games once things are balanced to reach accurate rating.

With this new algorithm one should reach their "true" rating much quicker (again once others are mostly balanced). So roughly how much would it be once many of them are rated properly?

To be fair, the unbalanced nature of teams thus far is still a crapshoot in terms of which side you end up on. While it is possible to lose 5/6 games completely due to imbalanced teams, one could just as easily win 5/6 games because of it. I can understand if you don't want to play due to the uncertainty of team balance, but just be aware that it can go both ways. Besides, it'll eventually cancel itself out once players are more accurately rated, so there's really no harm done.

The 100-game rule we made up for last season was completely arbitrary. I actually claimed 50 games was enough to get a rough idea of where you stand. The defining factor has always been winning ~50% of your games. If I had to guess, I'd still peg the number to be around 50 (doesn't apply right now, because teams are still far from being balanced). As the season goes on and the majority of the player base has been balanced, new players will reach their proper ratings even quicker than 50.

Fun fact: I discussed with nobo yesterday, and despite not giving me the actual formula for calculating rating (understandable, since it's secret like the Bush Baked Beans recipe), he confirmed that the theoretical maximum rating change is about 900 points in one game. Obviously you'd need the perfect conditions for that, such as a new player just starting out, on a massive losing streak, with everyone else perfectly rated and the team balance as skewed as possible. Nonetheless, I wouldn't be surprised if, towards the middle-end of the season, new players were seeing point changes of up to 100-150 per game.

Karl · #19 04-11-2011, 07:51 PM

Other rating systems such as TrueSkill require between 46 and 91 games to really figure out how good you are. I don't know what NoboSkill takes but it's safe to say everyone should go play 70 Ladder games so we can get this party started.

[Y] · #20 04-11-2011, 07:56 PM

Only if you and Lam come play =)

Ribilla · #21 04-11-2011, 09:53 PM

http://www.altitudeladder.com/match....4&mode=tbd_5v5

I don't understand why I have lower point gain than all the other players, take blln:

I had a slightly longer win streak, my rank was much lower and I have fewer games played.

Why is this?

nobodyhome · #22 04-11-2011, 09:59 PM

Streaking is an imprecise term to use actually--a better word to use is trending. Your immediate streak doesn't matter, it's how much you have won or lost in the past x amount of games that matter. In this case you averaged about even in your most recent games, whereas ball'n is trending in the upward direction.

XX1 · #23 04-11-2011, 10:03 PM

Quote:

Originally Posted by nobodyhome

Streaking is an imprecise term to use actually--a better word to use is trending. Your immediate streak doesn't matter, it's how much you have won or lost in the past x amount of games that matter. In this case you averaged about even in your most recent games, whereas ball'n is trending in the upward direction.

Hmm so like the better you do, the more points gained after each match based upon your previous games?

nobodyhome · #24 04-11-2011, 10:07 PM

Not quite. More like, the farther you are from a 50/50 record (in either direction), the more your next games will be worth (whether you win or lose). If you are on a win streak and then you lose, the game you lose will cost you more points, even though you were on a win streak and not a loss streak.
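The actual multiplier formula is not given in this thread, so the following is only a hypothetical sketch of the idea described above: a multiplier that depends on how far your recent record is from 50/50, in either direction, and that applies equally to wins and losses. The function name, the linear shape, and `max_boost` are all invented for illustration.

```python
def trend_multiplier(recent_results, max_boost=1.0):
    """HYPOTHETICAL multiplier: 1.0 at an even record, growing as the
    recent record moves away from 50/50 in either direction.

    recent_results is a list of booleans (True = win), e.g. the last 16 games.
    """
    if not recent_results:
        return 1.0
    win_rate = sum(recent_results) / len(recent_results)
    # 0.0 for an even record, 1.0 for all wins or all losses.
    distance_from_even = abs(win_rate - 0.5) * 2.0
    return 1.0 + max_boost * distance_from_even


even = [True] * 8 + [False] * 8       # 8 wins, 8 losses
trending_up = [True] * 12 + [False] * 4
trending_down = [True] * 4 + [False] * 12

print(trend_multiplier(even))           # 1.0
print(trend_multiplier(trending_up))    # 1.5
print(trend_multiplier(trending_down))  # 1.5
```

Note that the order of results inside the window does not matter here, which matches the point that the immediate streak is irrelevant and only the overall recent record counts.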

Pieface · #25 04-12-2011, 03:11 AM

If you're on a super win streak, why should you lose more points as well? Shouldn't ladder assume that a huge streak means you're underrated and should be trending upwards quickly (losing less points, gaining more)?

nobodyhome · #26 04-12-2011, 09:54 AM

The thing is that if you take a look at the base formula above, the way it works is that E is set so that it's actually the probability of your side winning. Then, it works out so that if the players are rated accurately, the expected gain (amount point gain * probability of winning - amount point loss * probability of losing) is set to 0. This is the behavior we want.

When we put in the multiplier, it is applied to both gains and losses, so that no matter what the multiplier is, the expected value still remains 0. I suspect that if we applied the multiplier only to gains while you are on a winning streak (and only to losses on a losing streak), getting on a win streak would actually end up pushing you past your correct rating, which would lead to you having to streak downward below your correct rating afterwards, etc., causing weird fluctuations (of course I am kinda theorycrafting here).
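The expected-value argument can be checked numerically with the base formula. This is a sketch under the assumption that the player is accurately rated, so that E equals the true chance of winning. The multiplier values and the "gains only" variant are hypothetical and used purely for comparison.

```python
K = 50.0  # from the base formula


def expected_gain(e, gain_mult, loss_mult):
    """Expected rating change per game for an accurately rated player.

    e is the win probability from the base formula, which for an
    accurately rated player is also the true win probability.
    """
    win_change = gain_mult * K * (1.0 - e)   # points gained on a win
    loss_change = loss_mult * K * e          # points lost on a defeat
    return e * win_change - (1.0 - e) * loss_change


# Same multiplier on gains and losses: the expected change stays at 0.
print(expected_gain(0.75, 2.0, 2.0))  # 0.0
# Multiplier on gains only (hypothetical variant): ratings drift upward.
print(expected_gain(0.75, 2.0, 1.0))  # 9.375
```

With a symmetric multiplier, both terms scale by the same factor and still cancel. With a one-sided multiplier they no longer cancel, which is the overshoot described above.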

JWhatever · #27 04-12-2011, 10:49 AM

What is the maximum and minimum amount of points one could get after a win/loss?

-J

Rainmaker · #28 04-12-2011, 05:59 PM

50 -> see first formula.

@ what nobody said:
Theoretically, you want the sum of all points lost and won to be zero.

Putting it in simple words, it's kind of a risk theory:
If you are willing to run a big risk for a big reward, then the losses must be big as well.

From what nobody is saying about the streak thing, it seems to work this way:

Whenever you are on a streak (either winning or losing) you have bigger fluctuations in points; if you are at roughly 50-50 (in 30 games you have 16 wins and 14 losses) then your points won and lost should be nearly the same (assume 24~26 per game, as that is the average when you are playing at a 50% ratio).
So 24*(16-14) = +48 on your rating.

If over 30 games you have 8 wins and 22 losses, then the difference will be much greater (maybe something like ~37 per game when you are at a 75% ratio).
37*(8-22) = -518 on your rating.
My guess is that this is based on the idea that when you are at a 50% ratio you are playing at your rank level. If you are having streaks, it means you are rated incorrectly (either underrated or overrated). So giving you higher fluctuation makes it easier to move you up or down the board until you reach a 50% ratio.

I think that it always takes into account your last 30 recent games.
Can you confirm or deny anything from this nobo?

tupapito · #29 04-15-2011, 04:33 AM

Add the ranking option!

that would be great

elxir · #30 04-15-2011, 05:17 AM

Quote:

Originally Posted by JWhatever

What is the maximum and minimum amount of points one could get after a win/loss?

-J

900

chars

nobodyhome · #31 04-15-2011, 05:29 AM

900 is the theoretical maximum. in practice that will never happen, something like 225 is closer to an achievable maximum.

elxir · #32 04-17-2011, 06:47 PM

syphun has 131 more points than me and we have the exact same W/L. RIGGED!

Joaquin · #33 05-20-2011, 02:14 PM

Hi,
I'm asking admins or anyone with some alti point ranking system knowledge to explain to me on the concrete case of this match:
http://www.altitudeladder.com/match....&mode=ball_6v6
how did I earn minus 102 points.
Tx!

It's almost impossible to scratch out to at least top 50 if you get minus 102, minus 80, minus 60 etc for a loss while you get around 30 points almost every time for a victory.

I'm not a computer programmer, but a point system with the focus on actual team contribution (goals, kills...) would be much fairer. This is just ridiculous.

York · #34 05-20-2011, 02:17 PM

Quote:

Originally Posted by Joaquin

Hi,
I'm asking admins or anyone with some alti point ranking system knowledge to explain to me on the concrete case of this match:
http://www.altitudeladder.com/match....&mode=ball_6v6
how did I earn minus 102 points.
Tx!

It's almost impossible to scratch out to at least top 50 if you get minus 102, minus 80, minus 60 etc for a loss while you get around 30 points almost every time for a victory.

I'm not a computer programmer, but a point system with the focus on actual team contribution (goals, kills...) would be much fairer. This is just ridiculous.

You had the better team by 200 points. You were the worst on your team by 50 points. You guys had a 68% win chance and you couldn't win. Obviously your fault

yankinlk · #35 05-20-2011, 04:39 PM

Quote:

Originally Posted by York

You had the better team by 200 points. You were the worst on your team by 50 points. You guys had a 68% win chance and you couldn't win. Obviously your fault

Wow. That is a very interesting explanation for it! So thats the - accelerate people to their correct rating factor - with emphasis on the lowest ranked losing player?

Seems harsh, but fair.

Tekn0 · #36 05-20-2011, 05:17 PM

Quote:

Originally Posted by York

You had the better team by 200 points. You were the worst on your team by 50 points. You guys had a 68% win chance and you couldn't win. Obviously your fault

What do you mean "worst on your team by 50 points" ?

Joaquin · #37 05-20-2011, 05:31 PM

Quote:

Originally Posted by York

You had the better team by 200 points. You were the worst on your team by 50 points. You guys had a 68% win chance and you couldn't win. Obviously your fault

The worst?? I had the least points if you mean that. So you're saying: because I've had the least points in the beginning of the match from my team, therefore I'm destined to be the person who will pay the most for the loss. Are you kidding right?!

Joaquin · #38 05-20-2011, 05:36 PM

Quote:

Originally Posted by yankinlk

Wow. That is a very interesting explanation for it! So thats the - accelerate people to their correct rating factor - with emphasis on the lowest ranked losing player?

Seems harsh, but fair.

- Seem harsh, but fair. - LOL, has mother taught you any different phrase than this one?!
What is fair about it, Mr. Smartypants?!

yankinlk · #39 05-20-2011, 05:57 PM

Quote:

Originally Posted by Joaquin

What is fair about it, Mr. Smartypants?!

Well let's see, i wasn't there but i can see there were 3 randas in that game - you could argue that you are better than both tmic and para - not really for me to say - but it's probably likely with the huge amount of plane switching you all were doing that there was a heated discussion on who would play what plane setup. You scored once as both planes and i'm sure that went to your head (it would mine), but you only scored twice - so obviously it's your fault - you see in order to win you have to score 6 goals, sorry thanks for playing.

Quote:

Originally Posted by Joaquin

- LOL, has mother taught you any different phrase than this one?!

Ohnoehedidn't. I will be pewpewin' u when i see ya.

mikesol · #40 05-20-2011, 06:38 PM

Ignoring any of the insults presented - I'd like to point out that how many goals you scored, how many kills you got, etc are all irrelevant in this rating system.

This system is fairly well explained here.

If I had to take a guess as to why you shot down in rating in those two games, it would be because you didn't play for a while and were on a general winning streak. When you have won more games than you've lost in the past 16 games, the system adds a multiplier that makes games worth more for you so that you can go up to your real rating faster. Unfortunately, this means that if you are on a winning streak, any games lost also plummet your score. You'll notice you did get a +62 after you lost those two games, indicating that the system is still unsure where to place you.

The other fact of the matter is that both of the games you lost your team should have won. If a bunch of really good players lose to a bunch of bad players they are penalized more.

I hope that makes some sense and if you need further explanation feel free to ask.

Edit: Now I realize there are many arguments for why this system is flawed. I agree that there definitely are many flaws with this system - however I was merely trying to explain why this system did what it did for you in those games.