Rankings: You’re Doing It Wrong!

João M. D. Moura

We have all seen plenty of rankings, not only in web apps but also in other media. Often the ranking is part of a gamification feature.

But is it really effective? In fact, most ranking implementations get it wrong, and can even make your application less attractive. Reaching the top spots is difficult or downright impossible, and you may never find a real challenge against other users. Those are just some of the problems with rankings.

Many big companies, such as Sony, Microsoft, and Nintendo, have experimented with algorithms to make rankings more dynamic and attractive. Lots of great implementations have surfaced from those companies, and we are able to use them in our development processes.


What is the Proposal?

The mathematical description of a ranking is quite simple, as you can see on Wikipedia:

“A ranking is a relationship between a set of items such that, for any two items, the first is either ‘ranked higher than’, ‘ranked lower than’ or ‘ranked equal to’ the second. In mathematics, this is known as a weak order or total preorder of objects.”

A ranking by itself is quite easy to implement: basically, you sort a number of items according to some criteria. But what is the purpose of a ranking?
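To show how small the sorting part really is, here is a minimal Python sketch. The `rank` function, the player names, and the scores are all made up for illustration; the only real decision is how to treat ties, and here equal scores simply share a position.

```python
def rank(scores):
    """Return {name: position}; items with equal scores share a position."""
    # Highest score first. sorted() is stable, so tied items keep input order.
    ordered = sorted(scores.items(), key=lambda item: item[1], reverse=True)

    positions = {}
    previous_score = None
    previous_position = 0
    for index, (name, score) in enumerate(ordered, start=1):
        # A tie reuses the previous position; otherwise use the 1-based index.
        position = previous_position if score == previous_score else index
        positions[name] = position
        previous_score, previous_position = score, position
    return positions


# Hypothetical scores, only for illustration.
scores = {"ana": 120, "bruno": 95, "carla": 120, "davi": 60}
print(rank(scores))
```

Ties sharing a position is what the "ranked equal to" case in the definition above asks for.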

A ranking is not just an opaque piece of information. It’s used in games and applications to engage users, pushing them to improve and reach higher positions, usually by competing against each other.

What often happens is that early adopters reach the top of the ranking really fast, and improve so much that it becomes too difficult or even impossible for new users to surpass them, or even to find someone at their own skill level to challenge.

So if you were trying to make your application more attractive with rankings, chances are that you have failed! No one wants to enter a competition where it is impossible to actually compete.

But how can we make rankings something more interesting, to transform this old social technique into a powerful tool to keep users engaged? That is our goal and the real purpose of a Ranking implementation.

Making Rankings Effective

Think about a ranking as a piece of data. Data is essentially good, and when you have it, you have power. It isn’t just about sorting items; it’s about building an application that fits every level of each user’s needs.

The idea is that, if you have a ranking, you have a resource.

For example, when ranking users, the parameters can change as the user progresses. Those changes can come from interactions between users, but the conditions for changing those parameters, and so moving up or down the ranking, must be compatible with the resource’s skills and current position.

It’s quite simple and when I thought about the essence of it I came up with the following short list that defines the purpose of a ranking:

  • Organize and order a resource.
  • Make it possible to change positions.
  • Find challenges/conditions that are compatible with the user’s current rank
    and skills.
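The third item on the list is the one most implementations skip, so here is a minimal Python sketch of it: given a rating per user, suggest the opponents whose rating is closest. The `closest_opponents` function, the names, and the numbers are hypothetical, and a real system would likely also look at how certain each rating is.

```python
def closest_opponents(player, ratings, count=3):
    """Return up to `count` other players, nearest rating first."""
    own = ratings[player]
    others = [name for name in ratings if name != player]
    # Sort by rating distance; the name breaks ties so the order is stable.
    others.sort(key=lambda name: (abs(ratings[name] - own), name))
    return others[:count]


# Hypothetical ratings, only for illustration.
ratings = {"ana": 31.0, "bruno": 12.5, "carla": 29.0, "davi": 3.0, "eva": 14.0}
print(closest_opponents("bruno", ratings, count=2))
```

A newcomer asking for a challenge this way gets matched with other newcomers instead of with the early adopters at the top.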

It certainly doesn’t sound too difficult, but when it comes to technique, it is. But don’t worry: big companies are working on this for us.

The best solution that I’ve seen so far is the one researched and developed by the Microsoft Research team, known as TrueSkill.

TrueSkill

Microsoft has published a lot of research related to the new business areas that it has ventured into in the past few years, such as the game console market. The Xbox is one of the most incredible devices on the market right now, and its updates and features are always bringing innovation. One of those features is the TrueSkill ranking system.

“TrueSkill is a Bayesian ranking algorithm developed by Microsoft Research and used in the Xbox matchmaking system built to address some perceived flaws in the Elo rating system. It is an extension of the Glicko rating system to multiplayer games”

TrueSkill builds on the Elo rating system, one of the best-known rating algorithms, which is used in a lot of different ranking tools and was one of the first applications of statistical estimation to rankings.

It matches the short list that I mentioned above. It not only sorts users, but also makes the ranking dynamic, providing challenges that fit the skills of every user.

How Does It Work?

The basic idea behind TrueSkill is that every resource has two variables: one that represents the perceived skill of the resource, which we will call S, and another that represents how much ‘confidence’ the system has in the S value, which we will call C. Strictly speaking, C measures uncertainty: the larger it is, the less sure the system is about S.

Default values for S and C have to be defined so that new resources have an initial position in the ranking. In Xbox Live, the default values are S = 25 and C = 25/3. The S value will increase or decrease after every win or loss, but what makes the difference to the resource’s position is the value of C and how “surprising” the outcome of the challenge is to the system. So, if the outcome is actually the expected result (the favorite player wins, for example), it will not produce a huge increase in the user’s position.
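The real TrueSkill update is more involved than I can show here, so the sketch below swaps in the classic Elo formula purely to illustrate the “surprise” idea: the less expected the result, the bigger the change. The `k=32` factor and the 1600/1400 ratings are assumptions picked for the example.

```python
def expected_score(rating_a, rating_b):
    """Elo's estimate of A's chance of beating B (between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a, rating_b, score_a, k=32):
    """Return A's new rating; score_a is 1 for a win, 0.5 draw, 0 loss."""
    # "Surprise" is the gap between what happened and what was expected.
    surprise = score_a - expected_score(rating_a, rating_b)
    return rating_a + k * surprise


# The favorite (1600) beating the underdog (1400) is expected: small gain.
favorite_gain = elo_update(1600, 1400, 1.0) - 1600
# The underdog beating the favorite is a surprise: large gain.
underdog_gain = elo_update(1400, 1600, 1.0) - 1400

print(round(favorite_gain, 1), round(underdog_gain, 1))  # roughly 7.7 and 24.3
```

This is Elo, not TrueSkill, but the behavior matches the description above: an expected win barely moves the favorite, while an upset moves the underdog a lot.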

A resource’s rating is estimated using the formula R = S – 3 * C. The TrueSkill system can be applied with any scale; Xbox Live uses a 0 to 50 scale.
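As a minimal Python sketch of that formula, the class below stores S and C with the Xbox Live defaults and derives R from them. The `Resource` class and its field names are my own naming for this article, not part of any TrueSkill library.

```python
from dataclasses import dataclass

# Xbox Live defaults quoted in this article (0 to 50 scale).
DEFAULT_S = 25.0
DEFAULT_C = 25.0 / 3.0


@dataclass
class Resource:
    """Anything being ranked: a user, a team, an item (hypothetical type)."""
    name: str
    s: float = DEFAULT_S  # perceived skill
    c: float = DEFAULT_C  # how unsure the system still is about s

    def rating(self):
        # R = S - 3 * C: the displayed rating is a cautious estimate,
        # pulled down while the system is still unsure.
        return self.s - 3 * self.c


newcomer = Resource("john")
print(round(newcomer.rating(), 6))
```

Because C is subtracted three times over, a brand-new resource starts at the bottom of the scale and climbs as the system learns more about it.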

This is just the beginning of the concepts behind TrueSkill. There are many more details once you consider all the possible results: a draw, for example, or a competition between several players at the same time.

Using TrueSkill produces some peculiar and quite interesting results. For example, an initial loss can actually move a user to a higher position in the ranking, because the drop in C may outweigh the drop in S. Let’s see an example:


Imagine a new user, “John”, on Xbox Live. We can assume S = 25 and C = 25/3, so John’s rating would be calculated as R = 25 – 3 * 25/3, resulting in a rating of 0.

In the first challenge, the system expects John to win, but he actually loses. His skill level decreases by 5, resulting in S = 20. It was not the result the system expected, so the system doesn’t have much confidence in this new skill. This results in a bigger drop in the C variable, for example to C = 15/3.

The thing is that John’s rating will be higher than the initial one. Losing his first challenge results in R = 20 – 3 * 15/3 = 20 – 15 = 5, because the system does not believe that the loss represents John’s true skill. However, if John keeps losing, the system’s confidence in the lower skill value will increase, resulting in a huge rating loss.
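To see why a loss can raise the rating, it helps to split the change in R into its two parts. Since R = S – 3 * C, the change in R is the change in S minus three times the change in C. The Python sketch below uses only John’s numbers from this example; the `rating_change` helper is a hypothetical name.

```python
def rating_change(delta_s, delta_c):
    """Change in R = S - 3 * C, given the changes in S and C."""
    return delta_s - 3 * delta_c


# John's first loss, with the numbers from the example above.
delta_s = 20.0 - 25.0              # S drops by 5
delta_c = 15.0 / 3.0 - 25.0 / 3.0  # C drops by 10/3

# -5 - 3 * (-10/3) = -5 + 10 = +5: the rating goes up despite the loss.
print(round(rating_change(delta_s, delta_c), 6))
```

The rating only rises while C is falling faster than a third of the drop in S. Once C has little room left to fall, every further loss in S shows up in the rating almost in full.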

What and How?

Rankings are not only about competition between users; you can measure anything that you want. Having a good ranking will help you find out what best fits each position.

You can rank your users and, depending on their spot, offer different features, interfaces, and ‘customizations’ that better fit their skills and how they interact with your application. The rule is simple: if you can rank something, you can make it more personal for everyone.
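As a minimal Python sketch of that idea, the function below maps a position in the ranking to a tier that the application could use to pick an interface or feature set. The tier names and the 10% and 50% cut-offs are assumptions chosen for illustration, not a recommendation.

```python
def tier_for(position, total):
    """Map a 1-based ranking position to a hypothetical experience tier."""
    if total <= 0 or not 1 <= position <= total:
        raise ValueError("position must be between 1 and total")

    fraction = position / total  # small values are near the top
    if fraction <= 0.10:
        return "expert"    # e.g. unlock advanced options
    if fraction <= 0.50:
        return "regular"   # e.g. the default interface
    return "newcomer"      # e.g. extra guidance and easier challenges


for position in (5, 30, 80):
    print(position, tier_for(position, total=100))
```

Using the fraction of the ranking instead of a fixed position keeps the tiers meaningful as the number of users grows.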

The full TrueSkill implementation is not available on the web; it’s patented by Microsoft and can only be used if a license is acquired. But related systems, such as the Glicko rating system, are free to use. Unfortunately, I’ve found just one implementation of the TrueSkill system in Ruby on the web, and it’s not complete yet. You can, however, find a lot of Glicko and Elo implementations in other languages.

Right now, I’m implementing a derivation of the TrueSkill system in Gioco. I’ve opened an issue on GitHub to add this feature in the next release, so it will be easier to have a correct and good ranking implementation in your application that will help you to engage your users. Stay tuned!
