Elo Rating System: How Skill Ratings Work

Rank accuracy ≈ 75% — ratings converge to skill

After 100 rounds with K=32 and 8 players, the Elo system correctly orders about 75% of players by true skill. More rounds and an appropriately chosen K-factor improve convergence.
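
A setup like the one described above can be sketched in a few lines of Python. This is only an illustration, not the simulator's actual code: the spread of hidden skills (1200 to 1800), the random pairing scheme, the meaning of a "round" (every player plays one game), and the pairwise accuracy metric are all assumptions, and the names simulate and pairwise_accuracy are hypothetical.

```python
import random
from itertools import combinations


def expected_score(r_a, r_b):
    """Expected score of A against B under the logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))


def simulate(n_players=8, rounds=100, k=32.0, seed=0):
    """Play `rounds` rounds of random pairings; return (true_skills, ratings)."""
    rng = random.Random(seed)
    # Assumption: hidden true skills are spread evenly between 1200 and 1800.
    skills = [1200 + 600 * i / (n_players - 1) for i in range(n_players)]
    ratings = [1500.0] * n_players
    for _ in range(rounds):
        order = list(range(n_players))
        rng.shuffle(order)
        # Pair neighbours in the shuffled order: (0, 1), (2, 3), ...
        for a, b in zip(order[::2], order[1::2]):
            # The outcome is drawn from the *true* skills (no draws).
            s_a = 1.0 if rng.random() < expected_score(skills[a], skills[b]) else 0.0
            # The update only sees the current ratings.
            delta = k * (s_a - expected_score(ratings[a], ratings[b]))
            ratings[a] += delta
            ratings[b] -= delta
    return skills, ratings


def pairwise_accuracy(skills, ratings):
    """Fraction of player pairs whose rating order matches their skill order."""
    pairs = list(combinations(range(len(skills)), 2))
    correct = sum(
        (skills[i] < skills[j]) == (ratings[i] < ratings[j]) for i, j in pairs
    )
    return correct / len(pairs)


if __name__ == "__main__":
    skills, ratings = simulate()
    print(pairwise_accuracy(skills, ratings))
```

The accuracy printed depends on the seed and on how far apart the hidden skills are, so it will not necessarily match the 75% figure quoted above.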

Formula

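E_A = 1 / (1 + 10^((R_B - R_A) / 400))
R_A_new = R_A + K × (S_A - E_A)
Here R_A and R_B are the current ratings, E_A is A's expected score, S_A is A's actual score (1 for a win, 0.5 for a draw, 0 for a loss), and K is the K-factor.
Rating difference of 400 ≈ 10:1 expected win ratio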

The Mathematics of Competitive Ranking

The Elo rating system, developed by Hungarian-American physics professor Arpad Elo and adopted by the US Chess Federation in 1960, is one of the most elegant and widely adopted mathematical models in competitive sports. Originally designed for chess, it solves a fundamental problem: how to estimate the relative skill of players, including those who have never played each other, using only the outcomes of pairwise matchups.

Expected Score and Rating Updates

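The core of Elo is the expected score formula, which uses a logistic curve to convert a rating difference into an expected score: the probability of a win plus half the probability of a draw. A 400-point rating gap corresponds to a 10:1 expected win ratio, or an expected score of about 0.91 for the stronger player. After each game, a rating shifts by K times the difference between the actual and expected score. This is a simple but powerful incremental update rule; it is not a Bayesian update in the strict sense, because plain Elo keeps no measure of how uncertain each rating is.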
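
The two formulas translate directly into code. The following is a minimal sketch; the function names expected_score and update_rating are chosen here for illustration, and the default K=32 simply matches the simulation's setting.

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Expected score of A against B: E_A = 1 / (1 + 10^((R_B - R_A) / 400))."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))


def update_rating(r_a: float, r_b: float, s_a: float, k: float = 32.0) -> float:
    """A's new rating after scoring s_a (1 win, 0.5 draw, 0 loss) against B."""
    return r_a + k * (s_a - expected_score(r_a, r_b))


if __name__ == "__main__":
    e = expected_score(1900, 1500)         # 400-point favourite
    print(round(e, 4))                     # 0.9091
    print(round(e / (1 - e), 2))           # 10.0, the 10:1 ratio
    print(update_rating(1500, 1500, 1.0))  # 1516.0: equal players, winner gains K/2
```

Only the winner's side is shown; the opponent's rating would be updated with the same function, using their own score and, where a system assigns one, their own K.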

K-Factor: Sensitivity vs. Stability

The K-factor is the single most important tuning parameter. A high K-factor means ratings change rapidly after each game, which is ideal for new players whose ratings are uncertain but problematic for established players, for whom large swings feel unfair. Many systems use a declining K-factor: high for newcomers, low for veterans. This simulation lets you see how K affects convergence speed and rating volatility.
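
A declining K-factor can be sketched as a small lookup in front of the update rule. The thresholds below (30 games, a rating of 2400) and the function names are illustrative assumptions, not the rule of any particular federation or of this simulation.

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Expected score of A against B under the logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))


def k_factor(games_played: int, rating: float) -> float:
    """Hypothetical declining K schedule; the thresholds are illustrative only."""
    if games_played < 30:
        return 40.0  # newcomer: the rating is uncertain, so move quickly
    if rating < 2400:
        return 20.0  # established player
    return 10.0      # top-rated veteran: keep the rating stable


def rating_change(rating: float, opponent: float, score: float, games_played: int) -> float:
    """Points gained (or lost, if negative) for one game under the schedule above."""
    k = k_factor(games_played, rating)
    return k * (score - expected_score(rating, opponent))


if __name__ == "__main__":
    # Same result (a win against an equally rated opponent), three different swings.
    print(rating_change(1500, 1500, 1.0, games_played=5))    # 20.0
    print(rating_change(1500, 1500, 1.0, games_played=100))  # 10.0
    print(rating_change(2500, 2500, 1.0, games_played=100))  # 5.0
```

Note that when the two players have different K values, the points one gains no longer equal the points the other loses.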

From Chess to Everything

The Elo system's elegance lies in its simplicity and self-correcting nature. Today it underpins competitive ranking in chess, football, tennis, esports, and even online matchmaking algorithms. Variants like Glicko and TrueSkill add rating uncertainty intervals, but the core insight remains Elo's: compare expected performance to actual results and adjust incrementally.

FAQ

How does the Elo rating system work?

Each player starts with a base rating (typically 1500). After each decisive game, the winner gains points and the loser loses points; when both players share the same K-factor, the two amounts are equal. The amount exchanged depends on the expected outcome: beating a much higher-rated player earns more points than beating a lower-rated one, and a draw moves points from the higher-rated player to the lower-rated one. The K-factor controls how many points are at stake per game.
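
As a worked example, compare an upset with a routine win between a 1500-rated and a 1700-rated player. This is a sketch that assumes K=32 for both players; points_exchanged is a hypothetical helper name.

```python
K = 32.0


def expected_score(r_a: float, r_b: float) -> float:
    """Expected score of A against B under the logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))


def points_exchanged(winner: float, loser: float, k: float = K) -> float:
    """Points the winner gains (and the loser loses) in a decisive game."""
    return k * (1.0 - expected_score(winner, loser))


upset = points_exchanged(1500, 1700)    # underdog wins
routine = points_exchanged(1700, 1500)  # favourite wins

if __name__ == "__main__":
    print(round(upset, 1))    # 24.3
    print(round(routine, 1))  # 7.7
```

The two amounts add up to K, so the full 32 points are on the table in every game; the ratings only decide how they are split between the two possible results.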

What is the K-factor in Elo ratings?

The K-factor determines the maximum number of rating points that can change hands in a single game. A higher K-factor (e.g., 32-40) makes ratings more responsive but volatile, while a lower K-factor (e.g., 10-16) produces stable but slowly adapting ratings. FIDE chess uses K=40 for new players, K=20 for most established players, and K=10 for players who have reached a rating of 2400.

Why do Elo ratings converge to true skill?

If a player is underrated relative to their skill, they will win more often than expected and gain rating points. If overrated, they will lose more often and shed points. This self-correcting feedback loop drives each rating toward the value at which expected and actual scores match. With a fixed K-factor a rating never settles exactly; it keeps fluctuating around that value, and the fluctuations are smaller when K is smaller.
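
The feedback loop can be made visible by replacing each random result with its average. The sketch below follows the mean rating path of one player against a fixed 1500-rated opponent whose rating is assumed to be accurate; mean_trajectory and its parameters are hypothetical names, and real ratings would scatter around this path rather than follow it exactly.

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Expected score of A against B under the logistic Elo curve."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))


def mean_trajectory(true_skill, start, opponent=1500.0, k=32.0, games=200):
    """Average-case rating path: each game's result is replaced by its expectation."""
    rating = start
    path = [rating]
    actual = expected_score(true_skill, opponent)  # real average score per game
    for _ in range(games):
        # Mean change per game: K * (actual average score - predicted score).
        rating += k * (actual - expected_score(rating, opponent))
        path.append(rating)
    return path


if __name__ == "__main__":
    underrated = mean_trajectory(true_skill=1700, start=1500)
    overrated = mean_trajectory(true_skill=1700, start=1900)
    print(round(underrated[-1]), round(overrated[-1]))
```

The underrated player climbs and the overrated player falls, and both paths approach 1700, where the predicted score equals the actual average and the mean change drops to zero.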

Is the Elo system used outside of chess?

Yes, Elo-based systems are used in many competitive domains: FIFA world rankings, tennis, Go, esports (League of Legends, Overwatch), online matchmaking, and reportedly even dating apps like Tinder. The basic principle of pairwise comparison with incremental rating updates applies wherever results come from head-to-head contests.
