The Dragonmaster’s Lair – Not A Set Review: The DCI Rating System

Brian Kibler takes a break from analyzing the ins and outs of the Magic game itself and focuses on something relevant to many Magic players: the DCI Ratings system. Is it inherently flawed?

Like thousands of other gamers around the globe, I spent this past weekend playing in New Phyrexia Prerelease events. I have my thoughts on the new set
to share, but I didn’t want to add to the bombardment of set reviews being thrown your way this week, so I thought I’d touch on another
recent issue. While many other players and I enjoyed the casually competitive atmosphere of the Prerelease events, there are those out there who
could not join in on the fun—those possessed of the dubious honor of a high rating.

Ratings in Magic are funny things. To some players, they’re the equivalent of an achievement in a video game. They’re happy to reach every
little plateau. “Achievement unlocked—1800 Rating!” “Achievement Unlocked—Top 10 in the State!” They’re a
barometer of success, an indication of progress upwards, even if that progress is essentially an end unto itself. For these players, I’d say the
rating system functions pretty well. As they improve and perform better in their local events, they can see their rating steadily going up. For others,
however, the rating system is deeply flawed.

Those others are most notably players at or near unlocking an achievement of their own—the coveted rating-based invitation to their National
Championships or the Pro Tour. While it’s hard to feel bad for high-rated players, it is easy to see the negative effects of the way the rating
system works on those players and the tournament scene as a whole. Once they have reached a rating they feel will secure them an invitation, many
players stop playing entirely to preserve their rating. In fact, it happens so often that the Magic community has created a term for
it—“sitting on your rating.”

Why is this a bad thing? It stops people from playing Magic! If a player were sitting exactly at what they believed to be the ratings cutoff for an
invitation, or for a Grand Prix bye, or whatever—would they go play in their local Prerelease? Would they play in a PTQ or GPT for the very event
they’re sitting on their rating for? From what I’ve seen, the answer is almost certainly not.

Even when those players do play, they’re essentially walking on eggshells. Just this past weekend, I played against a local player who was near
the bubble of qualifying for Mexican Nationals. When I won our match, he asked me if I knew what my rating was because he was worried about
whether he could afford to take another loss and risk losing his qualification.

As I said before—it’s hard to feel bad for the players receiving ratings invitations, since they’re reaping the benefits of the
system, but when those players are sitting out of tournaments, it hurts the local scene. Any player sitting on the sidelines who would otherwise be
playing provides one fewer entry fee that goes to support keeping the lights on at the local store. Granted, the number of players in any area who
are at the ratings threshold for qualification might be low, but those are typically some of the best players around.

People want their chance to take a whack at the local big shot. I know that when I show up to FNM or a Prerelease, a lot of players are excited to play
against me and are especially excited when they take me down. If I were relying on my rating to qualify, I probably wouldn’t be there, and if I
were, I’d probably be taking the games a whole lot more seriously, which could sap the fun out of the whole situation.

Magic tournaments are a community affair, which is why Wizards has put such emphasis on building the Wizards Play Network around local stores. If the
local ringer stops playing because he’s sitting on his rating, the competition at the store could very easily dry up, and players could get bored
and move on to other things. One player sitting on his rating might mean that an entire group no longer has a ride to the PTQ a few hours away.
Obviously, these are worst-case type scenarios, but it’s important to realize that local scenes live and die by having a certain critical mass of
players, and it’s incredibly easy for a scene to fade away by losing just a few of its alpha gamers. This is a big part of the reason that I make
a point of going to my local FNM and Prereleases when I can, along with doing what I can to support organizers throwing independent tournaments in the
region.

Solutions to these problems aren’t easy. This issue became a hot topic on Twitter over the past week because WotC announced that they were
extending the ratings deadline for Nationals qualifications by two weeks because of issues with their new tournament reporting software. This caused a
particular uproar because the two-week period overlaps with Grand Prix Providence, a major event many players (including those who have been sitting on
their ratings) were planning on attending. Many of those who complained said they already had made travel plans for both events, and the change in the
date for the ratings cutoff forced them into the extremely awkward situation of having a plane ticket or hotel reservation for an event they were no
longer sure they wanted to attend or were uncertain they’d be qualified for.

Purists might argue that they should play in the Grand Prix and that if they’re good enough to deserve their rating invite, they should do well
enough to preserve their rating. If they can’t maintain their rating at that level, they might say that their rating is inflated and should go
down.

The problem with this argument (which I have actually heard, by the way) is that the rating system Magic uses is ELO, which was ported directly from
chess. As you know, chess has a little bit less variance than Magic does. In chess, every game begins with the same pieces in the same spots on the
same board, but in Magic, there are any number of variables that could influence your results in a given match, or across a given tournament. Maybe you
got mana-screwed. Maybe you made a terrible metagame choice. Maybe you got a nearly unwinnable matchup.

The way the ELO system works doesn’t account for any of that. According to ELO, a player should have a certain expected win rate against another player
based on the difference in those players’ ratings, and that expected win rate goes up as the difference in rating goes higher. While that has some
application to Magic, it has a few flaws.

Consider two players—one with a rating of 1922 and another with a rating of 2222. According to the ELO system, the player with the 2222 rating
should win 85% of the time when the two meet. Now, that’s a pretty significant gap for any two players at a high level of competitive
play. Even Kai or Finkel in their prime didn’t win 85% of their matches at the Pro Tour level, and those kinds of ratings could easily belong to
two players on the Pro Tour. In fact, they both belonged to the same player on the Pro Tour within the span of a year—me.
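
For anyone who wants to check that 85% figure, here is a minimal sketch of the standard Elo expected-score formula. It assumes the DCI uses the usual 400-point logistic scale from chess; the exact constants in the DCI's version aren't given here, so treat them as an assumption. The function name `expected_score` is made up for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score for player A against player B.

    Assumes the textbook 400-point logistic scale; the DCI's exact
    constants are not stated in this article.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# The two ratings from the example above, 300 points apart.
favorite = expected_score(2222, 1922)
print(f"{favorite:.1%}")  # prints 84.9%
```

Under those assumptions, a 300-point gap works out to just under 85% for the higher-rated player, no matter what game is actually being played.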

That’s right. According to ELO calculations, me just after GP Houston last year was an 85% dog to me at the end of Day Two of Worlds last year.
Which is funny, because the 1922 me went on to make three consecutive Top 16 GP finishes, including a win, while 2222 me managed to punt multiple
matches in a row to fail to make Top 8 despite a 9-0 start.

ELO is simply a flawed system to apply to Magic. It’s made to describe a very different game with very different factors involved in determining
the outcome. The top ratings rarely, if ever, reflect the actual best players in the game—rather, they’re typically a list of the players
who did well in their most recent events. The current top three players in total rating are people I’ve never heard of in my life, and while
it’s possible that Jared Helwig from Hanover, PA, is the best player in the world, I have my doubts.

How do we fix this? Well, I find it unlikely that we’re going to see the ELO system abandoned entirely. The DCI has used the same formula, give
or take, for the past fifteen-plus years, and I doubt we’ll see that change, especially since I can’t imagine anyone really wants to purge the
system of old results. I make no claims to having anything close to the mathematical prowess to devise a system that accurately represents a given
player’s Magic skill. Within the context of the flawed system, though, I think there’s certainly room for improvement.

The change that seems to be the easiest to me, in terms of fixing the penalized-for-playing problem, is to base ratings perks, like invitations or
byes, on reaching certain ratings thresholds during a period rather than having a particular rating on a given date. For example, rather than inviting
the Top 100 players on June 5 or whatever, instead extend an invitation to any player who at any time had a rating over 2050 during the period between
January 1 and June 5.
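
To make the difference concrete, here is a rough sketch of the two rules side by side. Everything in it is hypothetical: the function names, the year, and the sample rating history are invented for illustration, and the snapshot rule is simplified to "rating on the cutoff date" rather than an actual Top 100 ranking.

```python
from datetime import date


def qualifies_by_peak(history, start, end, threshold=2050):
    """True if the rating was over `threshold` at any point in [start, end]."""
    return any(start <= day <= end and rating > threshold
               for day, rating in history)


def rating_on(history, cutoff):
    """Most recent rating recorded on or before `cutoff`, or None."""
    earlier = [(day, rating) for day, rating in history if day <= cutoff]
    return max(earlier)[1] if earlier else None


# Invented history: the player peaks in March, then drops after a bad event.
history = [
    (date(2011, 1, 15), 1980),
    (date(2011, 3, 12), 2061),
    (date(2011, 5, 7), 2031),
]
start, end = date(2011, 1, 1), date(2011, 6, 5)

print(qualifies_by_peak(history, start, end))  # threshold rule
print(rating_on(history, end))                 # what a snapshot would see
```

With this made-up history, the player is in under the threshold rule because of the March peak, while a snapshot on the cutoff date would only see the lower rating from May.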

This change would have a number of benefits, the most obvious of which is that players aren’t stuck sitting on their ratings. If a player passes
the qualification threshold, he doesn’t risk losing his slot because he decides to play a joke deck at his local FNM. He doesn’t have to
sit out of the local Prerelease and can play in a PTQ to try to secure a free plane ticket so he can actually afford the trip without eating nothing
but ramen for a month.

Another benefit is clarity. Rather than having to wonder if his 2043 rating will hold up or slip down to 101st, a player knows for certain that, as
soon as he reaches the threshold, he’s in. Currently, the rating system is incredibly intimidating and opaque for many players, and simplifying
it in a way that players can more easily grasp seems like a net positive. I envision an update to the interface of the ratings page that tracks and
displays a player’s peak rating for the qualification period, as well as that player’s current rating, so he has a better sense of how he
stands.

Like any change, this wouldn’t come without challenges. The first is logistical. The current invitation system allows for organizers to know how
many players will be potentially qualified for an event. With a threshold system, it’s difficult to gauge how many players will end up reaching
that threshold, so it’s much harder to plan effectively for venues or ensure that the event is within the desired scope for attendance numbers.

When I brought this suggestion up on Twitter, the most common concern I heard was of the potential for ratings fraud. Under this system, a group of
players could easily set up tournaments in which they concede to one another in order to inflate each of their ratings high enough to receive a
qualification. While I think this is certainly a concern, ratings fraud is already an issue, and I don’t see the potential for the abuse of this
system to be significantly higher than what currently exists. If anything, it seems like it would be much easier to catch that kind of fraud circle
than a single player whose rating is inflated by false tournaments, since it creates a discernible pattern rather than just looking like a dominant
player who wins all the time.

I certainly don’t think I have all the answers here, but I do think that I’m asking the right questions. How can we fix the rating system
so it doesn’t discourage players from actually playing the game? How can we make the way the system works less opaque to the average player? And
how can we make a rating system that more accurately reflects a player’s skill rather than simply their recent results?

What do you think? I’m definitely looking forward to feedback in the forums.

Until next time,
bmk