This is not another ‘introduction to Legacy’ article. If you aren’t familiar with the format, check out this article from January describing the 50 decks of Legacy at the time, with matchup data drawn from the SCG Open results at the time. In addition, I attempted to sketch out the essential structure of the format at the very end of that article with some fancy graphs.
In this article, I will do four things. First, I will update and supplement that article with the new decks that have entered the metagame since that article. At the same time, I will suggest how some of the decks have moved within the tier rubric I set out in that article.
Second, and most importantly, I will examine both the popularity and performance of each major archetype over time, and graph them so you can visualize these trends. Both measures are important, particularly as you prepare for the Grand Prix. In the process, I will offer a metagame prediction for Grand Prix Columbus, complete with archetype percentages.
Finally, I will marshal the data analyzed in Part III to examine and critique recent arguments made in support of the banning of Mystical Tutor, including Tom LaPille’s DCI statement. Also, I will provide some distilled talking points for communicating with your friends on this issue clearly and effectively.
II. The 50 Decks of Legacy Supplement
The article I wrote in January remains surprisingly relevant. Almost everything that is in the metagame today is in that article, including Reanimator (deck #27). Even obscure dark horses like Imperial Aluren, which recently showed up in the St. Louis Top 16, were on that list (deck #30). The only major difference in that regard is that many of the archetypes have shifted tiers. Reanimator jumped from a “Marginal” play (Section III) to a tier one deck within a month after that article.
As comprehensive as that article was, a few new decks have nonetheless emerged into the Legacy metagame. These decks are not flashes in the pan, but have put up consistent results over time. They have become established archetypes, and deserve to be added to the Legacy list.
51) New Horizons
How it Works:
This is a blue, green, white tempo/control deck, like many other decks in Legacy. Like all of the other U/G/x decks, such as Natural Order CounterTop (UGW), Team America (UGB), and Canadian Threshold (UGR), New Horizons features Tarmogoyf, Force of Will, Daze, Brainstorm and Ponder. Those five cards are found throughout the entire format. And like all of the UGW decks in Legacy, it features Swords to Plowshares as spot removal. And like both Team America and Canadian Threshold, it uses Stifle and Wasteland.
What makes this deck different, then? It’s very, very similar to both Team America and Canadian Threshold, except that instead of black or red, it runs white as a third color. The major difference, though, is the presence of Knight of the Reliquary. This card is the lynchpin. Knight of the Reliquary combos with Horizon Canopy to form a card advantage engine that also happens to pump Knight up to absurd levels. Also, it helps generate tempo with Wasteland. Finally, it is a tutor for silver bullet answers like The Tabernacle of Pendrell Vale (for tribal Aether Vial decks), Karakas (for Reanimator), and Bojuka Bog (for graveyard decks).
Recent Performance Stats:
This deck first emerged at the SCG Open in Atlanta, which was held May 2nd, getting 3rd place. Afterward, this deck exploded in popularity. The next SCG event was June 6 in Philly, and it was one of the most popular decks in the field, as it also was in Seattle two weeks later. While I don’t have SCG St. Louis results as of yet, we have matchup results from the aforementioned SCG tournaments in which it was well represented. Aggregating both SCG Philly and Seattle, we get the following performance results:
8-2 vs. Reanimator
6-3 vs. Ad Nauseam Tendrils (ANT)
17-10-2 vs. Zoo
5-4 vs. Goblins
9-12 vs. Merfolk
4-6 vs. CounterTop
It’s obvious why this deck has been doing so well: it has hugely positive matchups against the three top decks in the format: Reanimator, ANT, and Zoo.
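To make those records easier to compare, here is a minimal sketch that converts each one into a match win percentage. It assumes the third number in a record such as 17-10-2 is draws and simply ignores them; the function name and the rounding are my own choices, not anything taken from the SCG data.

```python
def win_pct(record: str) -> float:
    """Match win percentage for a 'W-L' or 'W-L-D' record.

    Assumption: a third number, if present, counts draws and is ignored.
    """
    parts = [int(p) for p in record.split("-")]
    wins, losses = parts[0], parts[1]
    return 100.0 * wins / (wins + losses)


# New Horizons records, SCG Philly and Seattle combined (from the list above)
records = {
    "Reanimator": "8-2",
    "ANT": "6-3",
    "Zoo": "17-10-2",
    "Goblins": "5-4",
    "Merfolk": "9-12",
    "CounterTop": "4-6",
}

matchups = {deck: round(win_pct(rec), 1) for deck, rec in records.items()}

# Print best matchup first
for deck, pct in sorted(matchups.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{deck:<12}{pct:5.1f}%")
```

Keep in mind that these are small samples: a 8-2 record is only ten matches, so the percentages are suggestive rather than conclusive.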
When the DCI was deliberating what to do with Legacy in early June, it’s possible that the most recent large result they were looking at was SCG Atlanta. In fact, if we look at the timeline, this seems plausible:
SCG Atlanta was May 1-2nd.
SCG Philly was June 5-6th
Banning announced June 18th (decision was made much earlier)
There wasn’t a major U.S. Legacy tournament for over a month after Atlanta. Unfortunately, SCG Atlanta appeared to be dominated by Mystical Tutor decks. Yet, the metagame shift was swift. SCG Philly was dominated by New Horizons and Zoo. The DCI jumped the gun. According to these results, the metagame answer had already been found. New Horizons was a big part of it.
Lurking at the bottom of that list you can already see what the metagame response was, and it explains the SCG St. Louis results. Although Jared Sylva hasn’t yet published the SCG St. Louis results, I can already tell you what I think it will show: New Horizons showed up in big numbers, but got stomped by Merfolk and was a dog to Goblins. It’s easy to see why this deck has problems with Merfolk and Goblins. First of all, this deck’s mana base is horrible. It has one basic land, maybe two. That makes it perfect prey for Goblins and Merfolk. Secondly, it’s a blue Goyf deck, which is already weak to Merfolk. Merfolk decks prey upon blue green decks.
This also would explain why Merfolk and Goblins both put multiple players into the Top 8 in St. Louis. This is your Legacy metagame at work! Functioning as it should be. It’s too bad that the DCI stepped in and interfered. More on that in Part IV of this article.
To be honest, while I previously thought that New Horizons would be a great choice for the GP, and many of my teammates thought so as well, with the banning of Mystical Tutor, its best matchups have been taken out of the field or weakened. It’s sad because this was a great new deck; but I’m afraid that its reign will be short-lived given the banning of Mystical Tutor. It remains to be seen what sort of adjustments this deck will make if it can remain a part of the field.
This deck was the hottest deck in Legacy up until the banning of Mystical Tutor. Now? Well, time will tell whether this can stick around or whether it will go the way of Canadian Threshold and disappear.
52) CounterTop Thopter
How it Works:
The strategy of this deck is a familiar combo to Extended players. It’s built around yet another Legacy two-card combo: Sword of the Meek and Thopter Foundry. For four mana and both artifacts, you can pay a mana to generate a 1/1 token and gain a life. This combo first appeared in an SCG Open Top 8 with Peter Smutko’s 8th place list in Indy in March. A few weeks later in Florida, Counterbalance was added to the deck, and it looks pretty much like it does today. CounterTop is included, as is Enlightened Tutor and a toolbox suite of options, including Crucible of Worlds, Engineered Explosives, Pithing Needle, and Tormod’s Crypt on the artifact side, and Back to Basics, Moat, and Oblivion Ring on the enchantment side. This deck is also always packing at least two Jace.
Recent Performance Stats:
Take a look at Jared’s article from yesterday and see what he has to say about this particular sub-archetype.
Strategically, this deck seems weak to Merfolk, despite the lucky win in the finals of St. Louis (Alex should have won game one: if he had waited one turn to attack, he would have been able to play Lord of Atlantis and give his men Islandwalk). Expect this deck to be a close Merfolk matchup. I also imagine that Goblins gives this deck fits. Tactically, this deck seems weak to Krosan Grip, Pithing Needle, and Null Rod.
Since this deck won the final SCG Open before GP Columbus, you can expect this to be huge at GP Columbus. It could very well be in the top 5 most popular decks, if not the top 3, and it will probably edge out Natural Order decks as the most popular Counterbalance variant. There is one major constraint on building this deck, and that’s the availability of Moat, which means that people will likely substitute another card for Moat or just not run it if they can’t find one.
At the same time, I also expect this to be a mediocre performer at the GP. Having just won the last SCG Open, the metagame response will be swift.
53) Show and Tell
- 4 Brainstorm
- 3 Show and Tell
- 1 Duress
- 4 Force of Will
- 3 Daze
- 4 Stifle
- 1 Form of the Dragon
- 4 Lim-Dul's Vault
- 1 Wipe Away
- 3 Ponder
- 4 Thoughtseize
How it Works:
Once a support player to Reanimator decks, Show and Tell has come into its own right with the Rise of the Eldrazi, and Emrakul in particular. While no specific variant has emerged as a favorite, Show and Tell appears to be the most efficient way to cheat Emrakul into play, and it is working. Rich Shay recently won a GPT with the deck, and, as you can see, it’s making Top 8s at the SCG Opens. This particular list features both Show and Tell + Emrakul and Phyrexian Dreadnought + Stifle. Conley Woods’s unique brew used Mosswort Bridge (I mentioned this option as a way to cheat in Eldrazi in my Legacy set review).
Show and Tell is getting some buzz, especially with Conley Woods talking/writing about it, but it may still be the sleeper deck leading up to the GP. I wouldn’t at all be surprised to see this deck finally take off at the GP. Watch out for it. Karakas will remain one of the best answers to Emrakul. Show and Tell looks like it may become a serious part of the Legacy metagame. Be prepared! I warned you!
In the 50 decks of Legacy article, I organized all 50 decks into 4 categories:
1) Top Contenders/Decks to Beat. These were the decks that will show up in any Legacy tournament, and are very likely to Top 8.
2) Decks you Might Face. These decks are the decks that aren’t quite as popular or prevalent as the decks in the first category, but there is a good chance you’ll see them in a Legacy Tournament. Goblins is a good example. And even if they are as popular as the decks in the top tier, they tend not to be as successful. Dredge is a great example of this.
3) Marginal Players. These are the decks that tend to show up in most large Legacy tournaments, but in very small numbers, and rarely do great. Decks like Landstill, Enchantress, Stax were good examples of decks in the top end of this category. Decks like Dragon Stompy, Affinity, Elves and Imperial Painter decks are paradigmatic examples of decks in the lower end of this category.
4) Dark Horses and Other Rogue Options. These are the decks that sometimes appear in Legacy tournaments. These are the truly rogue options. They exist, and sometimes see play, but infrequently, at best. They may also be older decks that have disappeared, but aren’t completely unviable anymore. Scepter Chant, Life Combo, and High Tide are great examples of decks in this category. This is the catchall category.
In the next section of this article I will give you my exact metagame prediction for Columbus and performance data for each archetype. However, I want to talk about some of the major shifts that have occurred between these categories.
What’s Moved Down?
Canadian Threshold and Aggro Loam have taken a nose dive, and dropped from the top tier to the third tier. Burn, once quite popular on the SCG circuit, has finally died down some. It’s no longer in the top 8 or so most popular archetypes, and has fallen from the second tier to the third. Ad Nauseam, of course, disappears into the third or fourth tier with the banning of Mystical Tutor. We will see Tendrils decks; they will just look somewhat different. I also think that Affinity has fallen from the third tier into the fourth. You no longer regularly see Affinity at the SCG events.
What’s Moved Up?
Reanimator went from being a Rogue option to a top tier deck. Despite the banning of Mystical Tutor, I believe it will stay in the top tier. Goblins has made a strong case that it’s now in the top tier, with its consistent and strong performance. It’s at least a rung above Lands.
III. The Grand Prix Metagame
In this part of the article, I will review the performance and prevalence of every major archetype in the format since GP Madrid. I will then give you my prediction for the GP.
1) Merfolk — 10.50% of the Field in Columbus
In the SCG series since Grand Prix: Madrid, Merfolk averages the greatest proportion of the field, edging out Zoo by over half a percent. One thing is clear: there will be hundreds of Lords of Atlantis at the GP. Take a look at this graph:
The blue line charts the percentage of the field that was Merfolk at each of the SCG events since late February. As you can see, Merfolk is consistently a huge part of the Legacy field. What’s amazing is that Merfolk has remained incredibly constant in the last five months at about 10.49% of the field to 11.02% of the field, with the exception of the last tournament, in which it was a very strong performer. I’m amazed at that level of consistency over such a wide geographic expanse. The variance was never greater than slightly more than half a percent!
This chart shows you how Merfolk has performed in that time period. The Y axis represents match win percentage against the field. In Richmond, Merfolk was the top performing deck, with an over 60% win percentage against the field. That win percentage steadily declined to around 50%, but has been trending back up ever since. It remains to be seen whether that will fall or rise with the banning of Mystical Tutor.
Tentatively, I think the answer is complicated. Reanimator was a good matchup, but ANT was a bad matchup, so on that front, the change is mixed. It’s bad on the whole, since Reanimator was a much larger part of the field than ANT, but it’s good in that ANT is gone and Reanimator will probably persist. On the other hand, Zoo and Goblins are weaker matchups, and New Horizons is a strong matchup. I think the answer depends on whether Reanimator and New Horizons will persist in the new metagame, and the quantity of CounterTop decks, a strong matchup, that appear in Columbus.
I predict that Merfolk will be about 10.50% of the field at the Grand Prix. It’s interesting to note that Excel provides a regression (least squares) analysis with a trendline, so I can extrapolate any trends, such as follows:
As you can see, the regression line is almost identical to the 10% line. Merfolk will be one of the top 3 most popular decks for sure. The question is whether it will be the most popular deck or not. One thing’s for sure: Merfolk will be an average to above average performer, and there will be tons of it.
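If you want to reproduce this kind of trendline outside of a spreadsheet, the sketch below fits an ordinary least-squares line to a series of field percentages and extends it one event forward. The `shares` values are placeholders chosen only to sit in the 10.5% to 11% range mentioned above; they are not the actual SCG numbers, and `least_squares` is just an illustrative helper name.

```python
def least_squares(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Event index (1 = earliest event) and share of the field in percent.
# PLACEHOLDER values for illustration only, not the real tournament data.
events = [1, 2, 3, 4, 5, 6]
shares = [10.5, 11.0, 10.5, 11.0, 10.5, 11.0]

slope, intercept = least_squares(events, shares)

# Extend the fitted line to the next event, i.e. the Grand Prix
forecast = slope * (events[-1] + 1) + intercept
print(f"trend: {slope:+.3f} points per event, forecast {forecast:.2f}% of the field")
```

With only a handful of data points the fitted slope is very sensitive to any single tournament, which is one reason to treat the extrapolation as a sanity check rather than a prediction.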
2) Zoo — 10% of the Field
Zoo was the most popular deck at GP Madrid, with 225 copies in the field of 2200. It’s also been either the first, second, or third most popular deck in every single SCG Open since, or tied for that status. Take a look:
What’s really interesting is that Zoo has this oscillating pattern where it dips to around 9% of the field and then rises back to 11% of the field in the next tournament. I expect that Zoo will be about 10% of the field in GP Columbus. And I wouldn’t at all be surprised if it was the most popular archetype at the GP. Zoo and Merfolk will make up about a fifth of the entire field. How has Zoo performed?
This graph shows you the performance of Zoo over time. Zoo is a generally strong performer, and an enormously strong performer in recent months. Its only below-50% performance was in Orlando, which can easily be discounted as an aberrant metagame. Its win percentage is extraordinarily high, not counting St. Louis, for which you’ll have to check out Jared Sylva’s article today.
Take a look at the Zoo regression line:
This is an example of how regression as a forecasting mechanism can be misleading. I don’t think that Zoo will decline going into GP Columbus. I think it will be about 10% of the field.
3) CounterTop — 11% of the Field
CounterTop isn’t always the most popular deck, but it has been from time to time, and was at SCG Philly.
As you can see, CounterTop averages around 10% of the field, even though it’s dipped as low as just over 6% in Seattle, and been as high as 12.4% in Philly and nearly as high in St. Louis. Notably, the Midwest is where CounterTop has the highest numbers: Indy, St. Louis, and just east of Columbus: Philly.
The strong performance of CounterTop Thopter in St. Louis will undoubtedly give this archetype a huge boost going into the GP. Like Merfolk and Zoo, I wouldn’t be surprised if this were the most popular archetype. In fact, this is my odds-on favorite to be the number one most popular archetype in the field. The problem, of course, is that it’s really multiple archetypes. The Thopter deck is very different from the Bant CounterTop lists or the Natural Order CounterTop lists. At the moment, this is the archetype I expect to show up in the greatest numbers. Zoo, Merfolk, and CounterTop will constitute nearly a third of the entire field!
This graph shows CounterTop’s performance over the last five SCG Opens, not counting St. Louis (again, check Jared Sylva’s article today to find that out). As you can see, CounterTop, as an aggregate archetype, is an inferior performer to both Zoo and Merfolk. Look at the St. Louis results in Jared Sylva’s article today to see if its win percentage went up or down, and by how much in either direction. I predict it went up, but the question is by how much. Did it break the 50% threshold? A 46% win percentage against the field is pretty weak.
Now take a look at the regression line:
The regression forecasts CounterTop at about 10% of the field in Columbus. I think it will be higher, and around 11%. But really, what’s just 1%?
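For readers who want to check the share arithmetic in this part, here is a small sketch. It uses only figures quoted in this article: Zoo's 225 copies in a field of 2200 at GP Madrid, and my predicted Columbus shares for Merfolk, Zoo, and CounterTop. The helper name `share` is made up for the example.

```python
def share(count: int, field_size: int) -> float:
    """Percentage of the field made up by `count` copies of an archetype."""
    return 100.0 * count / field_size


# Zoo at GP Madrid, using the numbers quoted above
zoo_madrid = share(225, 2200)

# Predicted Columbus shares (percent of the field) from the section headings
predicted = {"Merfolk": 10.5, "Zoo": 10.0, "CounterTop": 11.0}

top_two = predicted["Merfolk"] + predicted["Zoo"]
top_three = sum(predicted.values())

print(f"Zoo at Madrid: {zoo_madrid:.1f}% of the field")
print(f"Merfolk + Zoo: {top_two:.1f}% of the predicted field")
print(f"Merfolk + Zoo + CounterTop: {top_three:.1f}% of the predicted field")
```

The combined figures are simple sums of the predictions, so they inherit whatever error is in each individual estimate.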
4) Reanimator — 9% of the field
This chart shows the percentage of Reanimator in the field over the last 5 months. Keeping in mind that GP Madrid and SCG Richmond were held on the same date, you can see that Reanimator’s proportion of the field has grown by leaps and bounds in that time. While it peaked in Seattle, it also has had strong showings elsewhere. What’s important is that Reanimator was the most popular deck in both Seattle and St. Louis. At the same time, there were no Reanimator pilots in the Top 16 of St. Louis, despite being such a huge proportion of the field, and there were no Reanimator pilots in the Seattle Top 8, aside from LSV, who made 5th. More on this in part IV.
Take a look at Reanimator’s performance over time:
In Atlanta, Reanimator made a statement with a strong performance and multiple Top 8 appearances (despite averaging only a 52% win percentage against the field). However, its win percentage has precipitously fallen as it has become enemy #1, and a prime metagame target. Check out Jared Sylva’s article today to see how Reanimator did in St. Louis. I would expect that it performed about the same as it did in Seattle, if not slightly worse. But I wouldn’t be surprised if it dipped further.
Given Reanimator’s enormous popularity in recent tournaments, I actually expect Reanimator to be in the top 5, and possibly even in the top 3 most popular archetypes at GP Columbus. The banning of Mystical Tutor should slightly lighten the load in terms of direct hate. Although Reanimator loses a critical tool, it’s also relatively easy to build for many players who already own the critical cards. Players will find other ways to replace Mystical, with Ponders or Hapless Researchers, or cards like Strategic Planning and the like. I expect it to be about 9% of the field in Columbus, and certainly no less than 5%.
5) New Horizons — 6% of the Field
New Horizons first appeared in Atlanta, and just took off. Part of the reason for this is, as described above, its strong performance against ANT and Reanimator. It was the fifth most popular archetype in St. Louis, and I expect that its popularity will persist, despite what I expect to be a precipitous drop in performance. See Jared Sylva’s article today to see its final performance stat.
6) Goblins — 6% of the Field
Goblins will be in the top 7 most popular archetypes at GP Columbus. What’s important about Goblins is that it appears in large numbers in environments where people don’t play as much Legacy. Thus, it was huge in Orlando and Atlanta, and huge at GP Madrid, at 5.6% of the field. It’s closer to 5% of the field in the Midwest, but it’s still strong there too.
Goblins is a good performer. The problem, of course, is that Goblins struggles against Zoo. Check out Jared Sylva’s article today to see how it fared in St. Louis, but I predict you’ll see an uptick in its performance against the field. It’s got great game against Merfolk, though, and any extant New Horizons lists. I predict that it will be between 5% and 7% of the field in Columbus, and probably closer to 6%.
7) ANT — <1% of the Field in Columbus
So much for the claim that ANT is only played in Europe, eh? Aside from SCG Richmond, ANT as a proportion of the overall field is right in line with, if not greater than, its share at GP Madrid. What’s really interesting is that for its last hurrah, ANT dipped to 3% of the field, after peaking at almost 8% in both Philly and Seattle.
As you can see, though, ANT is a decent performer. It’s just above 50% in the last few tournaments. Jared will tell us today how it fared in St. Louis.
The important point, though, is that this deck will virtually disappear for the GP. Don’t get me wrong, there will be Tendrils combo decks, just not one built so prominently around Ad Nauseam. Goodbye ANT. You were unnecessarily murdered.
8) Dredge — 4% of the Field in Columbus
Dredge has consistently been among the most popular archetypes in Legacy, often on account of the fact that it’s not terribly difficult to build. That’s why it shows up in larger numbers in metagames like Orlando and Seattle. If GP Columbus were an SCG Open, I’d say expect 2.5% Dredge. Since it’s a Grand Prix, expect slightly over 4%.
If you are considering playing Dredge, this chart will dissuade you:
Ouch. Dredge is one of the worst performers in the entire metagame. Interesting that it went from a 60% win percentage to 30% in just a few months. Undoubtedly, this was partly a product of Reanimator splash damage, which, unfortunately for Dredge, will persist.
9) Belcher - 2.5% of the Field
People like to play Belcher at the SCG Opens, and it remains a constant 4%+ part of the field. It was only 1% of the field in Madrid, though. The field will be much larger in Columbus, so I expect all of the same people to play Belcher, but it won’t be as great a portion of the field.
Check out its recorded performance:
Like Dredge, Belcher is a pretty terrible performer. Don’t play it.
My GP: Columbus Metagame Predictions are as follows:
CounterTop: 11% of the field (at least half will be Thopter)
New Horizons: 6%
Tendrils decks: 2.5%
Show and Tell: 2.25%
Aggro Loam: 2%
B/G variants: 1.5%
The Rock: 1.5%
PT Junk: 1%
Painter variants: 1%
Dream Halls: 1%
Stompy variants: 1%
Suicide Black/Mono Black: 1%
B/W variants: 1%
Faeries variants: 1%
Team America: .5%
My GP Chicago metagame prediction was within +/- 20% of the actual field, and that was without any hard Legacy tournament data of the sort we have now with the SCG series. I won’t be surprised if my GP Columbus metagame prediction is within 10% of the actual field, but I would be surprised if it was less accurate than my GP Chicago metagame prediction. I think it will probably be about +/- 15% accurate, which is quite remarkable when you think about it. That’s better than most economists.
IV. The Banning of Mystical Tutor: A Critique
Here’s a striking fact: Since the inception of the Legacy format, no card naturally introduced into the format or legal when the format was created has been banned for power level reasons or on account of tournament dominance, until now. In that time (six years), only four cards have been banned: two because power-level errata was removed, enabling fast combos (Time Vault and Flash), one for logistical reasons (Shahrazad), and one that was introduced by Portal and was a near-functional equivalent of a card already on the banned list (Imperial Seal). This, despite numerous calls to ban cards over the years, from Sensei’s Divining Top to Lion’s Eye Diamond.
In my view, the banning of Mystical Tutor was unwarranted, unnecessary, and a mistake. In this part of the article, I will examine the evidence for the banning of Mystical Tutor, beginning with a comprehensive empirical review of the data. A close review of the data shows that Mystical Tutor decks were not dominant in terms of the field, metagame performance, or tournament Top 8s. In fact, they were mediocre performers. Next, I will examine the justifications articulated by the DCI in support of the banning of Mystical Tutor. Upon close inspection, these justifications are unsatisfactory and unpersuasive. The explanation offered by the DCI is highly formalistic and too reductionist, failing both to explain exactly why or how Mystical Tutor is now problematic, and to recognize that Entomb is a major reason why Mystical Tutor has been powered up. More importantly, the justifications advanced are so broad that they are limitless in terms of their potential applicability, and as precedent. Finally, I address claims that the data is faulty, and cannot be relied upon because the SCG tournaments were soft, and had too few skilled pilots to establish archetype strength. As I will show, these arguments are not only flawed, but unpersuasive on their own merits.
The purpose of the Legacy Banned List is known: to promote the fun and health of the format. Because everyone has their own idea about what makes Magic fun, the DCI has a tough job. Some people hate blue. Some people love it. Some like to attack with big creatures. Some people like to win with combo. Everyone enjoys different things in Magic, and there are no principled ways to decide between these subjective views of what makes Magic fun. The good news is that there are a few things that most people agree are important to making a format fun, and that can be objectively measured:
1) Formats with a dominant deck or strategy are unfun. Most players agree that formats strangled by a single archetype over a long period of time are not fun. This can be measured using tournament results.
2) Diverse formats are fun. Formats with many different archetypes to choose from and strategic options are more fun. This largely follows from the first point. Formats with a dominant deck are not diverse. Players can select decks more to their style and liking. This, too, can be easily measured by looking at the number and variety of archetypes making Top 8, or constituting more than 5% or 10% of the field, respectively.
The DCI is on safe ground when it manages formats using these principles, since it accords with virtually everyone’s understanding of what makes a format fun. When they manage the format according to other, less objective criteria of what makes a format fun, they risk making the format less fun. One person’s unfun is another person’s fun. Restricting cards based upon subjective criteria risks doing more damage than good; especially since there are often negative, unintended consequences to a restriction. Restricting a card because it’s ‘unfun’ may actually reduce the diversity in the metagame or help clear the way for a dominant deck. With these principles in mind, let’s take a look at both the prevalence and performance of Reanimator and Ad Nauseam Tendrils, the two primary Mystical Tutor decks.
This chart graphs the prevalence of Reanimator in each respective tournament over the last five months:
Reanimator had become one of the most popular decks in Legacy. For three of the last four SCG Opens, it was the most popular archetype in the field, peaking at around 13% of the total field in SCG Seattle. At only 13% of the field, Reanimator was slightly more prevalent than Zoo, CounterTop, and Merfolk (see Part III, above, for those stats), which are each around 10% of the field. Reanimator’s metagame presence actually enhanced format diversity, by adding another clear fourth pillar to the top tier trifecta of CounterTop, Zoo, and Merfolk. And its growth does not correlate with a decline in any other major archetype, as you can see by inspecting the trendlines of those archetypes in Part III. In fact, on the contrary, its growth has apparently contributed to the development and growth of another archetype: New Horizons, which has a good matchup against Reanimator. Those five decks (Reanimator, Zoo, Merfolk, CounterTop and New Horizons) collectively represented 48% of the field in St. Louis. In my view, that represents an extremely diverse field, when the top five archetypes in the field don’t even collectively make up half of the field. Conclusion: Legacy with Reanimator is an extremely diverse metagame, at least from the perspective of the field as a whole. Let’s look at Reanimator’s performance stats, both in the field generally and in terms of Top 8s, to see if that information suggests a different conclusion.
This chart graphs the performance (win %) of Reanimator against the field in each tournament for which this data has been collected:
Reanimator’s enormous growth in popularity has not coincided with an uptick in performance. Reanimator’s performance has ranged from an average 47.57 win percentage against the field in Indy to a 52.17 win percentage against the field in Atlanta, from which it has continued to decline. I suspect that Jared Sylva’s stats today will show similar trends for the archetype. For three of the five SCG tournaments so far, Reanimator’s overall win percentage against the field is under 50%. That win percentage places Reanimator well below most of the other major archetypes.
Fact: Of the 8 most popular archetypes in Seattle, Reanimator was the 6th best performing, in terms of win percentage against the field. Fact: Of the 8 most popular decks in Philly, Reanimator was the 5th best performing, in terms of win percentage against the field. Fact: of the 8 most popular archetypes in Atlanta, where Reanimator had its best top 8 performance, it was only the 4th best performing archetype, in terms of win percentage against the field. And so on.
In terms of average win percentage against the field, Reanimator is not only far from dominant, it’s mediocre, if not weak. Conclusion: Reanimator is not a dominant deck in terms of metagame win percentage, relative to other major archetypes. No serious argument can be made, on the basis of this data, that Reanimator was a dominant deck.
However, % of the field and average win percentage against the field are not the only measures of dominance or metagame performance. One argument that has been suggested is that Reanimator (or ANT’s) win percentage is being dragged down by weaker players. I address this argument more comprehensively in a section below, but I will note a few things here. First of all, Reanimator is not a difficult deck to play, regardless of what people say about ANT. Reanimator is a linear, straightforward deck. Second, the metagame performance measure is an average, meaning that good players doing well will pull up the average. As I say below, there is no reason to think that the average Reanimator players is any worse than the average CounterTop or Zoo player. And, even if we made that assumption, it doesn’t matter: it only takes one good player to prove otherwise. It only takes one good player to make a top 8 appearances. So, for example, if for some reason we felt that Reanimator was the best deck, but that it was too hard to play, then surely the few good players, such as Gerry Thompson or LSV, would have won the tournament with it, or at least made top 8. That would mean that we could and should also look at top 8 data as a third measure of dominance. Let’s do that.
This chart shows Reanimator as a proportion of the field and of Top 8s. As you can see, up through SCG Atlanta, Reanimator put a greater proportion of itself into the top 8 than its share of the field. After Atlanta, that trend reversed, despite Reanimator increasing in popularity. What’s most noticeable is the virtual absence of Reanimator from Top 8s after Atlanta, particular when juxtaposed against its increasing popularity in the field as a whole.
The argument that Reanimator players are just bad is unavailing because it only takes one non-bad Reanimator player to put a copy into the top 8. Not a single player could do that in Philly or St. Louis, and only LSV could do that in Seattle, where he finished 5th. That, despite the fact that there were record numbers of Reanimator players in the field (22 in Philly, 25 in Seattle, and 24 in St. Louis). Yet, not a single one out of those 71 players could make top 8 aside from LSV, not even Gerry Thompson! And, if it takes LSV to make a top 8, well, let’s just say that that’s not evidence that a deck is a problem.
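The arithmetic in that paragraph can be checked in a few lines. This is only a sketch: it uses the pilot counts quoted above (22, 25, and 24) and the single top 8 finish, the variable names are my own, and the total field sizes of the three events are left out because they are not restated here.

```python
# Reanimator pilots at the three most recent SCG Legacy Opens, as quoted in the text.
reanimator_pilots = {"Philly": 22, "Seattle": 25, "St. Louis": 24}

# Reanimator top 8 finishes at those events (only LSV, in Seattle).
reanimator_top8s = {"Philly": 0, "Seattle": 1, "St. Louis": 0}

total_pilots = sum(reanimator_pilots.values())
total_top8s = sum(reanimator_top8s.values())

# Each event has eight top 8 slots.
top8_slots = 8 * len(reanimator_pilots)

# Fraction of Reanimator pilots who converted their entry into a top 8.
conversion_rate = total_top8s / total_pilots

# Fraction of all available top 8 slots taken by Reanimator.
share_of_top8_slots = total_top8s / top8_slots

print(f"Pilots: {total_pilots}, top 8 finishes: {total_top8s}")
print(f"Conversion rate: {conversion_rate:.1%}")
print(f"Share of top 8 slots: {share_of_top8_slots:.1%}")
```

One finish from 71 pilots is a conversion rate of roughly 1.4%, and one slot out of 24 is roughly 4.2% of the available top 8 slots.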
Conclusion: Reanimator is not even remotely a dominant deck, when measured in terms of metagame performance, metagame presence, or Top 8 appearances. There is absolutely no data whatsoever to justify the banning of Mystical Tutor in terms of Reanimator’s tournament performance statistics.
The bottom line is that Reanimator was a top deck, and a strong performer, but that the metagame had been adjusting, and its performance fell accordingly. The DCI acted prematurely, if anything.
Now, let’s perform a similar analysis on ANT, which got whacked.
ANT’s percentage of the field over time:
ANT’s metagame presence grew steadily over time, peaking just before the banning at 8% of the field in Seattle. What’s interesting is how few people played it in St. Louis (just six). One could speculate that the reason for the dip in popularity in St. Louis is that players anticipated a surge in ANT hate, on account of Mystical’s banning and the final opportunity to play it in a large Legacy environment. That argument might seem plausible if it weren’t for the fact that the other Mystical Tutor deck, Reanimator, saw a surge in players under the same circumstances.
There is absolutely no evidence here that ANT was a dominant deck. Nor, if we combine ANT and Reanimator, is there any evidence that Mystical Tutor was a dominant tactic.
This chart shows you the combined metagame presence of both Mystical Tutor decks. While Mystical Tutor decks, as a portion of the overall field, peaked in Seattle at just above 20% of the field, that is far less than the presence of many other tactics, such as Tarmogoyf and Force of Will, and on par with Aether Vial.
This graphic shows the proportion of Mystical Tutor decks (as measured by the combined numbers of ANT and Reanimator) and the proportion of Aether Vial decks (as measured only by the combined numbers of Merfolk and Goblins, which undoubtedly understates the number of Vial decks in the field). As you can see, the two are roughly comparable in terms of metagame appearances.
ANT’s performance over time:
Like Reanimator, ANT was a top performer in Madrid. But following that performance, it has seen some good days and bad days in the US. It was very strong in Orlando, a metagame where there was a bunch of ANT and Dredge. However, it’s been weaker ever since, despite doing above average. Of the most popular archetypes, ANT was the 5th best performing archetype in the field in Seattle, and the 3rd most popular in Philly.
The bottom line is that ANT’s performance was perfectly normal, and within the range of any other Legacy deck. But what about top 8 performances? What if ANT was merely dragged down by weaker players?
As you can see from this chart, ANT put exactly two players into SCG Top 8s out of 7 different attempts, and out of those seven attempts, it ended up winning SCG Atlanta.
The two tournaments where ANT and Reanimator performed best were GP Madrid and SCG Atlanta. ANT won SCG Atlanta, and there were three Reanimator players in the top 8. I’m concerned that the DCI put too much stock into this event, since there was no SCG tournament until after it had decided to ban Mystical. Yet, the three subsequent tournaments have clearly established that neither deck is a problem. Whether measured by metagame presence, metagame performance, or top 8 appearances, Mystical Tutor decks are not remotely problematic. They aren’t dominating tournaments by any stretch of the imagination, nor are they reducing the diversity of the field. On the contrary, they appear to be increasing the diversity of the field, and thus making the format more fun.
Why, then, was Mystical Tutor banned? Last week in his column, Latest Developments, Tom LaPille explained the DCI’s decision. I will parse out his reasoning here, and then summarize the main points.
The Tutor Tiers
First, he recounts the history of the Legacy Banned list, and the idea, put forward by Aaron Forsythe when the first banned list was developed, that Mystical Tutor was a ‘tier two’ tutor, and that only ‘tier one’ tutors would be banned in the format. Thus, Demonic Tutor, Demonic Consultation, and Vampiric Tutor would all be banned while Mystical Tutor and Enlightened Tutor would not. In light of the performance of Flash decks, and then both ANT and Reanimator at GP Madrid, Tom concludes this section by stating that the DCI has revised its view of Mystical Tutor, and considers it now to be a ‘tier one’ tutor.
Let me just say that this whole idea of ‘tutor tiers’ is silly, if not absurd, especially if we take it seriously in terms of a possible banning. Tutors don’t inhabit tiers like books sit on a shelf. The idea of tiers of tutors is just conceptual framing, and has only the loosest connection to reality. Tutors wax and wane in power and utility in different contexts and metagames. Gamble can get any card in the format for one mana. In Loam decks, it’s arguably better than Demonic Tutor. To reason that Mystical Tutor can be banned because it’s now a ‘tier one’ tutor begs the question: why is it now a tier one tutor? Stating, in conclusion, that Mystical Tutor is now a tier one tutor, is not a satisfactory explanation for banning it.
As I’ve written about before, ‘power’ is a folk taxonomy: “What we call power is actually the sum of a card’s synergistic interactions in the card pool relative to other cards.” The same could be said about Tom and Aaron’s ‘tier’ concept. Tom’s reasoning is fiercely formalistic and highly reductionist. The idea that a card can or should be banned because it jumped tiers is nothing less than abstract, formalistic reasoning that could be deployed in support of any position.
Moreover, it’s reductionist because it doesn’t ask why Mystical Tutor is more powerful now. It considers Mystical Tutor in isolation, as if that’s possible. I can answer this question though: the reason Mystical Tutor is more ‘powerful’ is because of two things: 1) the unbanning of Entomb, and 2) the printing of Ad Nauseam. Neither Entomb nor Ad Nauseam existed in the original Legacy card pool. Thus, Mystical Tutor was powered up, but it wasn’t because it was now a tier one tutor in some abstract sense; there were two key synergies that powered it up. More on this below.
Next, Tom talks about testing Reanimator and ANT decks in the Legacy Online format. And while he acknowledges that the ‘practice room’ hardly resembles a tournament environment, he nonetheless states that opponents could not beat his decks without heavy sideboard hate or maindeck set up.
No one understands better than the folks at Wizards how different the expectations and experience of people who make Magic are from the experience of large tournaments played all over the world. How often does the Future Future League actually accurately predict a forthcoming Standard? Never, I would expect. Yet, that’s exactly what the DCI is apparently doing here: using test matches as a basis for DCI policy. Years of experience as developers have taught Wizards, I would hope, to be skeptical about such an epistemic practice. Magic metagames are large dynamic systems, and a few players, even highly skilled and highly qualified ones, are not even remotely sufficient to make precise judgments about them. That’s why the most effective basis for managing the Banned and Restricted list is tournament results, not individual player results, even if that player is Tom LaPille.
The Gentleman’s Agreement
“Mystical Tutor decks were quite rare at Legacy tournaments that did not have tons of money on the line.”
First of all, this is verifiably untrue. There are numerous recorded local tournaments on the Source and elsewhere where there were plenty of Mystical Tutor decks in the field or making top 8. Such a statement requires evidentiary support, which I believe the DCI is lacking. Most TOs don’t send their decklists (if they even collect them) to Wizards headquarters, so I’d really like to see what data they are referencing.
Second, and more importantly, even if this were true, it’s not relevant. This only matters if we further assume that when people do play their Mystical Tutor decks, then they just outperform everything else. That’s where this assertion links up with the discussion of online testing.
Yet the data I went through at the beginning of this part of the article, documenting the performance of both Reanimator and ANT over the last five months in metagames where Mystical Tutor decks are heavily played, starkly refutes this assumption.
Tom says that players who abided by the gentlemen’s agreement had more fun, that they “were experiencing a better variety of decks…” This is a broad claim that is not supported by any evidence whatsoever. Moreover, I would expect exactly the opposite. Mystical Tutor decks as a part of the field increase the variety of the decks in the field. And, as we know, players enjoy diverse fields. A diverse format is generally a fun one.
Also in this regard, Tom claims that people who abided by the gentleman’s agreement also experienced a “higher quantity of recognizable baseline Magic gameplay.” Beyond being inherently vague and patently indefinable, my problem with this statement is twofold: first, it assumes that this means that a format is more fun. In my experience, one of the great things that makes Magic fun is the unintended interactions that emerge out of large cardpools. People play Legacy because they enjoy that card pool. Second, it’s too broad to be a standard for managing the Legacy Banned List. Legacy players should count their blessings that the DCI has never employed such a standard before, otherwise who knows what the Legacy banned list might look like. These vague notions can be deployed in support of any banning, and are not indicative of a genuine problem.
The whole idea of a gentleman’s agreement is not only factually wrong, but it’s irrelevant. Hypothetically, even if there were a gentleman’s agreement, that isn’t a justifiable basis for banning: If there were such a thing, then in the environments where people abided by it, Mystical Tutor didn’t see play, so Mystical Tutor was not a problem. And in the environments where the ‘gentlemen’s agreement’ didn’t exist or was violated, Mystical Tutor decks weren’t too good and therefore weren’t a problem either (see the Evidence above). Either way, the notion of a gentlemen’s agreement does not justify banning Mystical Tutor. That’s why this concept has rightly been derided by most Legacy players. Tom’s tier concept, the notion of a gentlemen’s agreement, and his highly suspect, anecdotal online practice room testing are not even remotely satisfactory as a justification for banning, and in all of these cases, logical and evidentiary support is lacking, if not entirely absent.
Next, Tom turns to the question: why not just put Entomb back onto the banned list? He acknowledges that this is a good question, but then defends the original decision to unban it, stating that they don’t think unbanning it was a mistake at all. What’s interesting here is how reductionist this analysis is. Mystical Tutor was legal in the format for years, and while it was used in the Flash deck, most players regarded Storm combo decks as either innocuous or fair. Decks like Iggy Pop or The Epic Storm generated interest, but never won more than their marginal share of tournaments. I regard Mystical Tutor as being powered up by two specific events: 1) the unbanning of Entomb, and 2) the printing of Ad Nauseam, and (1) is clearly the more significant, since it was the proximate event, and Ad Nauseam has been legal for two years now. Viewing Entomb and Mystical Tutor as distinct and separate questions is the essence of what Tom’s doing here, and it’s falsely reductionist. Instead, these two cards interact and power each other up.
Then, and perhaps even more troubling, is this statement in defense of the unbanning of Entomb: “We think it’s cool that Reanimator is a deck.” But it’s not cool that there is a good Tendrils storm combo deck? If the DCI banned Mystical Tutor instead of Entomb because it likes Reanimator, how is that not purely arbitrary? On what neutral principle can the DCI be said to have chosen between ANT and Reanimator, knowing that banning Mystical Tutor will hurt the ANT decks a lot worse than the Reanimator decks? There is none. Increasingly, this makes the DCI’s decision to unban Entomb appear to be the real problem. Yet, not surprisingly, the DCI refuses to recognize it as such. As a side note, I couldn’t understand why the DCI unbanned Entomb, when there were at least a dozen safer targets on the banned list.
Ultimately, the DCI’s reasoning, as presented in Tom’s article, is unsatisfactory. At best, it’s too abstract, formalistic and reductionist. It’s at such a high level of generality that it could be used to support virtually any decision, as Matt Elias’s satirical take last week nicely demonstrated (mirroring an approach I used with the 2008 Vintage restrictions here). At worst, it’s troubling that the DCI perceived some sort of ‘gentlemen’s agreement’ and disturbing that Magic Online testing of Legacy, a format that is significantly different from its paper counterpart, was even relevant to the DCI’s decision-making process, let alone the use of a practice room as data.
Ignoring the SCG Results: Do the SCG Results Count?
A few players have explicitly argued or subtly implied that we should discount the SCG Open results, which unequivocally show that Mystical Tutor decks are not a problem. Kyle Boddy, for example, stated:
I would not take the results from the SCG Open series tournaments too seriously. I would discount their results dramatically when compared to large WotC-sponsored tournaments like GPs/PTs.
The SCG Opens are simply not wholly representative of the professional Magic circuit, and to use them as the sole basis for bannings is silly.
This is not a new argument. A few months ago Max McCall argued that ANT and Dredge were the best decks in the format. More recently, he said:
It was pretty clear that Reanimator was the best deck in the format, with Tendrils being probably second best, depending on how many people were playing Counterbalance at the time.
Such a statement is only true if you believe him, and not your lying eyes, to paraphrase Richard Pryor.
How could he make such a claim when the data is sharply contrary? According to Max and others, the reason that SCG tournaments don’t substantiate these claims is because the player skill level was so low. Other anecdotal evidence included reports from LSV and other Pros that the SCG Legacy Opens were by far the ‘softest’ tournaments they’ve played in. Max renewed this argument in his article two weeks ago:
I’m firmly in the camp that if you’re not level five or higher, you’re probably pretty bad at Magic, but whenever I make the argument that the results of the StarCityGames.com Open don’t really reflect the power of certain archetypes because of how “bad” the Open players are, I’m told that I just have anecdotal evidence but no real proof.
Now, I spent most of the last year in a computer lab doing statistical analysis. I believe in the value of empirical data, and I know how pernicious anecdotal evidence can be. So I was pretty hesitant to just make a blanket claim that everyone in the U.S. metagame was garbage and that the lack of tournament data to document the dominance of certain archetypes didn’t reflect how good (and bad) some archetypes really were.
And yet, in the next few paragraphs, Max presents nothing but anecdotal data. How can you claim to be against anecdotal data, and then rely on it? I will share his anecdote in a moment, but I want to address the core claims: Are SCG players worse? And, if so, does that matter?
Knowing Tom LaPille, I suspect that he either agrees with Max or is sympathetic to that perspective. Max’s claim, like that of others who share his view, is founded on one of two assumptions:
1) Either the Mystical Tutor players, on average, are worse than the pilots of other archetypes on average (unless you have level 5+ in the field, of course)
2) Mystical Tutor decks are harder to pilot than other decks, and require a certain density of pros to play them.
Both assumptions are similar, but either one supports the argument Max is advancing. Sometimes one is explicitly made, and sometimes both.
The truth is that neither assumption matters, for a very, very simple mathematical reason.
Suppose you have a 200 player tournament, as many SCG Legacy Opens are. Then, suppose there are 40 Mystical Tutor players, and that only 1 of those 40 players is ‘good.’ It only takes that one ‘good’ player to win the whole tournament, if their deck is good enough. Thus, the relative skill level of the Mystical Tutor players to the entire field doesn’t matter whatsoever as long as you have at least one ‘good’ player. Mathematically, you only need one Mystical Tutor deck and one Mystical Tutor player in a field to have that Mystical Tutor deck win that tournament.
Thus, the player skill distribution (the mathematical curve that represents the skill level of the archetype pilots) could be way, way lower for Mystical Tutor pilots than for other archetypes, and that would still not prevent Reanimator or ANT from dominating the tournament. After all, it only takes 1 player in a top 8 to constitute 12.5% of that top 8. And it only takes 2 to make it 25% of Top 8s, which is far more than enough to show up in lots of trend data. If these decks were that good, and since they are played in large enough numbers, Max’s claim would have to be that all of those players are bad, which is just implausible.
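To put numbers on this, here is a minimal sketch. The top 8 share is simple division. The second function is my own illustrative baseline and not something drawn from the SCG data: it assumes every player in the field is equally likely to make top 8, which gives the number of top 8 slots an archetype would be expected to take on popularity alone. The 200-player field and 40 Mystical Tutor pilots are the hypothetical figures from the example above.

```python
def top8_share(copies: int) -> float:
    """Fraction of a single top 8 occupied by `copies` of an archetype."""
    return copies / 8


def expected_top8_copies(pilots: int, field_size: int) -> float:
    """Expected top 8 copies under a hypothetical equal-skill baseline.

    Assumption (mine, for illustration): every player in the field is
    equally likely to finish in the top 8, regardless of deck.
    """
    return 8 * pilots / field_size


for copies in (1, 2):
    print(f"{copies} copy/copies in the top 8 = {top8_share(copies):.1%} of that top 8")

# Hypothetical field from the text: 200 players, 40 on Mystical Tutor decks.
baseline = expected_top8_copies(pilots=40, field_size=200)
print(f"Equal-skill baseline: {baseline:.1f} Mystical Tutor decks per top 8")
```

Under those assumed figures, popularity alone would put between one and two Mystical Tutor decks into each top 8. A deck that is truly too good should land above that kind of baseline, even if most of its pilots are weak.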
And even if that were true, it would further have to be true that these players are worse than other archetype pilots, as a class. Yet, there is absolutely no non-anecdotal evidence whatsoever that the skill distribution (that is, the bell curve showing the distribution of pilot skill by archetype) is statistically significantly different than the skill distribution of players of any other archetype.
Visually, there is no reason to think that the skill distribution of Reanimator pilots vis-à-vis Zoo is like this:
That is, if we imagine that every set of archetype pilots has a normal distribution curve (a typical bell curve, with an average and measurable standard deviations), or something that resembles one like the Zoo curve above, there is absolutely no evidence that those curves are shaped differently for some archetypes than for others. There is no statistically significant evidence that, as a class, Reanimator pilots have a differential skill distribution.
The evidence put forward to advance such claims is evidence like this (purely anecdotal), from Max’s article:
At the Seattle Open, in the 4-1 bracket, I played the Reanimator mirror. My opponent discarded Iona, Shield of Emeria to Careful Study, and was shocked when I cast Reanimate on it. He was totally unaware that Reanimate could target creatures in any graveyard. I’ve been unable to attend other large U.S. tournaments, but friends of mine have reported that the quality of play in the Legacy Opens is unbelievably low compared to a PTQ. Certainly, we’re all familiar with the gaffes shown on the ggslive coverage; the Lands player who got a game loss for procedural error in a game 3 that he couldn’t possibly have lost otherwise, and the famous “Dark Ritual, go” games spring readily to mind.
Indeed they do readily spring to mind. And that is exactly my final point:
Errors with Mystical Tutor decks are perceptually salient and more readily encode in memory. I’ve made this point on the Source, but it wasn’t well understood. Let me put it in common English: When people screw up with Mystical Tutor decks, it stands out, and it’s obvious. It stands out both because it’s so obvious and because it’s usually proximate (that is, close in turns/time) to the end of the game. As a result, these errors encode more easily in memory.
Let’s contrast two game-ending errors that will make my point:
ANT player goes: Dark Ritual, go. After counting their mana and storm, they realize that they can’t go off this turn, and pass the turn. For lack of that Dark Ritual, they lose the game.
Zoo player fetches out Savannah instead of Taiga on turn one, plays turn one Nacatl, turn two Tarmogoyf, but can’t play two burn spells on turn four and a Fireblast to win the game on turn five because they only have one Mountain in play.
Both the ANT player and the Zoo player made a key mistake: the ANT player played Ritual at the wrong time, and the Zoo player fetched out the wrong dual land on turn one. Both plays directly led to a game loss. Yet, which one stands out? The casual observer may not even notice the Zoo player’s mistake, which was egregious, and no less costly. Yet the Dark Ritual play stands out like a sore thumb. That’s what I mean by perceptual salience.
When a Zoo player plays Lightning Bolt on turn one instead of Wild Nacatl, people don’t stand up and say: God, Zoo players are so bad! Yet, a similarly egregious error with ANT creates a perception that ANT players, as a class, are terrible.
This salience actually leads to the stereotype that ANT is harder to play, or that combo pilots, as a class, are worse than other pilots. Decades of psychological research on the relationship between stereotyping and salience confirm this.* The more salient a stimulus, the more information about it we are likely to perceive, encode, and store in memory. This is actually how stereotypes are formed, and explains why certain Magical stereotypes (such as those about ANT) arise in the first place. If mistakes with ANT are more salient, they are more likely to be remembered and contribute to stereotype formation.**
These stereotypes then produce confirmation bias, which is the phenomenon wherein information that confirms the stereotype (or the schema) is more readily encoded (stored in memory) and recalled, whereas instances that are contrary to one’s stereotypes are not as readily encoded or recalled, yet are no less prevalent. Thus, the stereotypes (or schemas) that we create as a function of salience affect our perception, memory encoding, retention, and recall. A more systematic review of the data will disprove these assumptions that Max and others subscribe to.
Moreover, Max’s argument is unavailing for another, simpler reason: if most players are terrible, then this applies equally to Mystical Tutor players and non-Mystical Tutor players alike. Skill then cancels out, and the SCG results are therefore valid. Either way, whether we buy his assumptions or not, the SCG results matter, and they are telling us that Mystical Tutor decks were not a problem.
I know that this is a difficult thing to talk about, especially with friends or colleagues that may disagree. Let me distill my main points into three talking points:
1) Neither Reanimator nor ANT was a problem. Statistically, they were not too good. The metagame had adjusted, and the performance data proves this.
2) There are bad players who pilot every archetype, and there are good players that pilot every major archetype. There is absolutely no non-anecdotal evidence that Mystical Tutor pilots are worse on average than any other group of pilots. If Mystical Tutor decks were too good, the good pilots would have proven it.
3) The reason that ANT pilots seem worse is 100% a function of cognitive bias. Errors with ANT and Reanimator are more obvious and proximate to the end of the game, and therefore more memorable.
Why does the banning of Mystical Tutor matter? What’s the harm?
One of the things I really liked about Legacy is that every deck had a bad matchup. This feature of the format promotes balance. Zoo was weak to ANT, but strong against Merfolk. Merfolk was good against Reanimator and CounterTop, but weak to Goblins. And so on. Whenever you take out a card like Mystical Tutor, which supports spell-based combination decks, you take out a potential metagame answer. Every metagame player has a role, including Burn. These decks keep other decks honest. While I don’t think that Legacy is going to become an imbalanced format as a result of this banning, it’s not healthy to take out answers that can help the format adjust if something were to go wrong. It’s like taking away some of the format’s natural antibodies, which it would need if a virus were to sweep through the format. Diversity is a natural strength, and reducing format-wide diversity with this banning has the potential to harm the format in the long run. This kind of decision also sets a bad precedent in this regard. The DCI should be humble both about its ability to correctly manage the Banned and Restricted List and in recognizing that such management can have unintended consequences. That’s why the DCI should only ban cards if there is strong evidence to support it.
Until next time…
P.S. I want to thank Jared Sylva for his hard work, which I’ve built upon here. Also, check out his article yesterday.
* See, for example, Shelley E. Taylor & Susan T. Fiske, Salience, Attention, and Attribution: Top of the Head Phenomena, in 11 Advances in Experimental Social Psychology 249 (Leonard Berkowitz ed., 1978).
** Importantly, the more salient a stimulus is, the more likely we are to use visual rather than verbal processes to encode it in our memory. This matters because evidence suggests that visually encoded information is recalled more readily than verbally encoded material. See the study cited just above.