Sullivan Library – The Truth About Standard

Read Adrian Sullivan every week... at StarCityGames.com!
Thursday, December 17th – The Standard metagame seems to be swinging away from the Jund-dominated format it was mere weeks ago. While Jund is still extremely strong, speedy Red strategies (and others) appear to have the upper hand in those matchups. Adrian dissects the States results to form a clearer picture of the metagame going forward…

The truth about Standard is that it is only now beginning to be “solved.” Put another way, the “Hive Mind” is only just now coming up with the decks that can put up a fair fight against the format’s presumed king, Jund.

Jund has an incredible head start in the war. Coming off the back of Pro Tour: Honolulu, Jund decks were one of the clear victors, and the porting of the Block Jund deck to Standard was fairly easy. After the rotation of Lorwyn, Jund decks shifted ever closer to this Block base. For many decks, Lightning Bolt and Garruk Wildspeaker are the only non-Block cards in the deck. While there is a fair amount of variation, the essential Shards base of the list (Bloodbraid, Broodmate, Leech, Thrinax, Blast, Terminate, Blightning, Pulse) is just so easy to riff on that every other deck is put into a position where it really needs to be in great shape for a fight if it wants to have a chance.

And so what is interesting is that other decks are finding their footing.

Problems with the Data Set

If we look at the results from States (yes, I’m aware that it is being packaged as “The 2009s,” and I’m aware that the Provincials exist too, but I’m going to be stubborn), there are real, clear limitations in the amount of data that we can actually extract. (Note: all my comments on data concern what was available at the time of writing. More data would, of course, require a retooling of the results.)

The obvious question is the problem of representation. We have literally no idea how many people played which decks. What we have currently is a list of Top 8 competitors, but not an exhaustive one. So, for example, we know that Turbo-Fog had 11 players play it to a Top 8 finish. But how many players played Turbo-Fog overall? There is a similar problem with every measure. Take Jund: 78 players in the Top 8s played Putrid Leech, and 35 players did not. It’s entirely possible that the same or a similar number of players played non-Putrid Leech and Putrid Leech lists. If this were the case, one could draw very clear conclusions about the viability of Putrid Leech in Jund in the current metagame at large. What you don’t want to do is make the kinds of stunning misconclusions that were so common during the latter days (after multiple card bannings) of Trix, where pros swore up and down it was the best deck, but if you actually looked at the data, you could see, plain as day, that Trix was underperforming based on its player count (particularly when you took into account just how many of the very best players were playing it). Without full access to this data, one has to go with what I’ve long termed the “Winner’s Metagame”.
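To make the representation problem concrete, here is a minimal Python sketch. The 78 and 35 are the Top 8 counts quoted above; the field sizes passed to `compare` are purely hypothetical, since nobody has published how many players registered each version of Jund.

```python
# Top 8 appearances reported so far (from the States data quoted above).
TOP8_LEECH = 78
TOP8_NO_LEECH = 35


def top8_rate(top8_count, field_count):
    """Fraction of a deck's pilots who reached the Top 8."""
    if field_count < top8_count:
        raise ValueError("field cannot be smaller than its Top 8 count")
    return top8_count / field_count


def compare(field_leech, field_no_leech):
    """Return (Leech rate, no-Leech rate) for HYPOTHETICAL field sizes."""
    return (
        top8_rate(TOP8_LEECH, field_leech),
        top8_rate(TOP8_NO_LEECH, field_no_leech),
    )


# Scenario 1 (hypothetical): equal numbers of each version were played.
equal_field = compare(400, 400)

# Scenario 2 (hypothetical): Leech lists were four times as popular.
skewed_field = compare(800, 200)

for label, (leech, no_leech) in [("equal", equal_field), ("skewed", skewed_field)]:
    better = "Leech" if leech > no_leech else "no Leech"
    print(f"{label}: Leech {leech:.1%}, no Leech {no_leech:.1%} -> {better}")
```

The same Top 8 counts favor Leech in the first scenario and the Leech-free lists in the second, which is exactly why the missing field data matters.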

The Winner’s Metagame is basically what it sounds like: it is a representation of what the metagame looks like if you’re doing well. In any real metagame, there are actually numerous metagames going on, some of which you can’t really quantify. How do you quantify the various metagames of players who haven’t put in the work to know the format? Or the players that have misunderstood the format? Or the players that outthink themselves in their metagame version of The Princess Bride? About the only way you can really capture these things is to ignore the question of reason, and instead recognize that there are various strata in metagames. The most common ones that matter are the decks that are winning (The Winner’s Meta), those just below the surface (classically called “The Jungle”), and the metagame at large. You need to be aware of the metagame at large, in most events, because you’ll have to play against something in those initial rounds, even if the field has been slightly sifted by, say, the presence of byes. You want to be aware of the Winner’s Metagame so that you can know what to expect if you’re doing well. And you need to be aware of The Jungle so that you don’t get dragged down into Loserville because you got paired against the most popular deck that isn’t quite there.

As I said, though, without numbers on each group that played in the entire tournament system, the best we can do is the Winner’s Metagame. There is still a lot of useful information here, though. With the information from a Winner’s Metagame, you can get a sense of which archetypes are worth paying attention to, and how much attention to pay them. With a Winner’s Metagame, you can try to understand just what kind of opposition you can expect to face if you do make the single-elimination rounds.

Another problem that we have is the question of data relevance. In the field of statistics, the concept of meta-analysis deals with the idea of taking samples from numerous diverse systems. Take, for example, the world of politics: Nate Silver of FiveThirtyEight.com successfully predicted the results of 49 out of 50 states in the 2008 Presidential Election, an unprecedented achievement. He did this by combining polling data from an incredible number of different sources, each of which may have used any methodology it wanted. From this hodge-podge of data, by weighting each source, a much more accurate analysis can be made.

In the realm of Magic, we can look at it in this way: it is probable that a tournament with more rounds of Swiss is going to give a more accurate representation of which decks are “best”. Further, simply by having a larger pool of players, the probability of “best decks” being “found” increases as well (even as there is increased variance in possible results). Unfortunately, with the data, there is no way to tell just how many people competed in a particular event.

So, for example, I’m willing to bet that Turbo-Fog’s three victories (in North Dakota, Tennessee, and Wyoming) and Vampires’ victories (in Hawaii and British Columbia) are not as good evidence for the quality of those decks as wins in areas that are likely to be heavily attended. (The Tennessee event is the only one likely to have had a larger attendance, if evidence from PTQs is any guide.) Conversely, I’m willing to place far greater confidence in data that comes from areas that had large attendance, such as the event I attended in Wisconsin. Without access to this data, what we do end up getting is a dilution of our theoretical Winner’s Metagame with smidges of a more general metagame and with the Jungle. In practice, what this means is that certain deck data just needs more skeptical consideration.

Finally, there is the problem of representation at large. With small sample sizes for some decks, we get a very minimal sense of their actual performances. Take Shaheen Soorani’s Blue/White Control deck, which he piloted to a Top 8 at States. Currently, the data doesn’t show any other players who made Top 8 with the list anywhere. It does not stand to reason that the EV for a player playing his list in a Top 8 would be the same result as his. On the other hand, for the more represented decks, we can have quite strong confidence.

All of the caveats aside, there really is a lot of rich evidence to look at.

The Gist

If we look at the data thus far, the first place to start is just archetype counting. In many cases, I merged similar archetypes, for the ease of analysis (and as a means to combat small sample size problems). Here are the counts:

• Jund (total): 113 — 16 wins
o Leech: 78 — 10 wins
o no Leech: 35 — 6 wins
• Boros Bushwhacker: 24 — 2 wins
• Naya Lightsaber/Naya Aggro: 23 — 1 win
• Red Blitz: 17 — 5 wins
• Green/White(/x) Aggro: 17 — 1 win
• Bant: 17 — 1 win
• Vampires: 16 — 2 wins
• Eldrazi Green: 12 — 1 win
• White(/x) Aggro — 1 win
• Turbo-Fog: 11 — 3 wins
• Grixis Control: 5 — 1 win
• Spread ‘Em: 5 — 1 win
• Valakut Ramp: 4 — 0 wins
• Other: 23 (each with 1% metagame representation or less) — 2 wins
o Green/Blue Eldrazi, Naya-Jund Cascade, Big Red, Red/White Control, Esper, Summoning Trap, Magical Christmasland, Emeria Enchantress, Barely Boros, Goblins, USA Control (not Governator), Mono-White Control, Soorani Blue/White Control, and Unearth.

In the merging of lists, there are a couple of comments to make:

• I merged some non-Lightsaber Naya lists with Naya Lightsaber because, despite a few minor differences that clearly marked them as not being of the same archetype, I felt they were fundamentally similar enough to include together.
• I’m calling all of the Red decks that are full of haste creatures “Blitz”, in homage to the old name that Sligh decks were given once Fireblast and Ball Lightning started seeing play in the same deck. “Red Deck Wins” has always employed a semi-land control element to me that is lacking in many of these builds, but they are still incredibly similar, even if there is a great deal of variance in these decks.
• There are a few decks listed as “(/x) Aggro.” In these lists, they are fundamentally similar enough that I don’t mind lumping them together. Take the so-called “Junk” decks; these decks are usually so incredibly close to the White/Green Aggro decks that they might as well be undifferentiated. For many of these decks, running, say, two Black spells doesn’t change their performance significantly enough (particularly when compared to those decks that run, say, Oblivion Ring) that I feel as though anything useful is accomplished by separating them. Compare this to the Bant decks, which often have enough card slots that they really do look like different archetypes. Any list that was significantly different enough to be given its own archetype listing was given its own listing. Similarly, a list given the name “Mono-Black Control” looked to me like it was merely a more controlling Vampires list, and so I lumped it with Vampires (the existence of some non-Vampire creatures notwithstanding).

That said, there are a few conclusions that just jump out.

Jund was a powerhouse, unmistakably. Without knowing how many players showed up with Jund, we can’t really make any comments on the deck’s penetration. We can, however, recognize that it made up nearly 40% of the winners. That is crazily impressive. Cue the calls of outrage.

But wait! Before we get too carried away with ourselves, let’s try to put this in some perspective. In some ways it is worse than Faeries, who were horrifyingly bad. If we compare one of the last sizable events for Faeries (the reporting on City Champs, last year), we’ll note that they had a similar share of wins: Faeries won 45% of the events, as compared to the 43% of what we have reported thus far with Jund. On the other hand, Faeries was played by far fewer people; if you made the Top 8 with Faeries, you were 24% likely to win it all, as compared to only 14% likely to win it in Top 8 with Jund. In other words, at the top levels of play, Jund is only a little more than half as likely to succeed as Faeries. Furthermore, given that not everyone has learned which decks are good against Jund yet, it’s probable that, as the metagame disperses, the representation of Jund will continue to decline.
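For anyone who wants to check the arithmetic, here is a short Python sketch that recomputes the Jund figures from the archetype counts listed earlier. The Faeries conversion rate is simply the 24% quoted above; its underlying counts are not given here, so it is taken as-is rather than recomputed.

```python
# (Top 8 appearances, wins) for Jund, from the archetype counts above.
JUND_TOP8, JUND_WINS = 113, 16

# Wins listed for every archetype row above, in order, including "Other".
ALL_WINS = [16, 2, 1, 5, 1, 1, 2, 1, 1, 3, 1, 1, 0, 2]

# Faeries' Top 8 conversion rate as quoted in the text (City Champs).
FAERIES_CONVERSION = 0.24

# Share of all reported events that Jund won.
jund_share_of_wins = JUND_WINS / sum(ALL_WINS)

# Chance that a Jund deck in the Top 8 went on to win the event.
jund_conversion = JUND_WINS / JUND_TOP8

# How Jund's conversion rate compares with the quoted Faeries rate.
relative_to_faeries = jund_conversion / FAERIES_CONVERSION

print(f"Jund share of wins: {jund_share_of_wins:.0%}")
print(f"Jund Top 8 conversion: {jund_conversion:.0%}")
print(f"Jund vs. Faeries conversion ratio: {relative_to_faeries:.2f}")
```

Run as written, this gives roughly 43% of wins, a 14% conversion rate, and a ratio of about 0.59 against Faeries.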

Certain archetypes seem to be somewhat lackluster. I can’t speak for every local metagame, but I know that I saw Barely Boros and Unearth decks all over Wisconsin, and I heard reports of their being played widely in numerous locations. That said, they only managed two Top 8 finishes each. Barely Boros somewhat makes up for this by also managing to win one of the most competitive States in the nation, Ohio, a state that routinely pulls in monstrous attendance. It is possible that there may yet be some life in Barely Boros, but if you ask me, it’s probable that this archetype is merely getting in some representation based on the math of numbers, and your better bet would be to go with a non-Vengeant Red deck.

Turbo-Fog put up some truly amazing performances, given its Top 8 numbers, but, as I noted earlier, we do have to put those in some degree of perspective, given the potential for heterogeneous fields.

With that in mind, let’s take note of the larger picture:


The areas marked in yellow are the placings in which a particular archetype was more heavily represented. The final column (with some areas marked in red) represents the EV finish for someone choosing a particular archetype.

Jund clearly has something interesting to say. While running Leech or not has little to do with the EV of your finish, No-Leech decks definitely are more likely to win a tournament, and Leech decks are likely to be close. Overall, Jund outperforms the average EV for winning by nearly two percentage points, so Jund is a great choice if you want to win a tournament. No surprise there.

What might be a surprise is that it isn’t the best bet in the top echelons of the metagame. Both Grixis Control and Spread ‘Em give a better shot at victory. Now, these numbers are deeply flawed because of small sample size, but they are still worth noting. Turbo-Fog has a whopping 27% chance of winning the whole thing if it makes Top 8 (with the caveats noted from before).

Probably the real story though is Red Blitz decks. Not even counting the Barely Boros deck, hasty Red was nearly 30% likely to take home the gold if it made the Top 8. That’s more than twice as likely as Jund. Metagamers, take note.
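To put a rough number on the small-sample caveat, the sketch below recomputes the Top 8 conversion rates from the archetype counts above and attaches a 95% Wilson score interval to each. The Wilson interval is my own addition for illustration, not something used in the original analysis, and it assumes each Top 8 appearance is an independent trial, which is a simplification.

```python
import math

# (Top 8 appearances, wins) from the archetype counts above.
RESULTS = {
    "Jund": (113, 16),
    "Red Blitz": (17, 5),
    "Turbo-Fog": (11, 3),
    "Grixis Control": (5, 1),
    "Spread 'Em": (5, 1),
}


def wilson_interval(wins, n, z=1.96):
    """Approximate 95% Wilson score interval for a win proportion."""
    p = wins / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Print decks from highest to lowest observed conversion rate.
for deck, (top8, wins) in sorted(
    RESULTS.items(), key=lambda item: item[1][1] / item[1][0], reverse=True
):
    low, high = wilson_interval(wins, top8)
    print(f"{deck:15s} {wins / top8:5.1%}  (95% interval {low:.0%} to {high:.0%})")
```

The intervals for the five-player archetypes come out far wider than Jund’s, which is the small-sample problem in numerical form: the point estimates are suggestive, but they leave a lot of room for luck.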

“The Jungle” in this metagame looks something like this: Boros Bushwhacker, White(/x) Aggro, Jund with Leech, Green/White(/x) Aggro, Bant, and Valakut Ramp. In essence, these are the decks that tend to fall just under the exemplary decks when they are playing against top decks. If you start to falter, you’re almost certainly going to fall into the arms of these decks.

Compare this to the real losers of the pack: Vampires, Eldrazi Green, Naya Lightsaber, and “anything else,” all of which were massively more likely to result in an early exit from the single-elimination portion of the tournament. For the most part, those three named decks are absolutely “real” decks; they just don’t seem to perform at the level that is needed if you want to have a good EV. It’s also worth noting that all of these (and Valakut Ramp) fail to meet the “random competitor” test; i.e., if you’re playing one of the decks marked in red in the Top 8, and you’re made an offer to play any deck in the Top 8 at random instead of the one you came with, you’ll be making a good call by accepting. That’s not exactly where you want to be sitting.

If we order decks by EV finish, best first, we get the following:

Turbo-Fog: 3.773
Grixis Control: 3.9
Spread ‘Em: 3.9
Red Blitz: 4.088
Green/White(/x) Aggro: 4.147
White(/x) Aggro: 4.167
Boros Bushwhacker: 4.354
Jund: 4.367
Naya Lightsaber: 4.826
Valakut Ramp: 5.000
Vampires: 5.156
Eldrazi Green: 5.417

For Jund, its low marks on this could easily be accounted for by sheer weight of numbers: a larger sample size reduces the possibility of error in the positive direction as well.
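The “random competitor” test can be sketched in a few lines of Python. One loud assumption: the scoring scheme is not spelled out here, so the code assumes a Top 8 finish is scored as its average placing (1st = 1, 2nd = 2, losing semifinalists = 3.5, losing quarterfinalists = 6.5). The EV values themselves are copied from the list above.

```python
# ASSUMPTION (not stated in the article): a Top 8 finish is scored as its
# average placing, so 1st = 1, 2nd = 2, losing semifinalists = 3.5 and
# losing quarterfinalists = 6.5.
PLACING_SCORES = [1, 2, 3.5, 3.5, 6.5, 6.5, 6.5, 6.5]

# EV finish of a randomly chosen Top 8 competitor under that assumption.
RANDOM_COMPETITOR_EV = sum(PLACING_SCORES) / len(PLACING_SCORES)

# EV finishes as listed above (lower is better).
EV_FINISH = {
    "Turbo-Fog": 3.773,
    "Grixis Control": 3.9,
    "Spread 'Em": 3.9,
    "Red Blitz": 4.088,
    "Green/White(/x) Aggro": 4.147,
    "White(/x) Aggro": 4.167,
    "Boros Bushwhacker": 4.354,
    "Jund": 4.367,
    "Naya Lightsaber": 4.826,
    "Valakut Ramp": 5.000,
    "Vampires": 5.156,
    "Eldrazi Green": 5.417,
}


def fails_random_competitor_test(ev):
    """True if swapping to a random Top 8 deck would improve the EV finish."""
    return ev > RANDOM_COMPETITOR_EV


failing = sorted(
    deck for deck, ev in EV_FINISH.items() if fails_random_competitor_test(ev)
)
print(f"Random competitor EV: {RANDOM_COMPETITOR_EV}")
print("Fail the test:", ", ".join(failing))
```

Under that assumed scoring the baseline works out to 4.5, and the decks that fall on the wrong side of it are Eldrazi Green, Naya Lightsaber, Valakut Ramp, and Vampires, which matches the decks called out above.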

There were, of course, some more unusual lists. Larry Harrison’s Mono-White Control is a great example of this:

Particularly given the way that the Red decks have seemed to shine, this deck (essentially a kind of quasi-Martyr deck) just has to put the kibosh on every Ball Lightning player’s day.

Also employing Knight of the White Orchid profitably is Andy Goble’s White deck, splashing Blue for Aven Mimeomancer:

This mid-range aggro deck can behave somewhat like a White Weenie deck, a tokens deck, or a Big White deck. While I haven’t played Goble’s or Harrison’s White decks, I do have to say that they are intriguing food for thought.

Overall, it seems clear that there are some real conclusions that can be made about the format. Jund was a very strong performer to make the Top 8, but it wasn’t the top performer in the Top 8. That honor clearly goes to the hasty Red Blitz decks. Various Blue-based strategies also look to have promise; particularly with the hints of good Blue cards in Worldwake, look to see the potential re-emergence of true controlling decks once again. Overall, Jund decks without Leech do seem to be outperforming those with Leech. A number of archetypes also seem to have fallen in luster since their initial outings, notably Vampires, Eldrazi Green, and Naya Lightsaber.

As the metagame gets more and more refined, expect to see the further diminishment of Jund. Clearly, it is still a force, but it is one that can seemingly be reckoned with.

On another note, I’m super excited to be able to play Magic in one of my favorite locations in Madison: the Monona Terrace (designed by Frank Lloyd Wright). This is the location that Grand Prix: Madison had been intended to be held at, instead of the crappy office park that it ended up in. I’m hoping that this will merely be the first of many PTQs held in this location (and that’s not even just because it’s less than a mile from my front door). Wish me luck in that last PTQ of the season!

Adrian Sullivan