Selecting a Deck for Fun and Profit

Opening Note:

This topic really is a challenging one to understand, and certainly to reconcile. This article represents some serious – and in some cases fairly heated – discussions I’ve had over the years with the smartest writers to ever type a Magic article. Now that I am finished with it, I’m still not quite sure I understand the specific relevant elements of correct archetype selection, at least not entirely (so take a lot of what is said with a grain of salt). The percentages and other numbers presented are used for discussion purposes only (as Geordie Tait once reminded me, players can find percentages for themselves only one way). Obviously these percentages assume matchups based on deck quality and do not take into account random player error. Finally, the arguments herein obviously assume that players are selecting decks based on trying to win, or at least place, in competitive Magic tournaments and do not consider players who are playing decks based on other criteria.

That said…

Tournaments are rarely won by the player who had the cleverest deck. The reason rogue victories are so memorable is that they are uncommon and unexpected. We remember the Decree Deck, the Solution, Scrounger TurboLand, and so on because we expect victories in their breakout environments from other choices. Most of the time, tournaments are won by the luckiest players making the fewest mistakes, playing the best tuned versions of consistent (or at least powerful) – but ultimately known and expected – archetype decks.

But how do players select which decks, specifically, they will bring?

Individual players all have particular biases and no one I know of chooses the “best deck” or at least his Weapon of Choice in a mathematical vacuum. Scott McCord won’t play a Constructed deck that includes any cards that do more than one thing (Creeping Mold epitomizes everything Scott hates in terms of flexible utility). Despite winning a PT with the Rebel chain, Kai Budde tends to dislike creatures. Zvi Mowshowitz will err on the side of mana acceleration, Sol Malka will favor if not play Green and Black cards no matter the format, and Brian Kibler will select the deck with the most stringent possible color requirements – four colors when he can get away with three – just so that he can play the best possible cards at every mana cost.

But most of the time, despite their preferences and biases, for single format or single day events, players should tend to select their Weapons of Choice based on a simple formula. They should decide what decks they expect will make up the field, and select the deck that has the highest percentage overall against that field.

That’s it.

[Editor’s Note: When Mike talks about EV for this article, he’s discussing point totals over rounds where a win is 3, a draw is 1, and a loss is 0.]

Zvi once told me that he thought that deck selection for multiple day events based on floating, record-based criteria was inherently flawed. In the U.S. Nationals of the past, for instance, you would play a day of Draft and then a day of Constructed, and then day three would be Constructed for eight lucky magicians. Zvi said that players who selected their day two decks based on their day one performances were making a key error before they ever shuffled up for their first Constructed matches.

The chief offenders were those who had either very poor or very good records. A player who was 3-3 and HAD to go 6-0 on day two in order to make Top 8 would often pick a streaky deck. For instance, say you had a U/W control deck in your team’s repertoire that had a 65% weighted matchup against the field (very consistent against beatdown, let’s say, but less strong against the uncommon B/U control decks); this deck has a six round EV of 11.7 points against your projected field.

Now your team also has a very streaky mono-Red beatdown deck that is under-powered given the potential card pool but has Tangle Wires and a lot of burn. If ten players play it, you believe that two of them will go 6-0 and four will go 1-5 with two going 2-4 and the last two going 3-3. No one will have a passably good record; either they will get very lucky or reasonably to very unlucky (with unlucky being defined as not mising and the inherent weaknesses of your cards exposed). This deck has a considerably lower EV of 7.8 points.
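The two EV figures above can be reproduced with a little arithmetic. What follows is only a sketch under this article's own simplifications: a win is worth 3 points, draws are ignored, and the function and variable names are invented for illustration.

```python
# Points per win, per the Editor's Note (win = 3, draw = 1, loss = 0).
WIN_POINTS = 3
ROUNDS = 6

def ev_from_win_rate(win_rate, rounds=ROUNDS):
    """Expected points when every round is won at the same rate (draws ignored)."""
    return win_rate * WIN_POINTS * rounds

def ev_from_records(records):
    """Average points over a list of (number_of_players, wins) outcomes."""
    players = sum(count for count, _ in records)
    points = sum(count * wins * WIN_POINTS for count, wins in records)
    return points / players

# U/W control: 65% weighted matchup against the field.
uw_ev = ev_from_win_rate(0.65)

# Mono-Red: ten pilots, two at 6-0, four at 1-5, two at 2-4, two at 3-3.
red_ev = ev_from_records([(2, 6), (4, 1), (2, 2), (2, 3)])

print(round(uw_ev, 1), round(red_ev, 1))  # 11.7 7.8
```

The second helper averages over the whole spread of finishes, which is why two perfect records out of ten do not rescue the Red deck's overall number.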

Keeping in mind that opinions on both of these decks are based not on mathematical certainty but on perception and opinion (and that faith in those perceptions is what dictates decisions), why is it that so many 3-3 players select the latter deck?

These players weight the normative idea of a potentially “streaky” finish (even if that streak is 1-5) over the consistent percentages. This is a bad idea.

However, it is not necessarily a bad idea to choose different decks based on your record. Now generally you should choose the deck with the highest EV. However, sometimes what bracket you are in Will Dictate A Different Metagame. A different metagame, where you can make an educated guess that the decks you will hit are different than in a blended metagame, may dictate a different EV for the same deck.

Case in point, at US Nationals 2001, Former Strategy Writer Sol Malka wowed us with a 6-0 start. He was in as good a position as can be going into Standard. As Sol is wont to do, he went with a homebrew Wrestler-themed weapon of his own creation; this time it was Kane, a Big Red Machine added to his usual Black and Green mana requirements:

6 Swamp

4 Mountain

4 Sulfurous Springs

3 Karplusan Forest

3 Darigaaz’s Caldera

4 Rishadan Port

2 Pyre Zombie

3 Plague Spitter

3 Phyrexian Scuta

3 Thunderscape Battlemage

4 Scoria Cat

4 Chimeric Idol

4 Seal of Fire

3 Duress

2 Addle

3 Terminate

1 Scorching Lava

2 Charcoal Diamond

2 Fire Diamond

Sideboard

1 Pyre Zombie

1 Terminate

1 Addle

1 Duress

1 Phyrexian Scuta

1 Plague Spitter

2 Boil

3 Scorching Lava

1 Thunderscape Battlemage

3 Urza’s Rage

I am by no means an expert on Kane itself, but having a greater love for Black and greater hatred of conformity than any other mage, I did a lot of testing against Fires of Yavimaya decks with Black and Black/Red decks in 2001. Kane might have been one of the best decks in the field… but it ain’t no good against Saproling Burst.

Where Sol failed was that he was in the 6-0 bracket. In the upper brackets, you will expect to see a disproportionate number of Fires of Yavimaya decks. Many players with the best records will commit the exact same sin that players with the poorest but still in contention records will… they will choose their deck lists based on record. “I can definitely mise a 4-2 with Fires,” is the most dangerous thing a player with a great record can say. First of all, it is dangerous because that player should only choose Fires if he was prepared to choose Fires regardless of record, and second of all, it is dangerous for poor Sol. The player should pick the deck with the best EV if he needs to go 4-2; that deck will give him the best shot of doing so.

In the actual tournament, 39 of 150 deck lists, or 26%, included Saproling Burst. Compare that number with the Top 12 of the Tournament (the Swiss rounds cut to 27 points, with a 6-way tie for the last two spots in the Top 8):

1 Benafel, Chris 29

2 McCarrel, Casey 29

3 Bachmann, David 28

4 Borteh, Alex 28

5 Blackwell, Trevor 28

6 Harvey, Eugene 28

7 Hegstad, Brian 27

8 Jensen, William 27

9 Price, David 27

10 Finkel, Jon 27

11 Mowshowitz, Zvi 27

12 Zila, Jason 27

Of these twelve players, a highly disproportionate 7 of 12 (including Champion Trevor Blackwell) played with Saproling Burst, specifically in concert with Fires of Yavimaya. This combination is maximally dangerous to a B/R deck whose chief answer to Saproling Burst is the sorcery speed Thunderscape Battlemage; because of the haste from Fires of Yavimaya, Saproling Burst is also harder to break up with Addle and Duress (it can be top decked into The Fix). All of the Fires players who ended up with 27 points or better had a 4-2 or better Day One record, and of all sixty players with that record or better on Day One, there was also a disproportionate number of Saproling Bursts (though not as pronounced as the 58% in the Top 12, which represented over twice the normal population of Fires/Burst decks). If you consider the very top, the Top 8 itself was 5/8 Fires of Yavimaya with Benafel, Blackwell, Harvey, Jensen, and Price all running Saproling Burst (and McCarrel kicked for cheating).
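For anyone who wants to check the "over twice the normal population" claim, the comparison is a pair of ratios. This sketch uses only the counts quoted above; the variable names are made up for the example.

```python
# Counts quoted in the article for US Nationals 2001.
burst_lists, total_lists = 39, 150   # Saproling Burst lists in the whole field
burst_top12, top12 = 7, 12           # Saproling Burst lists in the Top 12

field_share = burst_lists / total_lists
top12_share = burst_top12 / top12

# How over-represented Burst was at the top tables relative to the field.
over_representation = top12_share / field_share

print(f"{field_share:.0%} of the field, {top12_share:.0%} of the Top 12")
print(f"over-representation: {over_representation:.2f}x")
```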

Case in point: No matter how comfortable he was with Kane, Sol should never have picked a deck that he thought he could mise 4-2 with… He had to choose a deck that would perform with positive EV against a high concentration of Fires decks.

Ultimately, picking the best deck is a function of two things. First, it is about picking the deck that has the highest EV against the expected field. Second, it is about making sure you determine that field as accurately as possible. We’ve already seen that the metagame can change depending on a player’s record, which implies that the EV of a deck will also change.

But what about tournaments where your deck selection and potential opponents are not linked to known factors?

The simplest model we can talk about is one where there is a clear best deck. I would talk about mono-Black Necropotence, Whirling Dervish, and templating to B/r Necropotence… or perhaps b/U Trix, Elvish Lyrist + Spike Feeder, and Firestorm Trix, but Matt Vienneau would accuse me of using dated examples as the theory works just as well with modern decks. Say for example the clear best deck is, I don’t know, Ravager Affinity. Ravager Affinity with core Affinity mechanic cards and Thoughtcasts for card drawing offers a natural marriage of speed, strength, and synergy that no other deck in the format can match (Let’s Say). Red mages stacking Echoing Ruins on Shatters can barely keep up; other decks may be quite excellent at smashing into one another, but not one really faces off with Ravager and smiles every time.

And then you have Green Deck.

Green Deck is so hateful that it makes Seth say “infinite is infinite” and take the artifacts out of even his non-Affinity deck. Green Deck is Tel-Jilad Justices piled on Oxidizes setting up Viridian Shamans helping it live long enough to tap five for Molder Slug, which locks poor Affinity down at no permanents. Sure, it can’t possibly beat a third turn Arc-Slogger, but Green Deck doesn’t have to. It has a positive EV against Affinity. [Actually, Karstoderm is a large pain in the ass to the Red decks, but preach on, brother man. – Knut]

At some point, if you and all other players are making rational decisions based entirely on EV, you can only make one of two deck choices: Ravager Affinity (best deck against a naturally occurring field) or Green Deck (beats the best deck). In an undefined metagame, you should pick Ravager Affinity because your overall EV is positive. However, at some point, if the concentration of Ravager Affinity decks is very high (definitely over 50%), your expected record will find a limit at 50% (even if it never hits that limit). Imagine a tournament where 100% of the players had the same Ravager Affinity; all of those players would, based on deck choice only, have a 50% matchup with every other deck. If the concentration of Ravager Affinity is high enough (and certainly below the imaginary 100% in the above example), Green Deck becomes the correct choice because its overall EV is now positive.

Say Green deck is (++) against Affinity and (-) against non-Green Deck / non-Affinity. If the metagame is 50% Affinity, Green Deck finds a positive EV and becomes a potential rational choice.
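One way to make the (++) and (-) shorthand concrete is to treat each plus or minus as one "step" away from an even matchup and weight the steps by how much of the field each opponent represents. That scoring is an assumption made purely for illustration, not a real rule with real numbers, and the break-even point it produces depends entirely on the step values chosen.

```python
# Hypothetical scoring: each '+' is one step above even, each '-' one step below.
GREEN_VS_AFFINITY = +2   # (++)
GREEN_VS_OTHER = -1      # (-)

def green_deck_edge(affinity_share):
    """Share-weighted edge for Green Deck; a positive value means better than even.

    Treats the entire non-Affinity field as (-) matchups, i.e. it assumes
    there are few enough other Green Decks that the mirror can be ignored.
    """
    other_share = 1.0 - affinity_share
    return affinity_share * GREEN_VS_AFFINITY + other_share * GREEN_VS_OTHER

for share in (0.10, 0.25, 0.50, 0.75):
    print(f"Affinity at {share:.0%}: edge {green_deck_edge(share):+.2f}")

# Break-even share under these assumed steps: 2p - (1 - p) = 0, so p = 1/3.
break_even = -GREEN_VS_OTHER / (GREEN_VS_AFFINITY - GREEN_VS_OTHER)
```

With these made-up step values the edge is negative in a field that is light on Affinity and positive at 50% Affinity, which is the shape of the argument above.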

Now there is the idea of templating. Say (and we are speaking in general terms here, rather than real rules with real numbers, remember) that Ravager Affinity is very very absurd and that its core cards are simply stronger, faster, and more complementary than other decks’. We would want to play Myr Enforcer in that case rather than Somber Hoverguard. Somber Hoverguard is a great card! However it is not a core card of the same level as Myr Enforcer. It doesn’t hit as hard, it is smaller, and it ultimately costs more mana (always at least U). Its evasion is valuable, but it lacks the significant defensive presence of Myr Enforcer against the opponent’s best draw. For all its virtues, Somber Hoverguard will drop not just to Electrostatic Bolt, but Pyrite Spellbomb, Magma Jet, and a Blinkmoth Nexus. It is not (in an unknown environment) correct to play Somber Hoverguard. However, if there is a high enough concentration of Green Deck, Somber Hoverguard may be correct. Let me explain:

Say that Green Deck is (++) against Myr Enforcer but only (+) or even (=) against Somber Hoverguard. If there is enough Green Deck, Somber Hoverguard becomes a rational templating decision despite the fact that it adversely affects other matchups. Probably what happens is that the presence of Somber Hoverguard disrupts Ravager Affinity’s natural synergy such that it changes generic matchups from (+++) to (++) and the mirror from (=) to (-). That might not matter if the metagame is right.

When can you act on this model?

This is a variation on both the Nash Problem and the Rule that I’m sure some people will find fascinating and others puzzling. Say that there is no metagame, but that you personally have a correct understanding of which decks are viable and which builds are optimal, but no knowledge of what other players will select. What happens?

You and all rationally acting players will choose Ravager Affinity.

Now what happens if some number of players are game theorists? There are a couple of different variables, such as how many players are game theorists, whether they know about the presence of other game theorists, and whether they are first or second level game theorists.

I believe that if all players are game theorists who understand the baseline model but do not know that all other players are game theorists, the result will be that all of them will choose Green Deck.

In real life you have additional variables (this week, edt actually accused my mind of breaking as I tried to build the model). At one point, I thought that my theories were different from both Zvi’s and edt’s, but I think that they now not only live together but do so harmoniously. Anyway, edt says that there is an equilibrium point where decks all coexist by correct percentages. It is easier to understand if we use the original metagame model.

Say there is a three deck rock-paper-scissors environment of Necropotence, Erhnam, and U/W, and assume that each one tends to beat one other and tends to be beaten by the last (e.g. U/W beats Erhnam and loses to Necropotence), then the correct equilibrium point is 33%, 33%, 33% with no deck having a significant advantage at equilibrium. If one group has a disproportionate population, what matters is where the difference in percentage went. For example, if there are too many U/W decks and all the extra U/W players came out of the Necropotence population, it’s a bad day to be playing Erhnam Djinn.

Look at it like this. Say you have a six round Swiss at Equilibrium Point. Your opposition will always look like this:

Erhnam

Erhnam

Necropotence

Necropotence

U/W

U/W

And you have a six round EV of eight (assuming two wins, two draws, two losses), no matter which of the above three archetypes you pick. Now say that there is a buzz about Hymn to Tourach and players have been timing out too much such that in a six round Swiss you can now expect these opponents:

Erhnam

Erhnam

Necropotence

Necropotence

Necropotence

U/W

Decks have the following EV:

Erhnam = 11

Necropotence = 6

U/W = 7

By the numbers, it’s a fantastic day to be playing Erhnam Djinn!
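The point totals in this simplified model are easy to recompute. The sketch below assumes exactly what the model assumes: every favorable matchup is a win, every unfavorable one is a loss, and every mirror is a draw. The names are invented for the example.

```python
# Who beats whom in the simplified model: the key deck beats the value deck.
BEATS = {"Erhnam": "Necropotence", "Necropotence": "U/W", "U/W": "Erhnam"}
POINTS = {"win": 3, "draw": 1, "loss": 0}

def result(deck, opponent):
    """Outcome of one round under the rock-paper-scissors assumption."""
    if deck == opponent:
        return "draw"  # mirrors are scored as draws in this model
    return "win" if BEATS[deck] == opponent else "loss"

def swiss_ev(deck, opponents):
    """Total points for a deck against a list of opposing archetypes."""
    return sum(POINTS[result(deck, opp)] for opp in opponents)

equilibrium = ["Erhnam"] * 2 + ["Necropotence"] * 2 + ["U/W"] * 2
hymn_week = ["Erhnam"] * 2 + ["Necropotence"] * 3 + ["U/W"]

for field in (equilibrium, hymn_week):
    print({deck: swiss_ev(deck, field) for deck in BEATS})
# {'Erhnam': 8, 'Necropotence': 8, 'U/W': 8}
# {'Erhnam': 11, 'Necropotence': 6, 'U/W': 7}
```

Moving a single opponent from U/W to Necropotence is enough to open a five point gap between the best and worst choices.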

Now the numbers we are using are highly simplified. We are counting matchups in this model as wildly equivalent such that a 51% and a 100% deck advantage are counted equally and counting Swiss EV as a draw with every mirror and no non-mirror draws, which will never happen. But in real life, there is such a thing as templating.

Earlier I said that you will not choose Somber Hoverguard when you can play Myr Enforcer in any naturally occurring metagame. Yet many perfectly respectable players have done so, despite the fact that the block format is not overridden with Green Deck. Templating, whether the addition of Incinerate in 1996 or the addition of Somber Hoverguard in 2004, or any templating in between, works the same way.

Say there is an archetype you like and a card you like that could fit into that archetype. Whether you play that card or not is based on much more complicated percentages. Say that you love Necropotence and fear Whirling Dervish (stupid 1/1 for two always kills you). You elect to add Red for Lightning Bolt and Incinerate. This reduces your baseline ability to draw cards with Necropotence because of Sulfurous Springs, but also gives you more burn-based free wins against an opponent’s aggressive card drawing.

When evaluating deck templating, it is important to understand the concept of a non-zero sum game. That is, when you add Red to a Necropotence deck and you now have an answer to Whirling Dervish, but now draw fewer with Necropotence, you don’t always beat Erhnam and always lose to other Necropotence decks; your matchup against Green goes up and your matchup against Black goes down, but rarely by the exact same measure. In our example you might see a net -5% in the mirror, but because you only fear Whirling Dervish out of an Erhnam deck and you now have 6-8 classes of removal that can contain the opponent’s four Whirling Dervishes you may have just transformed your 33% Erhnam matchup to 51%… They’ve still got game, but you’ve come a long way. Your matchup against U/W is about the same at say 66%. You wreck them less with Necropotence and disruption, but run circles around their Circles with your burn and have more sideboarded gas like Anarchy and Pyroblast.

At equilibrium, you would see this delta with the above made-up numbers:

Mono Black

Erhnam

Erhnam

Necropotence

Necropotence

U/W

U/W

EV 8 at ~50%

B/r

Erhnam

Erhnam

Necropotence

Necropotence

U/W

U/W

EV 12 at ~54%
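Both the percentages and the point totals in that comparison come straight from the made-up matchup numbers. As a sketch, assuming the Necropotence opponents at equilibrium are all mono-Black and that the point totals use the same win/draw/loss simplification as before (above 50% is a win, exactly 50% a draw):

```python
# Win % for each build against the equilibrium field (the article's made-up numbers).
# "Necropotence" opponents are assumed to be mono-Black lists.
MATCHUPS = {
    "Mono-Black": {"Erhnam": 33, "Necropotence": 50, "U/W": 66},
    "B/r":        {"Erhnam": 51, "Necropotence": 45, "U/W": 66},
}
FIELD = ["Erhnam", "Erhnam", "Necropotence", "Necropotence", "U/W", "U/W"]

def blended_win_pct(build):
    """Average win % across the six expected opponents."""
    return sum(MATCHUPS[build][opp] for opp in FIELD) / len(FIELD)

def threshold_points(build):
    """Simplified scoring: above 50% is a win (3), exactly 50% a draw (1), else 0."""
    total = 0
    for opp in FIELD:
        pct = MATCHUPS[build][opp]
        total += 3 if pct > 50 else 1 if pct == 50 else 0
    return total

for build in MATCHUPS:
    print(build, threshold_points(build), round(blended_win_pct(build), 1))
# Mono-Black 8 49.7
# B/r 12 54.0
```

The jump from 8 to 12 points overstates a four point swing in blended percentage, because the threshold scoring turns a 51% matchup into a guaranteed win.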

Clearly at equilibrium, the switch from mono-Black to B/r will yield significant positive EV. Now here’s the part where my brain breaks: edt says that it is a mistake to look at Hoverguard Affinity, B/r Necropotence, and other templated variations on Natural decks as merely variations of core decks. He says that it is instead correct to say that they are discrete decks that represent their own chunks of metagame, even if they are minority percentages.

Say that the discussed successes of B/r Necropotence have scared away players. Erhnam players have decided that they no longer want to be part of a metagame where they are losing both to U/W and some Necro mages. U/W flounders in a format with fewer punching bags while losing to a new class of decks. The metagame changes such that you now expect to face:

Erhnam

B/r Necropotence

B/r Necropotence

Mono-Black Necropotence

Mono-Black Necropotence

U/W

Clearly you don’t want to be U/W on this day, and Mono-Black Necropotence is the right choice. B/r Necropotence is the relative loser when compared to last week, with a much lower EV of 8 and a drop to 51% blended likelihood of winning any match. The trick is obviously to be B/r Necropotence in Week 1 and back to Mono-Black Necropotence in Week 2.
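The Week 2 figures follow from the same made-up matchup numbers applied to the new field. In this sketch the 55% for Mono-Black against B/r is inferred as the complement of B/r's 45% in that pairing, which is an assumption, and the names are illustrative only.

```python
from collections import Counter

# Win % for each Necropotence build against each opposing deck.
# The 55% figure is inferred as the complement of B/r's 45% (an assumption).
WIN_PCT = {
    "Mono-Black": {"Erhnam": 33, "B/r": 55, "Mono-Black": 50, "U/W": 66},
    "B/r":        {"Erhnam": 51, "B/r": 50, "Mono-Black": 45, "U/W": 66},
}

# Six expected opponents in Week 2.
WEEK_2 = Counter({"Erhnam": 1, "B/r": 2, "Mono-Black": 2, "U/W": 1})

def blended_pct(build, field):
    """Average win % weighted by how many of each opponent is expected."""
    rounds = sum(field.values())
    return sum(WIN_PCT[build][opp] * n for opp, n in field.items()) / rounds

def score(pct):
    """Above 50% scores as a win (3), exactly 50% as a draw (1), below as a loss (0)."""
    return 3 if pct > 50 else 1 if pct == 50 else 0

def threshold_points(build, field):
    return sum(score(WIN_PCT[build][opp]) * n for opp, n in field.items())

for build in WIN_PCT:
    print(build, threshold_points(build, WEEK_2), round(blended_pct(build, WEEK_2), 1))
# Mono-Black 11 51.5
# B/r 8 51.2
```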

The above metagame yields week to week effects like the Black Summer, where you might find something like this in six rounds of Swiss:

B/r Necropotence

Mono-Black Necropotence

Mono-Black Necropotence

Mono-Black Necropotence

Mono-Black Necropotence

Other

B/r Necropotence is clearly wrong at 46%-49%.

The dominant Mono-Black deck finds a limit at about 50% (not surprising), with wild changes in expectation between 48% and 54% based on whether someone was smart enough to run Erhnam.

Erhnam kicks ass in this room. Its worst EV is about 58%, and only then in the case that someone plays U/W.

But they wouldn’t. Like B/r Necropotence, U/W is buried in bad matchups and will never hit 40% in this metagame. If it exists at all, it does so only to spoil Erhnam or get very very lucky.
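The ranges quoted for this Black Summer field can be reproduced by letting the "Other" slot be either Erhnam or U/W. The matrix below is a sketch: any value not stated outright in this article is filled in as a complement (100 minus the stated number) or by rock-paper-scissors symmetry, and those fill-ins are assumptions.

```python
# Row deck's win % against the column deck. Values not stated in the article
# are complements (100 - x) or rock-paper-scissors symmetry: assumptions.
WIN_PCT = {
    "Mono-Black": {"Mono-Black": 50, "B/r": 55, "Erhnam": 33, "U/W": 66},
    "B/r":        {"Mono-Black": 45, "B/r": 50, "Erhnam": 51, "U/W": 66},
    "Erhnam":     {"Mono-Black": 67, "B/r": 49, "Erhnam": 50, "U/W": 33},
    "U/W":        {"Mono-Black": 34, "B/r": 34, "Erhnam": 66, "U/W": 50},
}

def blended(deck, other):
    """Blended win % over six rounds: one B/r, four Mono-Black, one 'Other' deck."""
    field = ["B/r"] + ["Mono-Black"] * 4 + [other]
    return sum(WIN_PCT[deck][opp] for opp in field) / len(field)

for deck in WIN_PCT:
    low, high = sorted(blended(deck, other) for other in ("Erhnam", "U/W"))
    print(f"{deck}: {low:.1f}% to {high:.1f}%")
# Mono-Black: 48.0% to 53.5%
# B/r: 46.8% to 49.3%
# Erhnam: 58.3% to 61.2%
# U/W: 36.7% to 39.3%
```

Only the sixth opponent varies here, so each range is just that one slot swinging between a good and a bad pairing.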

It clearly looks like Erhnam is a Natural predator deck with this skewed metagame. edt would say that it is impossible for a significant number of players to play predator and jump on the Erhnam bandwagon to take advantage of the high concentration of Necropotence and only mild disadvantage in the B/r matchup. If too many Erhnam players were to show up, they could not all take advantage of 5/6 Necropotence matchups, and at high enough adoption, would just butt heads at a 50% limit like the Necropotence players actually do. In this way we can see that if certain rogue decks are viable, they can only be viable in small numbers (say single digit percentages), if they are viable at all for more than one tournament.

In real life, all players at least think that they are game theorists. They are running deck choices based on predictions of varying accuracy. When you hear a Pro player say “just play the best deck,” that player is making an assumption about the rest of the room, where other players are either behaving correctly or non-conforming in such a way that it doesn’t matter (they choose a non-best deck that has better or worse interactions with other non-best decks, but that still has negative EV against the best deck). Onetime Sensei Chris Senhouse thought that for any format there was no best deck, but that there was always an empirical best deck for any one tournament; hopefully it is easy to see why this is the case given the models we have looked at today.

Last of all, there are the rogue decks. The reason rogue wins are so memorable is that, by nature, rogue decks can only either hold short term or minority percentage at equilibrium (just like if you have an island full of velociraptors and t-rexes, they have to die young or they will eat all of the herbivores and cause mutual predator and prey genocide) or graduate to full archetype share (usually at the cost of a prey deck). Because we would expect zero copies of a deck with an equilibrium 2% share in the environment in the Top 8 of any major tournament, the fact that there are 1-2 such unexpected builds there makes players stop, take notice, and copy. As we are seeing with certain versions of U/G in Mirrodin Block, as many players adopt these formerly minority deck choices, the end result is not a commensurate percentage of success in U/G, but instead that other decks whose populations are currently well below their equilibrium points instead over-perform to compensate.