Translating Testing to Results

Dear readers,

To start this week’s column, I’d like your thoughts on what the best play is in the following situation.

It was round 4 of Regionals, and after winning the first three rounds, I was paired against an Astral Slide deck. I was playing the Red deck that I wrote about two weeks ago. I won the first game, and game two seemed to be going rather well. I had four Mountains and a Great Furnace in play, and my hand was Shrapnel Blast, Shrapnel Blast, Chrome Mox. I was on twenty life. For what it’s worth, I had a Clickslither and three Goblins (a Sledder, a Prospector and a Warchief, if memory serves) in my graveyard.

My opponent had three Plains and two Mountains in play. In his graveyard were two Spark Sprays, Starstorm, and Wrath of God. He had three cards in his hand. He was on ten life.

The question is very simple. What is the correct play and why?

I’ll give my answer at the end of the article. First, though, I realize that I haven’t properly introduced myself as a Featured Writer for StarCityGames.com, and I’d like to take the opportunity now to do so.

I started playing Magic just under ten years ago. To convert that to Magic time, the latest expansion set when I first took up the game was “The Dark.” Like many players, I had the most tournament success during my “gap year” between school and university (in my view, the differing success levels of different European countries at Pro Tours are closely related to what proportion of the population takes a year off between school and university). The highlight of that year was a second place at the UK Nationals, and I also contrived to lose a couple of matches, either one of which would have landed me in a Grand Prix top 8 – in Birmingham playing a Grave Pact deck in Tempest Block Constructed, and in Vienna playing a Memory Jar deck in Extended. I started playing mono-Red beatdown decks in 1997 and discovered that they were much more fun than all the other decks. I’ve written about Red decks for the Dojo, for Mindripper, as a Premium Writer for Brainburst, and now for Star City.

I first contacted Ted because I wanted to write some theory articles, having been irritated at reading lots of complicated theorizing that used lots of technical language without explaining very much to me. Showing better judgement about what I should write about than I displayed, he suggested that I write about preparing for Regionals. Hence the article about the mono-Red deck two weeks ago, and this article, which rather than being about a particular deck, is about how to improve your chances at Regionals, even if you make the foolish decision not to play the Red deck.

For me as for everyone else, the best thing about Magic is the people that you meet and the friends that you make as a result of it. In my case, I was lucky enough to find that the friends with whom I tested were not only a great bunch of people, but also happened, for several years, to be the best players and deck designers in the UK. The year that I finished second at Nationals, my playtest group had four different decks, all of which were better than the decks which anyone else had – the main reason why people from our testing group finished in four of the top five places that year. The following year, the same group of people came up with two different decks – Trinity Green and Red Deck Wins – which are now established deck archetypes and often found in Extended tournaments even today. John Ormerod and Ben Ronaldson, two members of that group, went on to become integral parts of teams with players such as Zvi Mowshowitz and Kai Budde, and designed decks which dominated Pro Tour after Pro Tour.

There is very little that you could learn from how I prepare for major tournaments. Where there is such preparation, as opposed to the tournaments where I turn up and try to scrounge some Goblins, it involves nagging team mates that I want to play the mono-Red deck against their decks, and then beating them and being smug, or losing to them and changing the deck to give me a better chance should I play against them in the tournament itself. It works for me, but is probably not that helpful for those of you gearing up for Regionals.

What might be of interest, though, is how my team mates used to prepare. This isn’t a “make sure you get plenty of sleep and bring food” sort of article (though those are good bits of advice), but more about productive ways to playtest, choose which deck to play, and make the most of your chances during the tournament itself.

It might be the case that the people who write articles on the internet about how their deck has a 60% chance of beating some other deck are doing so just to wind me up. I fear, though, that there are people out there who believe that there is anything meaningful in expressing the chances of one deck beating another in terms of percentages. Let’s be clear about this. There isn’t.

The first obvious objection is that such conclusions are normally based on too small a sample size. If you win twelve games out of twenty when playing your special mono-Blue deck against Ravager Affinity, that doesn’t mean that if you played 1,000 games, you would win 600 (as is implied by claiming a 60% winning rate). Another objection is that other factors such as skill level and familiarity with the deck make an assessment of the winning percentage difficult. Jarrod Bright, for example, claims that the reason why people who try to test his Black-Green deck get bad results is because they are playing it wrong, and that if you practice enough then you should beat all of the other major decks most of the time. I can believe that if you play the Black-Green deck at least ten times every day, then you will get better testing results with it than I ever managed. You won’t, unfortunately, ever get back those hours of your life which you chose to spend playing a Black-Green deck, but that’s another matter.
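For anyone who wants a rough number on the sample-size objection, here is a minimal sketch in Python. It assumes every game is an independent trial with a fixed chance of winning, which real testing sessions will not satisfy, and the function name is my own invention. It asks how often a dead-even matchup would still hand you twelve or more wins out of twenty.

```python
from math import comb

def prob_at_least(wins: int, games: int, win_chance: float) -> float:
    """Chance of winning at least `wins` of `games`, treating games as independent."""
    return sum(
        comb(games, k) * win_chance**k * (1 - win_chance) ** (games - k)
        for k in range(wins, games + 1)
    )

# Assumption: the matchup is really 50/50, not 60/40.
even_matchup = prob_at_least(12, 20, 0.5)
print(f"{even_matchup:.1%}")
```

Under those assumptions the answer is about one session in four, so a twelve-eight result says very little about whether you are actually ahead in the matchup.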

These are good objections to expressing deck matchups in percentage terms. But my objection is a rather more fundamental one. I believe that even in situations where you can get a statistically significant set of results and get two players of equal skill level who are familiar with the decks, it is still unhelpful to talk about what percentage of games your deck wins against another deck.

In all the years of testing with them, I never heard Ben or John make claims about what percentage of games their new decks won. With Ben in particular, it was never for want of testing – I’d spend a full day testing decks against him, go to bed and have a good night’s sleep and wake up to find him still playing games of Magic, against someone else if they were available, on his own if not. When describing matchups, Ben and John used a rather different style from the mathematically based one. They would say which deck had the advantage, and explain why and in what circumstances each deck would win. So, for example, before going to Regionals I e-mailed a list of my Red deck to John, to ask about the matchup against white-based control decks. His e-mail back didn’t say “you win 45% of the time.” It said “their best draw beats yours, but your draws are more consistent.” I would contend that this is far more useful information, and is the product of a different testing mindset.

To win a Pro Tour, a player has to play nineteen rounds of Magic, a maximum of fifty-seven games. Qualifying for a National Championship involves playing between six and eleven rounds. Qualifying for the Pro Tour involves playing between nine and eleven rounds. Unless you play ten or more of the games in one of these tournaments against a particular deck, the gap between a 55% expected winning ratio and a 60% expected winning ratio comes to less than half a game in terms of the number of games that you expect to win. This means that time spent trying to establish whether you beat Ravager 55% or 60% of the time is wasted, and you should spend it doing something more useful. That’s even before you take into account the fact that the decks that you actually play against will probably differ from the decks that you tested against.
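The arithmetic behind this can be sketched in a few lines of Python. The game counts below are illustrative assumptions of mine about how many games you might play against one archetype in a single tournament, not figures from any real event.

```python
def expected_wins(games: int, win_ratio: float) -> float:
    """Expected number of wins over `games` at a fixed winning ratio."""
    return games * win_ratio

# Assumption: hypothetical numbers of games played against a single deck type.
for games in (3, 5, 10, 19):
    gap = expected_wins(games, 0.60) - expected_wins(games, 0.55)
    print(f"{games:2d} games: {gap:.2f} more expected wins at 60% than at 55%")
```

The gap grows by a twentieth of a game for every game played, so it only reaches half a game once you have played ten games against the same deck.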

I have compiled a translator which translates different claims made about what percentage of games a deck wins into what such claims actually mean in English:

Less than 40% – “I should lose a game with my deck against this deck, unless I get lucky or my opponent makes a big mistake.”

40-49% – “I am at a disadvantage, though I will win if my draw is better or I outplay my opponent.”

50% – “I have an even chance of winning – the winner will be whoever gets the better draw or plays better.”

51-60% – “I have an advantage, and will win if my draw is as good as my opponent’s and I don’t make any bad mistakes.”

61% or more – “I expect to win, and something is going to have to go badly wrong for me not to.”

You might have to adjust the percentages slightly depending on the writer, but in general, this is what people mean by ascribing particular percentages to particular matchups.

The reason I labor this point is that too often in playtesting, the search to give a number to how well different decks perform against each other is given altogether too much importance. Rather than trying to discover how many games your mono-Blue deck wins against Ravager Affinity, you should be trying to find out why each deck wins or loses. Do you lose only when they get the nuts draw and play multiple Myr Enforcers on turn 3? Is your deck well set up to survive the early pressure, only to crumble under the weight of card advantage if they manage to get double Skullclamp? Can you keep a one-land hand playing first after sideboarding? Can you keep a five-land hand going second after sideboarding?

As long as you understand the circumstances under which you win or lose against a particular deck, and have a rough idea about how likely those circumstances are to occur, you are playtesting effectively. Here percentages can be useful. If you know that you beat Ravager decks except when they draw two Arcbound Ravagers and/or two Myr Enforcers by turn 4, you can work out, in however much detail you choose, how likely that is to occur, and hence how good your deck is against a Ravager deck.
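As one way of working that out, here is a minimal sketch in Python. It rests on assumptions that are mine and not facts about any particular list: a sixty-card deck, four copies each of Arcbound Ravager and Myr Enforcer, no mulligans, no extra card drawing, and ten cards seen by turn 4 on the play or eleven on the draw. The function name is made up for the example.

```python
from math import comb

DECK_SIZE = 60  # assumption: a sixty-card deck
COPIES = 4      # assumption: four Arcbound Ravagers and four Myr Enforcers

def prob_two_of_either(cards_seen: int) -> float:
    """Chance of at least two Ravagers and/or at least two Enforcers in cards_seen cards."""
    others = DECK_SIZE - 2 * COPIES
    total = comb(DECK_SIZE, cards_seen)
    hits = 0
    for ravagers in range(COPIES + 1):
        for enforcers in range(COPIES + 1):
            rest = cards_seen - ravagers - enforcers
            # Skip impossible splits and draws with fewer than two of both cards.
            if rest < 0 or (ravagers < 2 and enforcers < 2):
                continue
            hits += comb(COPIES, ravagers) * comb(COPIES, enforcers) * comb(others, rest)
    return hits / total

# Assumption: opening seven plus three draws on the play, four on the draw.
on_the_play = prob_two_of_either(10)
on_the_draw = prob_two_of_either(11)
print(f"on the play: {on_the_play:.1%}, on the draw: {on_the_draw:.1%}")
```

With these assumptions the dangerous draw turns up in a little under one game in four when the Ravager player goes first. That is only a starting point, since it says nothing about whether they have the mana to cast what they drew.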

Thinking about why the games are decided, rather than about what the result happens to be, is a better way of designing and fine-tuning decks during the playtesting period, not to mention being infinitely more help if you write about your deck on the internet. It is also much more helpful when it comes to playing in the tournament itself. Success or failure in a Magic tournament is not decided over 1,000 games played in your living room before the tournament. Even in the tournament itself, most of the games will be uneventful – one player gets mana screwed, or you or your opponent get a much better draw and there isn’t much that can be done except avoid making any hideous errors. Success or failure will depend on a very small number of games, and on one or two incidents during those games – choosing whether or not to mulligan, choosing which cards to sideboard in or out, deciding which creatures to attack with or which spell to cast on your fourth turn, little things like that. Your testing, therefore, needs to focus on preparing you to make those decisions.

I’ve lost count of the number of games that I have played where I’ve noticed my opponent’s pleasure upon seeing me start the match with a Mountain and a small Red creature. What this means is that they have done some playtesting, and concluded that their deck beats a mono-Red deck, say, 65% of the time. More often than not, the same opponent can be found complaining about bad luck as they sign the results slip to confirm that they lost 2-0 or 2-1.

Their complaint might be that they got mana-screwed in the decisive game, or that their killer sideboard cards never showed up, and sometimes the complaints are justified. Usually, though, the reason that they lost has very little to do with luck, and everything to do with them not having prepared properly for the match, meaning that they didn’t know that they shouldn’t have kept their hand in the first game and that they should have played more aggressively in the third game. People who do this just encourage the likes of me, who show up with a particular deck each time having hardly tested, and find it really funny that we can still beat these people who have put all this effort into the wrong sort of testing. I’ve been the victim of not doing enough of the good kind of testing myself – I missed out on the chance to beat Brian Kibler at Grand Prix: London because in the third game I drew a hand of four Mountains, Clickslither, Siege-Gang Commander, Rorix against his Blue-White deck and didn’t realize until the middle of the game that I should have mulliganed that hand.

The aim of playtesting for Regionals is not to discover which deck can win most games when played against the decks acclaimed by the internet as the decks to beat, but to find for you the deck which gives you the best chance of qualifying for Nationals. These are two subtly different things. However much Ollie Schneider, twice English National Champion, used to playtest, he would always end up playing a deck which gave him lots of ways to gain card advantage and stop his opponent from doing what they wanted to do, which complemented his play style, involving as it did lots of conversation with his opponent, lots of playing mind games and trash talking, and lots of winning as a result of his opponents making errors. Ben and Tony Dobson would never play beatdown decks, foolishly preferring those nasty control or combo decks, while John liked a wide variety of decks, but ended up playing decks with not much land and lots of creatures rather more often than the results might have indicated was a good idea. In contrast, some of the other players in our testing group just wanted to find out which was the deck which the internet said was the best, confident that if they had the best deck, then they could outplay most opponents and win that way.

Perhaps the best example of the difference between playing the deck that suits you best and the objectively best deck is Jamie Wakefield. Jamie wrote a quite excellent report of finishing ninth at Regionals which there will be a link to here if Ted can find a copy of it somewhere on the internet. [Still working on it. Check the forums to see if Flores has found it yet. – Knut]

At the time when Jamie was playing, one of the most powerful cards was called Thawing Glaciers. Jamie would never play that card on principle, because it interfered with the way he played the game. Each time he activated the Glaciers, he would have to shuffle his deck, which broke his concentration and made it more likely that he would make mistakes, so he decided to leave the Glaciers out of any deck that he played in a tournament. At one level, this is just daft, because in the hands of a Pro player his deck would have been much better with Thawing Glaciers in it (and with sixty, rather than sixty-two cards in it, though that is a different story). But for Jamie Wakefield at a Regionals, or in a Pro Tour Qualifier, the fact that he was playing a slightly worse deck was counterbalanced by the fact that “improving” the deck by adding in the Thawing Glaciers would have resulted in him making significantly more play errors.

Over the course of a tournament, the superior quality of his deck might have helped him win a couple of games which otherwise he wouldn’t have. However, he might have lost four or five games as a result of silly errors caused by the shuffling breaking up his concentration, with the result that the overall performance would have been worse. This is an extreme example, but the point is that if you play a deck with which you are more likely to make errors, just because you or the internet think it is the “best deck,” you are unlikely to do as well as if you play the deck which you play best (unless that deck is extremely bad). Even at the Pro level, particularly in Extended tournaments, you see players play the same decks tournament after tournament.

What if the worst comes to the worst, and you test properly, bring along a deck which you feel comfortable playing and know how and why it wins and loses, and yet find yourself on the brink of elimination? Say you’re at Regionals, and in the eleven round tournament two losses would leave you out of contention. You win the first couple, but then in the third round you get mana screwed in two of the three games (hey, it happens), and lose a close match. You recover well and win the next six, to leave yourself on 8-1 and just one win away from being able to draw into qualification. Unfortunately, facing you is your nightmare matchup, one which in testing you could hardly ever beat unless you got a particularly good starting hand.

The best plan in this kind of situation that I know of is to try for the swindle. This is a term which comes from chess, and describes the situations where people manage to come back and win games which, given perfect play on both sides, they have absolutely no chance of winning. There are two main ways of pulling off a swindle in Magic:

1. Making your own luck.

Suppose that your testing shows you can only win by getting really lucky and drawing a particular set of cards. Mulligan more aggressively than usual, throwing away hands that you would usually keep. After all, if you are going to lose with an average draw, it makes sense to give yourself every chance to get a “lucky” draw. Sure, you might end up with a terrible four-card hand and lose, but you were probably going to lose anyway, so what does it matter? If the only way to win is to play a Skirk Prospector on turn 1 and a Goblin Warchief on turn 2, then mulligan the hands which don’t let you do that. Equally, if you get a dubious hand which you could win with if you draw the right cards (e.g. a one-land hand with all the stuff you need to win if you rip a couple of lands), you’d be more likely to keep it in a tough matchup than in one where you are the favorite to win if you can just avoid the mana screw. Getting the “super-lucky” draw can also help contribute to the other kind of swindle:

2. Getting your opponent to make a mistake.

In chess, this is how swindles work. The player doing the swindle will set up a position which is extremely complicated or which their opponent can lose if they don’t play it right. The downside to this is that if they do play the position correctly, then they will win. But that doesn’t matter, because they were going to win anyway! Examples of this in Magic include attacking with all your creatures (where if your opponent decides correctly which creatures to block, they will win, but if they get it wrong, they will throw the game away), engineering a situation where your opponent can win if they know the timing rules well enough, but won’t if they don’t (for example, casting Patriarch’s Bidding in a Goblin mirror match when both players have lots of Goblins and a Goblin Sharpshooter in the graveyard), and bluffing your opponent through things like keeping useless lands in your hand and yet looking confident.

I played in the European Championships (now sadly discontinued) one year, and in two separate rounds all my opponents had to do was attack with all their creatures to win the game, and in both instances they found something stupid to do instead. In some ways, it’s not as pleasing as crushing your opponent through the force of your deck-building genius and superior play skill, but I imagine very few of you will care if it means the difference between success and failure.

Speaking of failure, back to the question that I asked at the start of the article. The decision that I made was one of those very few crucial ones in the tournament. Obviously, if I play the Chrome Mox and then cast Shrapnel Blast twice, I win unless my opponent has some form of instant lifegain. Since he’s playing an Astral Slide deck, it is pretty much certain that he has three to four copies of Renewed Faith, and he might have sideboard cards as well.

I didn’t have any creatures in play, so if I decided not to cast the Shrapnel Blasts, my opponent would get to draw more cards without having to cast any spells to stabilize the situation on the board. Each turn that goes by makes it more likely that he will draw a life gain card. In addition, he might have Circle of Protection: Red in his deck. I can respond to him casting a CoP: Red with my Shrapnel Blasts, but only if I leave four mana untapped at all times, which makes it impossible to cast any creatures that I draw except for 1/1 Goblins.

So what I did was to play the Chrome Mox (unimprinted), and cast two Shrapnel Blasts, sac-ing my Mox and Great Furnace. In response my opponent…

…cast Pulse of the Fields.

My next two draws were Siege-Gang Commanders, which I couldn’t cast because I only had four land in play. My opponent restored his life with the Pulse, and eventually cast an Exalted Angel, Starstormed away both my Siege-Gangs when eventually I had the mana to cast them several turns down the line, and won the game. Game three I mulliganed, kept a one land hand of six cards, and got crushed.

With the benefit of perfect knowledge about what was in my opponent’s hand and what was on top of my library, the play most likely to secure victory would have been to play the Mox unimprinted and end my turn. If my opponent were to try to Pulse at end of turn, I would have won by casting two Shrapnel Blasts in response, and otherwise I would have cast back to back Siege-Gangs, which would each have required a Starstorm to remove, as if they went unremoved or were Wrathed away, I would be able to Shrapnel Blast my opponent happily without any Pulsing spoiling the fun. The optimal play for the Slide player would be to cast Pulse in response to my casting the Mox, but even then he has to deal with the Siege-Gangs over the next few turns, and reveal that he is holding a Pulse.

What I was hoping that you clever people could tell me is what, in the absence of knowing that you are about to draw two Siege-Gangs, is the best play, and why. And what I wish is that I’d encountered this situation while playtesting, rather than having had to figure it out in round 4 of a Regionals tournament.

I hope that this article has been useful – I imagine that many of you will have come across at least some of the advice contained within before (though I’ve read an awful lot of articles on the internet through the years and not come across one which covers this subject in anything like this way). If you disagree with any of it, that’s fine, and let me know in the forums. I’d like to hope that at least for some of you this article might help you get a better result at Regionals than you otherwise might have. Any comments, positive or negative, and particularly ones giving your take on whether casting Shrapnel Blast at Astral Slide players is correct or not, please direct to the forums, which I read and reply in whenever I can get to an internet browser which doesn’t crash as soon as it loads up the Star City website.

For those who thought that this article offered information which was excessively basic and obvious, I hope that the next one will be more to your satisfaction – it’s one which I am writing at the special request of Kai Budde.

‘Til then, may your Prospectors speed the casting of your Warchiefs.

Take care,

Dan Paskins