
Playtesting Pitfalls

Brian and his group have been testing for the Pro Tour all week, but their world changed just last night. What went wrong, and why is he considering a last-minute audible on the eve of Pro Tour Avacyn Restored? You’ll have to tune in to the coverage to find out!

I’m in Barcelona.

It’s early Thursday morning.

The Pro Tour is tomorrow, and I don’t have a deck.

I’ve been here for about a week now, staying with the rest of the team in a pair of rented apartments a short walk from the Mediterranean Sea. We’ve spent hours every day drafting and playtesting, with intermittent trips to various famous locations, restaurants, and the beach thrown in. Now, with the start of the Pro Tour looming ominously just over twenty-four hours away, we’re questioning all those hours of testing and considering playing an entirely different deck from the one that virtually the whole team was convinced was the best until just yesterday.

I had planned for this article to be a look at the process of testing from the perspective of the experts—to let you in on the ways in which we approached the format to come up with the best deck. I’m not convinced that we’ve done that, and I’m seriously questioning our conclusions and our process barely more than a day out from the start of the Pro Tour. So instead, what you get is a look into what went wrong.

Since this article is going to be posted during the Pro Tour itself, I can’t provide exact details about the various decks and cards I’m talking about, since I don’t want to jeopardize my own chances or my teammates’ chances of winning the event. But I can provide some insight into what went wrong and how we got to this point by focusing on the types of mistakes we made in our testing over the last week and explaining how you can avoid these same playtesting pitfalls.

Pitfall #1 — Ignoring Major Decks in the Metagame

In our playtesting for Pro Tour Philadelphia last year, we tested a huge variety of decks. We built countless versions of Cloudpost, Zoo, and Storm, along with a wide range of other combo decks, like Hive Mind, Elves, and even an absurd Retract combo deck with Vedalken Archmage. We had Blazing Shoal Infect decks and Smallpox decks and Rock decks and mono-red decks and RUG decks and Buried Ruin/Mindslaver decks and more.

But we never built Splinter Twin. Throughout the course of our testing, people would say “Hey, we should really build Splinter Twin,” but the suggestion wound up getting ignored every time. Some people dismissed Splinter Twin as just a worse combo deck than the other options, since it was vulnerable to creature and enchantment removal along with traditional anti-combo cards.

Splinter Twin ended up being the second-most popular deck at the Pro Tour. Thankfully, the deck we played—Zoo with an array of countermagic in the sideboard that led to the name CounterCat—was naturally very strong against Splinter Twin. We had Path to Exile and Qasali Pridemage and Gaddock Teeg alongside a fast clock, Green Sun’s Zeniths, and those sideboard counterspells that pushed the matchup in our favor.

On the one hand, we’d played enough Splinter Twin in Standard to have a rough idea of how the deck would translate to Modern and how to fight it, so we could predict how our decks would do against it—as it turns out, both the Zoo deck and the Storm deck that we played had very positive records against Splinter Twin throughout the Pro Tour. On the other hand, it was downright lazy and irresponsible of us not to put the deck together and play against it. Splinter Twin was an obvious deck for the tournament, and it’s entirely possible that we were missing something important that a bit of playtesting would have uncovered.

Since Philly, I’ve been personally determined to avoid another Splinter Twin incident. Just last night, after nearly a week of testing, I finally convinced the team to put together one of the decks I’d been advocating we build for nearly the entire time we’ve been here. The archetype was one of the most popular decks in Block on Magic Online, and it seemed foolish that we hadn’t played against it at all.

The deck proceeded to slaughter the team deck over and over and over. The games weren’t particularly close, either. The matchup was horrible. It wasn’t a matter of a few card choices here and there. It was fundamentally a structural issue with how the two decks lined up. There were ways to regain some ground, to be sure, and certainly sideboard options to try to shore things up, but it was going to take some serious effort to make the matchup anything less than terrible—effort that we don’t have much time for, and nearly didn’t even know would be needed, since we didn’t bother putting the deck together and playing against it until virtually the last minute before the Pro Tour.

Pitfall #2 — Inbred Testing

A big part of how we ended up in the position we’re in is the pitfall of inbred testing. This is what happens when a group’s playtesting turns inward and involves playing one team deck against another rather than against the expected tournament gauntlet. It happens a lot because people are generally more interested and invested in playing a deck they’re actually considering piloting in the tournament than in piloting a stock list.

It’s very important to have someone take on the role of “the enemy,” however, and preferably someone who has at least some interest in playing that deck in the tournament. Not only do you tend to get more accurate results when players are piloting decks they’re invested in, since they’re actively looking for ways to win every game rather than just playing the role of testing dummy, but you also learn more about the matchups you’re most likely to face in the event.

In this particular case, our testing suffered because I had been working on a deck somewhat similar to the other deck—the one I’d advocated putting together, which we only got around to testing against last night. I was working on testing and tuning my deck, and played a great deal against people piloting the deck the rest of the team had been focused on.

What we ended up with was a lot of insight into how our two decks match up against one another… but not much else. There’s something to be said for the value of any testing, especially in a format like Block, where there’s a lot to be explored and big edges to be gained from any small discovery. The problem arises when you make deck and card choices based on that testing. We ended up tuning each deck to have the right answers to the threats posed by the other, which left both of them poorly positioned to deal with decks that attack from a different angle.

Pitfall #3 — Insufficient Testing with Sideboards

This is one of the biggest pitfalls of testing for any tournament, and frankly it’s somewhat shocking to me that it’s one we’re still running into. People constantly underestimate the impact of sideboards. Only the first game of each match is played with maindecks alone, so over the course of a tournament, unless you win or lose every single match in two games, you’re going to play more games involving your sideboard than you will with just your maindeck. This means that your deck’s win percentage in sideboarded games has a bigger impact on your overall success than your ability to win with your maindeck! And yet when people test, they almost exclusively play pre-sideboard games, treating sideboard testing as an afterthought or leaving it until extremely late in the process.

The importance of testing with sideboards is magnified for decks that are particularly helped or harmed by particular sideboard cards. If you were to test Dredge in Legacy, for instance, and you only ever played maindeck games, you could very quickly come to vastly inaccurate conclusions about how powerful the deck is in the format. Similarly, players too often dismiss decks as poor choices because they have a low game 1 winning percentage against popular decks.

My favorite example of this is Tinker from the World Championships in 2000. Most players going into that tournament didn’t consider Tinker a viable deck because it matched up very poorly against Replenish: Tinker relied on individually powerful artifacts and mana denial, neither of which lined up well against a deck with Seal of Cleansing and Frantic Search. Rather than give up on the deck, I worked on it extensively with Dan OMS and discovered that a sideboarding plan featuring Miscalculation, Annul, and Rising Waters could completely turn the matchup around. The resulting deck took both Jon Finkel and Bob Maher to a mirror match in the finals, earning Jon the World Championship and Bob the title of Player of the Year. Not bad for a deck that most people ignored because they didn’t work hard enough on a sideboard.

For this tournament, we only started playing sideboarded games in earnest yesterday. What we discovered was that our frontrunner deck was extremely vulnerable to some very common sideboard cards from many of the decks we expected to be popular. Not only that, but our own sideboard options seemed relatively weak in comparison. We did a great deal of exploration, but we didn’t manage to find anything our deck could actually support that had nearly the same kind of impact.

While sideboards are frequently the product of last-minute decisions and untested theory, the best sideboards are those that are carefully planned out and playtested—and the best decks are usually the ones with the best sideboard plans. Don’t make the mistake of putting off your sideboard testing until the last minute, because you might not end up with enough time to figure things out.

That’s it for this week. We have only a single day left to figure out everything we should have worked out over the past week, so you’ll have to excuse me as I cut things a bit short and get back to playtesting. Be sure to tune in to the live coverage this weekend to see if we managed to pull things together!

Until next time,

bmk