Solving the Sealed Deck Debate with Science: The Results!

If you didn’t read Part 1 of this article, go read what I’m trying to accomplish here, or this won’t be very interesting!

So, it’s been over two weeks since we cracked five Sealed decks and handed each of them to four different people to play in their own Pods. Here is the “common knowledge” that we were trying to test:

1. Some Sealed decks are so bad they are essentially worthless
2. Some Sealed decks are so good they are almost unbeatable
3. Some decks are easy to build, while others are extremely difficult
4. Sealed deck tournaments are very bomb-dependent. (“Whoever draws their bombs most, wins.”)

And here are the questions we wanted to answer:

1. How distinct is the power level of a Sealed deck?
2. How obvious is a good Sealed deck vs. a bad Sealed deck?
3. How obvious is the “correct” build of a Sealed deck, if such a thing exists?
4. Can a good player win despite their Sealed deck, or is the deck too important?
5. Most importantly: what is the relationship between the cards you open and your chances for success?

In addition, I surveyed the players before they played out their matches about their own expectations for the deck. I asked them:

1. How good do you think this pool is, scale of 1-10? Why?
2. Was this pool hard to build? (1-10, 10 being impossible) Why?
3. How similar do you think other players’ versions of this deck will be? (guess a number of overlap cards and/or colors)
4. Do you like this pool? (totally subjective)
5. What will your record be in your four Swiss rounds?

The answers from the builders were fascinating, so let’s start with those:

Deck 1

Matt Ferrando, Jon Becker, Hashim “No Answers” Bello, Jamie Parke
Order in which I would pick these 4 players in a draft for my life: Parke, Becker, Ferrando, Bello
Good: 8.3; 8+; n/a; 7 (Average: 8)
Hard: 2.1; 4; n/a; 4 (Average: 3.4)
Similar? 18 spells overlap; 21 spells overlap; n/a; 20 spells overlap (Average: 19.7)
Likable? Yes; Yes; n/a; Seems Solid
Record? 3 or 4 wins; 3 or 4 wins; n/a; 2 or 3 wins (Average: 3.16 wins)
Actual Record: 1-3; 3-1; 1-3; 3-1 (Total: 8-8)

Comments from players:

Jamie: “I’m worried about being too slow, with zero Myrs, but I have lots of good removal which could make up for it.”

Matt: “Not having mana Myr made it easier for me to build, because I actually don’t like them in this format.”

Jon: “Not particularly difficult to build, but the last two cuts, as always, took a while.”
Hashim: “There was a survey?”

For the most part, the players of this pool predicted success and were happy with their pools. The average win prediction was over three, meaning one or fewer losses.

In fact, the two most experienced players achieved 3-1 records. Matt Ferrando failed to meet expectations, and he and Hashim squeaked out only one win apiece, leaving this deck’s actual average record at exactly 50%. However, if you look at the way the decks were built, you’ll notice that one of those losing decks was built very differently from the rest:

Hashim chose the giant green monsters over the strong (but not terribly deep) white cards. A vanilla 6/5 in this format can certainly play the role of fill-in bomb, and I can see the logic with all the red removal: just kill guys and have big stuff to cast in the late game! But it didn’t turn out well.

On the other hand, Boros builds finished 7-5, and they were all pretty similar.

So what have we learned? Well, certainly people seemed to think this deck was an obvious build, and yet someone strayed from the “obvious” plan (although twenty cards in this pool appeared in at least three of the four decks, which seems like a lot of consensus).

Also, this deck only has one real bomb, the Sunblast Angel, but it still posted a respectable record, and players seemed happy to have this one-bomb deck.

Not terribly telling: let’s move on to the next deck.

Deck 2

Tony Tsai, Tim McKenna, Luis Nieman, Eric Smith
Order in which I would pick these four players in a draft for my life: Tony, Luis, Tim, Eric
Good: 6; 8; 8; 7 (Average: 7.25)
Hard: 2; 2; 3; 2 (Average: 2.25)
Similar? 19 spells overlap; “pretty close” (20?); 21?; 18.5 spells overlap (Average: 19)
Likable? n/a; Yeah; yes; Seems Solid, Solid
Record? 3 wins; 3 wins; 3 or 4 wins; 3 wins (Average: 3.12 wins)
Actual Record: 2-2; 0-4; 2-2; 2-2 (Total: 6-10)

Comments from players:

Tony: “The lack of enough Shatter-type spells makes it seem likely to get blown out by bomb non-creatures.”

Tim: “Obviously, white will be played. The main difference will be how people will beat down. You could put more little guys and equipment in the deck.”

Luis: “I also have good sideboard options: 2 Dispense Justice (if I’m the control deck), Soliton + Heavy Arbalest, Kemba, Kha Regent + equipment, etc.”

Eric: “It has two playable ‘Bomb’ cards, along with a very strong card in the Precursor Golem and solid removal in the Arrests/Dispense Justice. The drawbacks are that there isn’t that much metalcraft synergy, and there is no permanent targeted removal.”

Again, everyone seemed happy with the pool and predicted success: over three wins each!

The results, however, were less sterling: 2-2 from everyone except for Tim, who put a blight on this deck’s results with his 0-fer. He was the only one to play Soliton with his Arbalest, and the only one not to splash green for Sylvok Replica.

Additionally, two players decided to play red as their secondary color purely for Embersmith, forgoing Luis’ very clear argument for blue as the support color. Another interesting card here is Dispense Justice, which Eric Smith embraced enthusiastically by starting both, but which was completely relegated to the sideboard by everyone else. Have a look:

Now, everyone thought this deck was very easy to build, and that the overlap would be high. In fact, they were right: the white base deck had twenty-two cards that were in at least three of the four builds, and the white was so deep that the differing choice of complementary colors didn’t seem to matter much.

But everyone who opened this deck also thought the deck was a winner, predicting at least three wins each… and not one person made it to three!

So: how obvious is a good or bad deck? I certainly thought this was a good deck, but none of these players had better than mediocre performances with it. With triple-Arrest in every build, perhaps we are overvaluing this card? Or are we overvaluing the rares in this pool? I think this may be related. Both Steel Overseer and Precursor Golem are particularly easy to deal with, as bombs go. Multiple commons can stomp on these cards before they can have any impact on the game whatsoever.

Chimeric Mass is similarly fragile. It’s immune to Arrest, yes, but a real dog to popular uncommon Glimmerpoint Stag, and easily chump-blocked by the myriad token creatures floating around the format.

I think the results for deck 2 are quite interesting: let’s see what’s next.

Deck 3

Chris “the mute” Manning, Dan O’Mahoney Schwartz, Chris Pikula, Gaudenis Vidugiris
Order in which I would pick these 4 players in a draft for my life: Gaudenis, Dan, Pikula, Manning (extremely difficult: Gau is obviously the most active currently, Dan and Pikula are very close, and Manning is extremely reliable and has done great in this format)
Good: n/a; 5; 7; 5 (Average: 5.67)
Hard: n/a; 4; 2; 5 (Average: 3.67)
Similar? n/a; 11 or 21 spells overlap (35% infect builds); 21; 11 or 21 spells overlap (2 different builds) (Average: 17.67)
Likable? n/a; not too happy; I think so; not really
Record? n/a; 2 wins; 2 or 3 wins; 2 wins (Average: 2.16 wins)
Actual Record: 4-0; 3-1; 2-2; 1-3 (Total: 10-6)

Comments from players:

Chris Manning: “I went 4 – 0, didn’t lose a game. Also, I didn’t answer any of your survey questions, which sucks for you because it would have been fascinating to know if I thought the deck was that good beforehand.” (Actually, he only said the first part)

Dan: “Every archetype feels like it falls short by one or two cards. There are no combat tricks or good equipment for the solid suite of poison creatures, no strong way to capitalize on the easy-to-get metalcraft, and of course, no creature removal! On the bright side, there is Mimic Vat, a solid suite of artifact removal, and tons of ways to get two-for-ones. I’ll build the W/G/r deck with hopes that if I do play a bomb-laden deck, they will be artifacts that I can blow up, and just hope it isn’t something like Hoard-Smelter Dragon…. can see a 4-0 if I avoid non-artifact bombs, or even an 0-4 if people have colored bombs and decent poison decks.”

Chris Pikula: “It has one bomb, but it’s an artifact, of course, and otherwise the ways to win are not great. This has a lot of removal, but will have trouble with non-artifact creatures.”

Gaudenis: “It seemed like the non-infect options are just not powerful enough. White with either green or red is fine, but not all that impressive… Unless I’m under-rating how good infect is I think it’s decidedly average in most regards, so not really, but I’m really glad I will get to try this before the Grand Prix.”

This deck is particularly interesting, and not just because one player built a totally separate deck and then appeared to demonstrate that his choice was quite bad: Gaudenis specifically wanted to experiment with poison in Sealed, and this was a good opportunity. He posted the worst record of anyone with this deck at 1-3… but interestingly, his single victory came against the consensus Best Deck!

This makes some sense, as poison’s speed and synergies can often make bombs and card quality less relevant. In fact, two other players with this deck also beat the consensus Best Deck: and for the player who lost to deck 5, it was his only loss (Dan vs. Eric Tam).

The builds here are interesting, so let’s have a look:

Three people built base W/G decks splashing a little red, and Gau tried out poison. How to explain this deck out-performing expectations? Everyone expected to have about two wins with it, but it averaged 2.5, even with Gaudenis dragging the average down. Is this deck better than it looks?

I think it might be, as it has solid removal, one of the best and cheapest bombs in the format (Mimic Vat), and some excellent uncommons (Razor Hippogriff, Acid-Web Spider, Golem Artisan, Slice in Twain, and Glimmerpoint Stag are all excellent). The average rating here was under six: I think the deck is almost certainly a seven or an eight, but it also probably benefited from quality pilots. I assigned decks to players randomly… and deck 3’s worst player is a PTQ ringer, while its other pilots include two Hall-of-Fame vote-getters and an active Pro who is very highly respected.

Deck 4

Paul Jordan, Jake Van Lunen, Brook North, Adam Reubens
Order in which I would pick these 4 players in a draft for my life: Jake, Paul, Brook, Adam
Good: 6; 6; 5.5; 6.5 (Average: 6)
Hard: 7; 7; 8; 5 (Average: 6.75)
Similar? 19 spells overlap; 22 spells overlap; 13.5 spells overlap; 20 spells overlap (Average: 18.5)
Likable? Meh; No; No; It’s alright
Record? 2 wins; 2 wins; 1 win; 1 or 2 wins (Average: 1.61 wins)
Actual Record: 2-2; 0-4; 2-2; 1-3 (Total: 5-11)

Comments from players:

Paul, in response to “Why was it hard to build?”: “They all are.”
Jake: “I like pools where I can build a real deck with a goal. There wasn’t a clear plan for the pool and I felt like it had a lot of issues. Even the Trigon of Rages aren’t great in a pool like this.”
Brook: “I wanted to play the white, especially Tempered Steel, but I don’t think you have the… artifact creatures to support it. Everything I looked at was pretty janky. I’m particularly agonizing over the Furnace Celebration. I think it’s a play in this deck…”
Adam: “It’s all right. It doesn’t do anything particularly well. It’s a slow deck, but it can’t take over the late game like many slow decks can.”

Our first unpopular pool, though everyone still thought it was slightly above average. This is a very interesting deck in that it has a large collection of cards you are happy to see in your pool, but nothing that ties it together. Many players made comments about wanting to play the Tempered Steel or Volition Reins to up the power level, but most ended up just playing all their R/B removal; the only exception was Paul, who ran blue instead of black. Brook tried to get something special going by playing the Furnace Celebration, despite not quite having the resources you would want. Personally, I would have liked to see someone try the big green monsters with the solid removal.

Here are the builds:

The 75% or better overlap here is twenty spells, just about what was predicted by everyone. Twelve cards overlapped in all four decks.

You might use this deck’s poor results to defend the theory that bombs are essential to winning in Sealed deck, but notice that everyone also rated this deck as harder than average to build. Perhaps the lack of bombs made it hard to focus in on a good build, and perhaps there was a B/G deck splashing red for Galvanic Blast and double-Shatter that would have had the whole package of great removal and plenty of big finishers.

Still, no one seemed excited about having this deck, and the results aren’t going to change any minds. I think this is definitely evidence of the “correct” build for a Sealed deck sometimes being non-obvious. Jake was very upset with his results, in particular, and blamed it not on the pool itself but on his own failure to recognize the weaknesses of the obvious R/B build.

Let’s see what happens when everyone is extremely happy with their pool:

Deck 5

Marshall “Can’t be bothered” Reaves, Eric Tam, Tom Martell, Mark Schmit
Order in which I would pick these 4 players in a draft for my life: Tom, Eric, Mark, Marshall
Good: n/a; 8; 10; 9 (Average: 9)
Hard: n/a; 6; 3; 3 (Average: 4)
Similar? n/a; 17 spells overlap; 18 spells overlap; 20-24 spells overlap (Average: 18)
Likable? n/a; Yes; Best moment of my life; Yes
Record? n/a; 3 wins; 4 wins; 3 wins (Average: 3.33 wins)
Actual Record: 1-3; 4-0; 3-1; 3-1 (Total: 11-5)

Comments from players:
Marshall: “When are you going to do your Sealed deck event?” (Then I invite him and he can’t even answer my survey… typical!)
Eric: “The reason this pool is not a 10… is that the ‘obvious’ white/red build, while containing potent removal, does not contain enough very fast elements to support a true beatdown deck and is a bit light on middle-game/defensive creatures. And the alternate white/blue build is similarly not powerful enough to provide for a 10 control deck.”
Tom: “LOL at this pool… can I use it in Nashville? :)”
Mark: “Color choice was obvious. The hard part was figuring out how many Myr, how many lands, and the last few cards to cut.”

Everyone thought this deck was good, and the results backed that up, for the most part.

Let’s get started by looking at the builds: I had assumed that everyone would build this W/R, but Marshall, who posted the only non-winning record with this deck, had other ideas. He also didn’t answer my survey questions, so we’ll just have to guess at his thought process.

Yes, that’s right: three Mythic Rares with four on-color Myr and five solid pieces of removal in W/R…. and yet somehow, Mr. Reaves was so greedy that he wanted to add another color! Certainly, there is a school of thought that with so many Myr and so many artifacts, you can play an extra color and not get punished for it… but with a deck this good, why are we taking chances?

I’m glad he did this, though: it shows that even very powerful and seemingly obvious Sealed pools get mis-built sometimes. This sealed pool is definitely top 10% of all pools, and probably top 5%…. but even if it were top 2%, that wouldn’t guarantee anything for the person who opened it. The temptation of having no 3-core spells in any color, the optimism brought on by the ubiquitous on-color Myr, the desire to do something cute, the fear of an even more powerful deck… it’s easy to talk yourself into sabotaging a great pool.

So the next time you see a pool like this at table 1, don’t be so quick to dismiss the player: they almost certainly had an opportunity to screw it up!

But what of the other three players? They overlapped by over twenty spells on average, and had a combined record of ten wins and two losses: the build seemed pretty easy and the results were sterling. Tom lost to Chris Pikula’s W/G/r removal-heavy deck, and Mark Schmit lost to Gaudenis’ poison deck, the only poison build of any of the decks in any pod. It’s too small a sample size to know if poison is the antidote to overpowered W/R decks or if that was just a fluke.

If you want to make an argument that there are some Sealed decks that are essentially free, obvious tickets to a winning record with minimal work, this deck is a good starting point.

What were the final records of all the decks, though?

Deck 5: 11-5
Deck 3: 10-6
Deck 1: 8-8
Deck 2: 6-10
Deck 4: 5-11
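
Those totals are easy to double-check by re-adding the individual records from each deck's "Actual Record" line. Here is a minimal Python sketch of that arithmetic; the names `pilot_records` and `deck_totals` are just illustrative labels, not part of how the results were originally tracked.

```python
# Per-pilot (wins, losses), copied from the "Actual Record" line of each deck.
pilot_records = {
    1: [(1, 3), (3, 1), (1, 3), (3, 1)],
    2: [(2, 2), (0, 4), (2, 2), (2, 2)],
    3: [(4, 0), (3, 1), (2, 2), (1, 3)],
    4: [(2, 2), (0, 4), (2, 2), (1, 3)],
    5: [(1, 3), (4, 0), (3, 1), (3, 1)],
}


def deck_totals(records):
    """Sum wins and losses across the four pilots of each deck."""
    return {
        deck: (sum(w for w, _ in results), sum(l for _, l in results))
        for deck, results in records.items()
    }


totals = deck_totals(pilot_records)

# Order the decks from most wins to fewest.
standings = sorted(totals.items(), key=lambda item: item[1][0], reverse=True)
for deck, (wins, losses) in standings:
    print(f"Deck {deck}: {wins}-{losses}")
```

As a sanity check, total wins and total losses across all five decks should be equal, since every match in a pod has one winner and one loser.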

Deck 5 barely out-performed Deck 3, a deck which received a 5.67 average rating from those who received it: the worst of any deck! Deck 1 averaged an 8 rating, and finished at 50%. Deck 2 was over 7, and finished 2 games under .500.

Of course, it should be noted that every deck was rated as above-average by those who received it. Is this grade inflation, a particularly strong set of pools, or simple optimism? Hard to say. The predicted W/L records suggest it’s one of the latter two: on average, players predicted 2.75 wins for themselves. That could be a sign of a pool of players with a lot of confidence in themselves or a strong collection of packs: I think it’s the former, even if the latter is contributing a bit.

What can we say about our common wisdom?

1. “Some Sealed decks are so bad they are essentially worthless.”

I’m not convinced of this yet…. though Deck 4 presents a good argument, with a 0-4 and no one able to produce a winning record. But this pool also had one of the most promising un-tried builds, with B/G/r removal and dinosaurs.

Deck 3, which people were pretty lukewarm on but performed well, is a good argument for giving decks that don’t look so great the opportunity to shine.

2. “Some Sealed decks are so good they are almost unbeatable.”

We’re back to Deck 5, of course, but it’s far from unbeatable! And certainly, the builder is capable of beating his own excellent pool. You may not have to beat all the best pools in the room to be successful, because they don’t build (or play) themselves.

3. “Some decks are easy to build, while others are extremely difficult.”

I think this is mostly true. Almost every deck had three players build very similarly and one player take a different tack, and usually the consensus build finished better.

Not always, though! The important thing to observe here is probably that the easy build is not necessarily the best build.

4. “Sealed deck tournaments are very bomb-dependent.”

Certainly, the bombiest deck turned in the best results. But it also had some of the best removal! And the deck that finished second had only one (very fragile) bomb. I’m tempted to say that quality uncommons are just as important as bombs.

And our direct questions:

1. “How distinct is the power level of a Sealed deck?”

Deck 5 defeated Deck 1 in every pod. Deck 1 defeated Deck 4 in every pod. Those were the only sweeps of the ten deck matchups, and Deck 1 was involved in both.

The transitive matchup wasn’t far from expectations, though: Deck 5 only lost to Deck 4 once, and that was Gaudenis’ anomalous poison build.

The records of these three decks? 11-5, 8-8, 5-11. That sure looks like three distinct tiers of deck quality.

2. “How obvious is a good Sealed deck vs. a bad Sealed deck?”

The worst deck was pretty readily identified as bad, getting a lot of “meh” responses from its builders. On the other hand, the deck that was actually rated the worst of the bunch finished one game behind the leader! I think that some decks are obvious and some are very easily misjudged.

3. “How obvious is the ‘correct’ build of a Sealed deck, if such a thing exists?”

In many cases, it seems pretty obvious. The average spell overlap in these decks was between eighteen and nineteen spells out of about 24: 80% or so.

Of course, this format is going to generate a ton of extra overlap with all the colorless cards and the predominance of White and Red in the sealed environment. I think for a lot of pools, the best options are elusive. Eighty-four cards is a lot to choose from, and we need more than four people per deck to really see the possibilities.

4. “Can a good player win despite their Sealed deck, or is the deck too important?”

No one really made hay with Deck 4, but Deck 3 seemed to outperform expectations pretty strongly with an impressive array of pilots, even with one of them experimenting with his build. I do think this collection of Sealed pools lacked a genuine stinker that would have received sub-five ratings and given us a look at what you can do when the packs really aren’t helping.

5. “Most importantly: what is the relationship between the cards you open and your chances for success?”

Deck quality assessment ranking, from best to worst:
5, 1, 2, 4, 3

Deck predicted average record, from best to worst:
5, 1, 2, 3, 4

Deck actual record, from best to worst:
5, 3, 1, 2, 4

Really, the only anomaly here is deck 3, jumping from the tail positions to second place. Literally every other deck stays in order. This is a significant argument for much of the cynical common wisdom about Sealed deck, with a little dash of hope in there.
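
If you want to put a rough number on how well those orderings agree, one option is a rank correlation. The Python sketch below uses Spearman's formula for tie-free rankings; that statistic is my own choice for illustration and was not part of the original experiment, and with only five decks it is descriptive at best. The helper names `spearman` and `without` are made up for this example.

```python
# Rankings from best to worst (deck numbers), copied from the three lists above.
rated = [5, 1, 2, 4, 3]       # survey quality rating
predicted = [5, 1, 2, 3, 4]   # predicted average record
actual = [5, 3, 1, 2, 4]      # actual record


def spearman(order_a, order_b):
    """Spearman rank correlation for two tie-free rankings of the same items."""
    n = len(order_a)
    pos_a = {deck: i for i, deck in enumerate(order_a)}
    pos_b = {deck: i for i, deck in enumerate(order_b)}
    d_squared = sum((pos_a[deck] - pos_b[deck]) ** 2 for deck in order_a)
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


def without(order, deck):
    """Drop one deck from a ranking, keeping the others in their original order."""
    return [d for d in order if d != deck]


print(round(spearman(rated, actual), 2))
print(round(spearman(predicted, actual), 2))

# With Deck 3 removed, all three orderings are identical.
print(without(rated, 3) == without(predicted, 3) == without(actual, 3))
```

Under these assumptions the quality ratings correlate with the actual finish at about 0.4 and the predicted records at about 0.7, and the last line confirms that Deck 3 is the only deck out of order.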

I’d love to repeat this experiment on a larger scale, but I will point out that keeping track of everything and compiling results is pretty time-consuming: it probably needs to be done in some cleaner way than disparate email threads.

So what do these results say to you? Are you encouraged or discouraged by them?

In two recent sealed PTQs, I opened decks that did not look top-8 worthy to me. Three weeks ago, I had a lot of solid white and red cards, but not a huge amount of removal (four various spells) nor any real bombs (Oxidda Scrapmelter and a Precursor Golem that I never cast were my best cards). I did have three Glint Hawks, though, and solid spells top-to-bottom. I didn’t lose a game until the first round of the top 8, where I lost two.

The next week, I had what looked like a mediocre pool with little removal, but plenty of solid cards again. It was blue and white with ample flyers, a Volition Reins, and solid filler. I went 6-1-1, playing some of the best Magic I’ve played in a long time to get there, but finished ninth. The blue was not an obvious choice, but I’m convinced it was the right choice.

Earlier in the season I had the dream poison deck, with two infect rares and all the fixings, but I misbuilt it (getting too ambitious with splashing when I saw that I didn’t need much black mana) and had to sideboard eight cards after every game 1. I started 4-0 anyway, but the bad build and some bad play had me dropping at 4-2.

For me, those experiences definitely point to a greater range of possibilities in each Sealed deck than people commonly perceive. Sure, it’s fun to have the insane open sometimes, but I think it’s far more enjoyable to show off your top 8 deck and have people say, “How’d you manage that?”

Sometimes, it’s not as hard as it looks…