This is a study I did during the first quarter of my graduate life. I love gaming, and I have played World of Warcraft for nearly four years. It was a homework project, so I probably didn't think some parts through thoroughly, but I still think it can form a basis for theorycrafting on the aforementioned subject.
First of all, if I remember correctly, this study was done during the Firelands era of Cataclysm (Patch 4.2), so everything here is subject to change, as WoW transforms constantly.
What I did was use a third-party service that analyzed the power of each specialization tree (a.k.a. spec) in the game. They went through raid logs (provided to them by participating players) and investigated boss kills and the effectiveness of each class and its specs in those kills. It must be pointed out that this is, by itself, a huge bias, as only players who actually upload their logs are included (Blizzard is very reluctant to share data regarding their games).
After that, I used another third-party service, which ran polls to estimate how many players play each class and spec on each individual server. I summed these up to create aggregate data (just for US and EU servers). Before I step into any mathematical stuff, I'll quote my brief introduction about 5-man dungeons here:
In WoW, there are 10 player classes, each of which has 3 specializations. These classes and their specializations (which are called talent trees, or specs) are given in the table below. The important thing to note about these specializations is that there are three roles to be fulfilled in a 5-man dungeon group: Tank, Healer and DPS (which stands for Damage per Second, a common abbreviation for damage dealers).
Each of these roles is marked with its first letter in capitals next to each spec in the table below. The tank's role is to be the front-man of the group, attracting the attention of the computer-controlled monsters so that they do not damage players who die easily, such as healers or lightly armored damage dealers. Tanks generally have a very high armor value to survive and have special abilities to draw monsters to themselves. The healer's duty is mending the tank and keeping the other players alive. If the tank dies, the rest of the group will almost certainly die as well, as the other players cannot cope with the high damage dealt by the monsters. In addition, there may be cases where the healer is also required to heal the other players, depending on the strategy followed by the group. The simplest but most important role is the damage dealer. Its sole duty is to make the monsters die before the tank dies or before the healer is unable to heal the group. For this reason, each group is assumed to have three damage dealers.
The letters in brackets refer to Tank (T), Healer (H) and DPS. Summing the list up, there are 4 tank specs, 5 healer specs and 21 DPS specs, and the standard composition of a 5-man group is one tank, one healer and three DPS players. This gives us a total of 4 × 5 × 21³ = 185220 different team combinations.
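The 185220 figure can be checked with a few lines of Python. This is only a sketch of the arithmetic; it assumes (my reading, not stated in the text) that the three DPS slots are counted as ordered and that the same spec may appear more than once.

```python
# Role counts taken from the text: 4 tank specs, 5 healer specs, 21 DPS specs.
TANK_SPECS = 4
HEALER_SPECS = 5
DPS_SPECS = 21
DPS_SLOTS = 3

# Assumption: each DPS slot is filled independently (ordered, repeats allowed),
# so the DPS part contributes 21 * 21 * 21 possibilities.
team_combinations = TANK_SPECS * HEALER_SPECS * DPS_SPECS ** DPS_SLOTS
print(team_combinations)  # 185220
```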
As there are three groups of players, each random sample consists of a random tank, a random healer and three random DPS players. The probability of picking a spec is found by dividing the number of players playing that spec by the number of players that can fulfill the same role. The probabilities are found by the formulas below. So the probability of choosing a Death Knight Tank (Blood Spec) is...
and so forth.
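The sampling rule described above (players of a spec divided by all players in that role) can be sketched in Python as follows. The player counts are made-up placeholders, not the poll data, and the helper names `spec_probabilities` and `draw` are hypothetical.

```python
import random

# Hypothetical player counts per tank spec (illustrative only, NOT the poll data).
tank_counts = {
    "Death Knight (Blood)": 1200,
    "Druid (Feral)": 800,
    "Paladin (Protection)": 1500,
    "Warrior (Protection)": 1000,
}

def spec_probabilities(counts):
    """Probability of each spec = players of that spec / players in that role."""
    total = sum(counts.values())
    return {spec: n / total for spec, n in counts.items()}

def draw(counts, k, rng):
    """Draw k specs independently, weighted by their player counts."""
    specs = list(counts)
    weights = [counts[s] for s in specs]
    return rng.choices(specs, weights=weights, k=k)

tank_probs = spec_probabilities(tank_counts)
rng = random.Random(42)  # fixed seed so the draw is repeatable
print(tank_probs["Death Knight (Blood)"])
print(draw(tank_counts, 1, rng))
```

The same two helpers would be applied to the healer counts (with `k=1`) and the DPS counts (with `k=3`) to build one random team.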
10000 samples were created (if I did this again, I would probably use over a million samples, because the set of possible teams is bigger than 185k, as I stated above), and the first ten are listed below for reference. The average of the power coefficients per team is calculated and listed as Probability of Winning in the table below. Over all 10000 samples, the average Probability of Winning was 0.778. Considering that the dungeons could be completed by a below-average group, 0.776 (chosen by me without any basis, a big mistake) was used as the success-or-failure criterion. The sample contained 5637 successful runs and 4363 failed runs.
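A minimal sketch of this simulation loop in Python might look like the following. All spec scores and player counts are placeholders I invented for illustration, so the output will not reproduce the 0.778 average or the 5637/4363 split; only the 0.776 threshold and the 1 tank / 1 healer / 3 DPS layout come from the text. Treating a score equal to the threshold as a success is also an assumption.

```python
import random
from statistics import mean

# Placeholder spec scores ("power coefficients") and player counts per role.
# These numbers are invented for illustration, NOT the data used in the study.
ROLES = {
    "tank": {"scores": [0.80, 0.76, 0.78, 0.74],
             "counts": [1200, 800, 1500, 1000]},
    "healer": {"scores": [0.79, 0.75, 0.81, 0.77, 0.73],
               "counts": [900, 700, 1100, 600, 800]},
    "dps": {"scores": [0.70 + 0.008 * i for i in range(21)],
            "counts": [500 + 40 * i for i in range(21)]},
}
SLOTS = {"tank": 1, "healer": 1, "dps": 3}
THRESHOLD = 0.776  # success criterion quoted in the text

def sample_team_score(rng):
    """Draw one tank, one healer and three DPS; return the mean spec score."""
    picked = []
    for role, k in SLOTS.items():
        data = ROLES[role]
        picked += rng.choices(data["scores"], weights=data["counts"], k=k)
    return mean(picked)

def simulate(n, seed=0):
    """Return the list of team scores and the number of successful runs."""
    rng = random.Random(seed)
    scores = [sample_team_score(rng) for _ in range(n)]
    successes = sum(s >= THRESHOLD for s in scores)  # assumption: ties count as success
    return scores, successes

scores, successes = simulate(10_000)
print(mean(scores), successes, len(scores) - successes)
```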
Below are the first 10 samples in the set of 10k. The Probability of Winning for a team is the average spec score of all its members (which is also a mistake: contrary to my previous statement, tanks and healers are more critical to success and should therefore carry a greater weight).
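One way the weighting mentioned above could be done is a weighted average instead of a plain one. The sketch below is purely illustrative: the function name and the weights (2 for tank and healer, 1 for each DPS) are arbitrary assumptions, not values from the study.

```python
def weighted_team_score(tank, healer, dps, w_tank=2.0, w_healer=2.0, w_dps=1.0):
    """Weighted mean of spec scores; the default weights are arbitrary assumptions."""
    total_weight = w_tank + w_healer + w_dps * len(dps)
    weighted_sum = w_tank * tank + w_healer * healer + w_dps * sum(dps)
    return weighted_sum / total_weight

# Strong tank and healer, weak DPS (made-up scores).
plain = weighted_team_score(0.9, 0.9, [0.6, 0.6, 0.6], 1.0, 1.0, 1.0)
weighted = weighted_team_score(0.9, 0.9, [0.6, 0.6, 0.6])
print(plain, weighted)
```

With equal weights the function reduces to the plain average used in the study; with heavier tank and healer weights, a team carried by a strong tank and healer scores higher than the plain average suggests.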
Although the data gets more skewed as the bin size increases (there are other figures that I did not post here), Figures 1 and 2 are relatively close to a normal distribution. The variance of the data is 0.00165, the standard deviation 0.04065, and the average 0.77782. Knowing these values, expected values for the Goodness-of-Fit test were generated. They were created by building a normal distribution with the mean and standard deviation of the sample, multiplying the resulting values by 10000 and rounding them, as an observed number of events cannot have non-integer values. The χ² value derived from the Goodness-of-Fit test was 0.25. Using α = 0.001 and ν = 40 (the histogram used in the Goodness-of-Fit test had 41 bins, minus 1 as instructed by the formula), the corresponding critical χ² value was 70.7029. Since 0.25 is far below that, the hypothesis that the data follow a normal distribution cannot be rejected at the 0.001 significance level.
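The Goodness-of-Fit procedure described above can be sketched in Python using only the standard library. This is not the original calculation: the function name is hypothetical, the data are synthetic draws from a normal distribution with the mean and standard deviation quoted in the text, the expected counts are left unrounded, and the two outermost bins absorb the tails so that the expected counts sum to the sample size.

```python
import random
from statistics import NormalDist, mean, pstdev

def chi_square_normal_fit(data, n_bins):
    """Return (chi-square statistic, n_bins - 1) for a normal fit to data."""
    fitted = NormalDist(mean(data), pstdev(data))
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins

    # Observed counts per equal-width bin; the maximum goes into the last bin.
    observed = [0] * n_bins
    for x in data:
        idx = min(int((x - lo) / width), n_bins - 1)
        observed[idx] += 1

    chi2 = 0.0
    for i, obs in enumerate(observed):
        left, right = lo + i * width, lo + (i + 1) * width
        # Assumption: the first and last bins also cover the lower/upper tail.
        p_left = 0.0 if i == 0 else fitted.cdf(left)
        p_right = 1.0 if i == n_bins - 1 else fitted.cdf(right)
        expected = (p_right - p_left) * len(data)  # left unrounded in this sketch
        chi2 += (obs - expected) ** 2 / expected
    return chi2, n_bins - 1

# Synthetic stand-in for the 10000 team scores (mean and sd taken from the text).
rng = random.Random(0)
data = [rng.gauss(0.77782, 0.04065) for _ in range(10_000)]
chi2, dof = chi_square_normal_fit(data, 41)
print(chi2, dof)
```

The statistic is then compared against the critical χ² value for the chosen α and ν; if it is smaller, normality is not rejected.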
So yeah, that's it. No more conclusions. I guess I could have worked on this a little more, but the "WoW, this was completely different than what I expected, and also awesome" comment on my paper when I received it back convinced me I had done enough work for a homework project =)