
Hi all,
I've been working for a while now on some serious nerdery, and I finally have some results to post. As the recently appointed coach for the Danish national team, I created a rather elaborate Excel sheet in order to analyze the EuroBowl meta. Once it was created, I figured that I might as well use it to scrutinize the NAF resurrection meta in general.
Before I get started I should probably say that I'm not a proper statistician. I know that when working with performance averages, one should list a range rather than a single number. Unfortunately, while I can calculate individual ranges, I don't know how to add a list of ranges together, so here I've just posted the averages. I'm also working with CRP stats rather than BB2016 stats - and the stats no doubt originate from tournaments using all kinds of different rules. If you can stomach all that, read on.
0) First, I took the NAF data for racial match-ups from Ian Williams' (doubleskulls) site: naf.talkfantasyfootball.org/lrb6 (last updated in November 2016), and I entered those win percentages into my elaborate spreadsheet, representing roughly 113.500 games played.
The results/win-percentages are from LRB6/CRP NAF tournaments. Presumably mainly resurrection rules and 1.100K. Some of the stats are from tournaments using tier-bonuses, but I have no way of isolating those. Tier tweaked tournaments will mean that the strongest teams will look slightly weaker than they are, and weak teams will look slightly stronger.
First table is straight up win percentages (ties count as half-a-win).
The second column (not repeated in any of the following lists) shows how large a percentage of the total number of games were played by each race.
Race         Win-% in Swiss   Share of games
Wood Elf     55,9             7,2%
Undead       55,5             8,1%
Lizardmen    54,2             5,9%
Dark Elf     54,1             6,8%
Amazon       53,2             3,8%!
CDs          52,3             5,5%
Norse        52,1             6,4%
Elf          51,8             2,5%!
Dwarfs       51,5             7,0%
Skaven       51,3             6,6%
Necro        51,3             4,6%
----------------------------
High Elf     49,5             2,0%
Khemri       48,3             2,0%
Orcs         47,8             8,7%!
Pact         47,6             2,2%
Human        47,1             4,4%
Slann        46,9             1,6%
Underworld   45,6             1,6%
Nurgle       45,4             1,6%
Vampire      45,1             1,4%
Chaos        44,8             2,4%
Halfling     36,1             2,5%
Gobbo        33,1             3,2%
Ogre         32,2             2,2%
I don't know how to check the strength of the correlation between quality (i.e. power) and quantity, but if we ignore the tier 3 teams (which are obviously not taken solely for winning), then the remaining teams ought to come up around 4.4% of the time on average. Notably, just 2 of the 11 winning teams are less prevalent than this, while just 1 of the 13 losing teams is overrepresented: the Orcs. This -might- indicate that the popular theory that Orcs attract new players, who in turn drive down their stats, could be true. If it is, then Orcs might have to be treated as better than their stats indicate.
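For anyone who wants to put a number on that quality/quantity link, one rough option is a Pearson correlation between win percentage and share of games. The sketch below uses the 21 non-tier-3 rows from the table above; the helper name `pearson` is just illustrative, and a coefficient on 21 data points says nothing about what causes what.

```python
from math import sqrt

# (win %, share of games %) for the 21 non-tier-3 races, copied from the table above
rows = {
    "Wood Elf": (55.9, 7.2), "Undead": (55.5, 8.1), "Lizardmen": (54.2, 5.9),
    "Dark Elf": (54.1, 6.8), "Amazon": (53.2, 3.8), "CDs": (52.3, 5.5),
    "Norse": (52.1, 6.4), "Elf": (51.8, 2.5), "Dwarfs": (51.5, 7.0),
    "Skaven": (51.3, 6.6), "Necro": (51.3, 4.6), "High Elf": (49.5, 2.0),
    "Khemri": (48.3, 2.0), "Orcs": (47.8, 8.7), "Pact": (47.6, 2.2),
    "Human": (47.1, 4.4), "Slann": (46.9, 1.6), "Underworld": (45.6, 1.6),
    "Nurgle": (45.4, 1.6), "Vampire": (45.1, 1.4), "Chaos": (44.8, 2.4),
}

def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

win = [w for w, _ in rows.values()]
share = [s for _, s in rows.values()]

avg_share = sum(share) / len(share)  # the "around 4.4%" baseline
r = pearson(win, share)
print(f"average share: {avg_share:.1f}%  correlation: {r:.2f}")
```

A value near +1 would mean popular teams are also the winning ones; a value near 0 would mean popularity and power are unrelated.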
--------------------------------------
1) So far, so good. First, I was curious to examine the impact of Swiss pairing. Swiss pairing has been the predominant principle behind tournament schedules for a long time, and I think most (but not all) of the stats are from Swiss matching. The result of Swiss pairing is logically that weaker teams get matched against other weak teams, so their stats will improve as the tournament goes on. And in the same vein powerful teams will get matched against other powerful teams - meaning that some of them will lose or tie.
So, I set up the Excel sheet to let each team face a meta composition reflecting each opposing race's share of the meta, in essence letting each team face every other team rather than a (Swiss-)biased selection. Assuming that the individual race-vs-race win percentages do not change, this is the result:
Race         Swiss   Random (+/-)
Wood Elf     55,9    +0,91
Undead       55,5    +0,87
Lizardmen    54,2    +0,50
Dark Elf     54,1    +0,46
Amazon       53,2    +0,62
CDs          52,3    +0,32
Norse        52,1    +0,08
Elf          51,8    -0,06
Dwarfs       51,5    +0,25
Skaven       51,3    +0,59
Necro        51,3    +0,11
High Elf     49,5    -0,66
Khemri       48,3    -0,67
Orcs         47,8    +0,07
Pact         47,6    -0,59
Human        47,1    -0,49
Slann        46,9    -1,16
Underworld   45,6    -1,70
Nurgle       45,4    -0,76
Vampire      45,1    -1,14
Chaos        44,8    -0,50
Halfling     36,1    -1,82
Gobbo        33,1    -2,06
Ogre         32,2    -1,86
Generally speaking, as expected, Swiss pairing makes the strongest teams look weaker than they are, and the weak teams look stronger than they are.
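The "random pairing" number is just a weighted average: each race-vs-race win percentage is weighted by the opposing race's share of the meta. Here is a minimal sketch of that calculation with three made-up races and made-up numbers (not the NAF data); the function name is only for illustration.

```python
# Toy race-vs-race win percentages (row race vs column race), NOT real NAF data
matchup = {
    "A": {"A": 50.0, "B": 55.0, "C": 65.0},
    "B": {"A": 45.0, "B": 50.0, "C": 60.0},
    "C": {"A": 35.0, "B": 40.0, "C": 50.0},
}
# Share of all games played by each race (sums to 1)
meta = {"A": 0.5, "B": 0.3, "C": 0.2}

def random_pairing_win_pct(race, matchup, meta):
    """Expected win % when opponents are drawn in proportion to their meta share."""
    return sum(meta[opp] * matchup[race][opp] for opp in meta)

for race in matchup:
    print(race, round(random_pairing_win_pct(race, matchup, meta), 2))
```

Note that this keeps mirror matches in, since each race also faces its own share of the meta at 50%.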
--------------------------------------
1a) I then wanted to examine a further bias in the stats: mirror matches. Simply put, mirror matches are data chaff. They pull the results of any team towards 50%, which to me is problematic, because the more powerful (and prolific) a team is, the more mirror matches there will be. Imagine a meta with a team so broken that it beats everything else. With nothing else worth playing, this team will end up playing against itself with an innocent-looking win percentage of 50%. Perfect balance...
I know that the BBRC included mirror matches in their original data, and by extension, one could argue that they chose to make mirror matches part of their definition of balance. To my mind, there was no choice, just necessity, because there was no possible alternative. Back then, there was no way to filter out the mirror matches, so they had to be left in. Anyway, it was easy enough to remove the mirror matches from the data, and the result was this:
Race         Swiss   Random   NoMirr   TOTAL
Wood Elf     55,9    +0,91    +0,52    57,33
Undead       55,5    +0,87    +0,46    56,93
Lizardmen    54,2    +0,50    +0,29    54,99
Dark Elf     54,1    +0,46    +0,33    54,89
Amazon       53,2    +0,62    +0,15    53,97
CDs          52,3    +0,32    +0,15    52,77
Norse        52,1    +0,08    +0,15    52,33
Skaven       51,3    +0,59    +0,13    52,02
Dwarfs       51,5    +0,25    +0,14    51,89
Elf          51,8    -0,06    +0,05    51,79
Necro        51,3    +0,11    +0,07    51,48
---------------------------------------------
High Elf     49,5    -0,66    -0,02    48,82
Orcs         47,8    +0,07    -0,20    47,67
Khemri       48,3    -0,67    -0,05    47,58
Pact         47,6    -0,59    -0,07    46,94
Human        47,1    -0,49    -0,16    46,45
Slann        46,9    -1,16    -0,17    45,67
Nurgle       45,4    -0,76    -0,11    44,55
Chaos        44,8    -0,50    -0,14    44,16
Vampire      45,1    -1,14    -0,08    43,88
Underworld   45,6    -1,70    -0,10    43,80
Halfling     36,1    -1,82    -0,40    33,88
Gobbo        33,1    -2,06    -0,63    30,31
Ogre         32,2    -1,86    -0,43    29,91
As you can see, the impact of mirror matches is fairly negligible for most teams, except for the strongest teams (Wood Elf, Undead, Lizardmen and Dark Elf) and the weakest teams (Halfling, Gobbo, Ogre). The difference is basically determined by how far a team is from a 50% performance and how many games it has played. Orcs (again) stand out a bit due to the sheer number of games they have played.
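To make the "broken team" thought experiment from this section concrete, here is a small sketch of a meta-weighted win percentage with and without mirror matches. The two races and all numbers are invented for illustration, and re-normalising the remaining shares is my assumption about the most natural way to drop the mirror games.

```python
# Toy numbers, NOT real NAF data: "Broken" beats "Other" 80% of the time
# and makes up 90% of all games played.
matchup = {
    "Broken": {"Broken": 50.0, "Other": 80.0},
    "Other":  {"Broken": 20.0, "Other": 50.0},
}
meta = {"Broken": 0.9, "Other": 0.1}

def win_pct(race, matchup, meta, include_mirror=True):
    """Meta-weighted win %, optionally leaving out games against the same race."""
    opponents = [opp for opp in meta if include_mirror or opp != race]
    total_share = sum(meta[opp] for opp in opponents)
    # Re-normalise so the remaining opponents' shares add up to 1 again
    return sum(meta[opp] * matchup[race][opp] for opp in opponents) / total_share

with_mirror = win_pct("Broken", matchup, meta)
without_mirror = win_pct("Broken", matchup, meta, include_mirror=False)
print(round(with_mirror, 1), round(without_mirror, 1))
```

With mirrors included the broken team sits at a harmless-looking 53%; without them it shows its true 80%.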
--------------------------------------
2) So, Swiss Pairing means a difference of almost 1 percentage point for the best teams and 2 for the worst, while mirror matches account for roughly half a percentage point for the best and worst teams. Interesting as that is, I go back to including mirror matches for the next step. It doesn't matter much anyway, because the next step is about moving all teams closer to 50% wins, and if they are, then including or excluding mirror matches makes very little difference.
Now, while I did do all this math because I was just plain curious about swiss and mirror matches, I was also thinking about creating some "tier bonus" tournament rules that would (hopefully) bring unprecedented parity to said tournament - which in turn would make teams equally popular. To simulate this, I started by defining a new meta, assigning the non-tier 3 teams an equal share in the 209.000 games. Tier 3 teams kept their individual number of games, as tier 3 teams are (arguably) not taken for their chance to win.
In such a meta, given unchanged race-vs-race average win percentages, the projected win percentages would be:
Race         Swiss   Random (+/-)   Random-pairing   =Meta
Wood Elf     55,9    +0,91          56,81            58,74
Undead       55,5    +0,87          56,37            58,01
Lizardmen    54,2    +0,50          54,70            56,37
Dark Elf     54,1    +0,46          54,56            56,10
Amazon       53,2    +0,62          53,82            56,07
Norse        52,1    +0,08          52,18            53,91
Skaven       51,3    +0,59          51,89            53,84
CDs          52,3    +0,32          52,62            53,80
Elf          51,8    -0,06          51,74            52,93
Necro        51,3    +0,11          51,41            52,91
Dwarfs       51,5    +0,25          51,75            52,83
High Elf     49,5    -0,66          48,84            50,63
Orcs         47,8    +0,07          47,87            49,00*
Pact         47,6    -0,59          47,01            48,72
Khemri       48,3    -0,67          47,63            48,50
Human        47,1    -0,49          46,45            48,16
Slann        46,9    -1,16          45,74            47,53
Underworld   45,6    -1,70          43,90            46,36
Vampire      45,1    -1,14          43,96            45,96
Nurgle       45,4    -0,76          44,64            45,68
Chaos        44,8    -0,50          44,30            44,45
Halfling     36,1    -1,82          34,28            35,54
Gobbo        33,1    -2,06          30,94            32,07
Ogre         32,2    -1,86          30,34            31,85
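The "=Meta" step can be sketched like this: pool the share held by the non-tier-3 races, split it evenly among them, leave tier 3 shares alone, and recompute the weighted average. The races and numbers below are made up, and `equalize_meta` is a hypothetical helper name, not something from the spreadsheet.

```python
def equalize_meta(meta, tier3):
    """Give every non-tier-3 race the same share; tier 3 races keep theirs."""
    normal = [race for race in meta if race not in tier3]
    pooled = sum(meta[race] for race in normal)  # share to be split evenly
    return {race: (meta[race] if race in tier3 else pooled / len(normal))
            for race in meta}

def win_pct(race, matchup, meta):
    """Expected win % against opponents drawn in proportion to their share."""
    return sum(meta[opp] * matchup[race][opp] for opp in meta)

# Toy data, NOT real NAF numbers
matchup = {
    "A":      {"A": 50, "B": 55, "C": 60, "Stunty": 70},
    "B":      {"A": 45, "B": 50, "C": 55, "Stunty": 65},
    "C":      {"A": 40, "B": 45, "C": 50, "Stunty": 60},
    "Stunty": {"A": 30, "B": 35, "C": 40, "Stunty": 50},
}
meta = {"A": 0.5, "B": 0.3, "C": 0.1, "Stunty": 0.1}

flat = equalize_meta(meta, tier3={"Stunty"})
for race in matchup:
    print(race, round(win_pct(race, matchup, meta), 2),
          round(win_pct(race, matchup, flat), 2))
```

In this toy meta the strongest race gains from the flattening, because it now plays fewer mirror matches and meets the weaker races more often.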
--------------------------------------
3) Next, I used a feature in my spreadsheet to simulate the effect of adding a tier bonus that would increase (or decrease) a team's performance. For example, what would happen if Necro got (say) a +2% increase across the board, or Wood Elf got a -3% decrease? (In that case, Necro vs Wood Elf would be modified by +5%.)
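As a minimal sketch of that rule: each match-up is shifted by the team's own bonus minus the opponent's bonus. The 45/55 starting point for Necro vs Wood Elf below is a made-up number, and `apply_bonuses` is just an illustrative name; I also assume no clamping to the 0-100 range is needed for bonuses this small.

```python
def apply_bonuses(matchup, bonus):
    """Shift each race-vs-race win % by (own bonus - opponent's bonus)."""
    return {
        race: {opp: pct + bonus.get(race, 0.0) - bonus.get(opp, 0.0)
               for opp, pct in row.items()}
        for race, row in matchup.items()
    }

# Toy starting point, NOT the real NAF match-up numbers
matchup = {
    "Necro":    {"Necro": 50.0, "Wood Elf": 45.0},
    "Wood Elf": {"Necro": 55.0, "Wood Elf": 50.0},
}
bonus = {"Necro": +2.0, "Wood Elf": -3.0}

tweaked = apply_bonuses(matchup, bonus)
print(tweaked["Necro"]["Wood Elf"], tweaked["Wood Elf"]["Necro"])
```

Necro vs Wood Elf moves by +5 and the reverse match-up by -5, while mirror matches stay at 50 because a team's bonus cancels against itself.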
Naturally, it will take guesstimation to figure out what kind of actual bonus would result in (say) that +2% increase for Necro. That work is yet to be done, but we do have a few pointers from past tier-tweaking tournaments like the EuroBowl. Furthermore, I know that it most likely isn't possible to create tweaks with such a uniform effect against all those (very different) opposing races. But I've tried to make up for that by trying to get all (non-tier-3) teams quite close to 50% wins - in reality I ended up between 50.02% and 51.63%. That way there is some room for error while still leaving things pretty well balanced. (I put tier 3 close to 40%.)
Note that I didn't just add or subtract numbers in one go. Instead I tweaked gradually, and watched how changes in each race affected the performance of other races. Admittedly, I could have gotten better results (by that I mean even closer to 50%), but partly I wanted some "tiers" rather than 24 individual tweaks. And partly I didn't want to kid myself about the precision of the numbers anyway. As it happened, the teams fell pretty neatly into 8 tiers, containing 2-5 teams each.
Race         Swiss   Random (+/-)   Random-pairing   =Meta    Tweak   Result
Wood Elf     55,9    +0,91          56,81            58,74    -4      51,44
Undead       55,5    +0,87          56,37            58,01    -4      50,71
Lizardmen    54,2    +0,50          54,70            56,37    -2      51,07
Dark Elf     54,1    +0,46          54,56            56,10    -2      50,80
Amazon       53,2    +0,62          53,82            56,07    -2      50,77
Norse        52,1    +0,08          52,18            53,91    0       50,61
Skaven       51,3    +0,59          51,89            53,84    0       50,54
CDs          52,3    +0,32          52,62            53,80    0       50,50
Elf          51,8    -0,06          51,74            52,93    +2      51,63
Necro        51,3    +0,11          51,41            52,91    +2      51,61
Dwarfs       51,5    +0,25          51,75            52,83    +2      51,53
High Elf     49,5    -0,66          48,84            50,63    +4      51,33
Orcs         47,8    +0,07          47,87            49,00*   +4*     49,69*
Pact         47,6    -0,59          47,01            48,72    +6      51,42
Khemri       48,3    -0,67          47,63            48,50    +6      51,20
Human        47,1    -0,49          46,45            48,16    +6      50,86
Slann        46,9    -1,16          45,74            47,53    +6      50,22
Underworld   45,6    -1,70          43,90            46,36    +8      51,06
Vampire      45,1    -1,14          43,96            45,96    +8      50,66
Nurgle       45,4    -0,76          44,64            45,68    +8      50,38
Chaos        44,8    -0,50          44,30            44,45    +8      50,02
Halfling     36,1    -1,82          34,28            35,54    +8      40,24
Gobbo        33,1    -2,06          30,94            32,07    +12     40,76
Ogre         32,2    -1,86          30,34            31,85    +12     40,55
Let me just summarize the 8 tiers here:
Tier 0 (-4): Undead, Wood Elf
Tier 1 (-2): Amazon, Dark Elf, Lizardmen
Tier 2 ( 0): CDs, Norse, Skaven
Tier 3 (+2): Dwarfs, Elf, Necro
Tier 4 (+4): High Elf, Orcs*
Tier 5 (+6): Pact, Human, Slann, Khemri
Tier 6 (+8): Nurgle, Chaos, Vampire, Underworld, Halfling
Tier 7 (+12): Gobbo, Ogre
It should be noted that mathematically, it is the differences between the tiers that matter, so one might equally well have Wood Elf at 0 modification and tier 7 at +16% rather than +12%. As a matter of personal taste, though, I'd rather take a bit away from the top teams than heap even more bonuses on the bottom teams.
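A quick way to convince yourself of this: if every match-up is shifted by own bonus minus opponent's bonus, then adding the same constant to every bonus cancels out. The sketch below checks that with made-up match-up numbers for three races; `apply_bonuses` is an illustrative helper, not part of the spreadsheet.

```python
def apply_bonuses(matchup, bonus):
    """Shift each race-vs-race win % by (own bonus - opponent's bonus)."""
    return {
        race: {opp: pct + bonus[race] - bonus[opp] for opp, pct in row.items()}
        for race, row in matchup.items()
    }

# Toy match-up numbers, NOT real NAF data
matchup = {
    "Wood Elf": {"Wood Elf": 50, "Norse": 56, "Ogre": 72},
    "Norse":    {"Wood Elf": 44, "Norse": 50, "Ogre": 68},
    "Ogre":     {"Wood Elf": 28, "Norse": 32, "Ogre": 50},
}

bonus_a = {"Wood Elf": -4, "Norse": 0, "Ogre": 12}
# Same tiers, shifted up by 4: Wood Elf at 0, Ogre at +16
bonus_b = {race: b + 4 for race, b in bonus_a.items()}

same = apply_bonuses(matchup, bonus_a) == apply_bonuses(matchup, bonus_b)
print(same)  # True: only the differences between tiers matter
```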
----------------------------------
4) Finally, it has been suggested that tiering rules are essentially a pointless endeavour, because they'll just push stats around rather than create any actual balance. In a nutshell, if teams are strong against some teams and weak against others, won't a boost or nerf across the board run the risk of making things less balanced rather than more balanced?
It is true that rules such as these will never get rid of biased individual match-ups. But overall balance can still be improved greatly, even if it won't be perfect. I tried my best to look into what would happen if the bonuses described above were successfully applied.
First, it is worth noticing that if this works, then all teams are very close to equal, assuming random opponents are assigned from a uniform meta. But has this come at the price of increased imbalance to individual teams? I've calculated 3 scores in order to look into this:
First, the sum of how far each race is from the perfect win percentage in each match-up (50% between most teams, but 60% for a normal team against tier 3 and 40% for the tier 3 team against a normal team).
Secondly, the same stat but for the now tier tweaked win percentages. Comparing the 2 stats we see that all teams have come quite a bit closer to the mark.
The third breaks down the change into individual match-ups, showing how many match-ups have had improved balance and how many would become less balanced. On average around 3 match-ups will have improved for each that is now less balanced.
IMBALANCE CALCULATION
Race         NAFimb   TWEimb   +/-
Wood Elf     209,5    84,7     17/5
Dark Elf     150,2    56,0     16/5
Undead       181,3    51,1     20/2
Lizardmen    164,5    89,9     16/5
Necro        113,1    54,7     16/5
Norse        142,2    80,4     16/5
Dwarfs       121,7    100,7    11/10
CDs          111,9    78,9     12/9
Skaven       131,7    69,3     14/7
Orcs         104,5    65,3     14/6
Human        122,7    53,9     16/4
Amazon       190,1    73,1     18/3
High Elf     157,3    91,1     19/3
Underworld   164,2    85,2     15/4
Nurgle       157,2    105,0    13/6
Khemri       133,8    91,2     13/7
Elf          134,4    78,2     16/5
Chaos        148,3    81,9     13/6
Gobbo        213,4    83,4     18/4
Halfling     176,5    102,5    14/5
Ogre         230,8    118,6    18/3
Pact         113,3    68,7     13/6
Slann        163,0    84,8     16/4
Vampire      154,8    78,0     13/6
AVG          153,8    80,3     15,3/5,2
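For the curious, the three scores can be sketched as follows. The races and win percentages are invented, mirror matches are left out, and treating tier 3 vs tier 3 as a 50% target is my assumption here since the description above only covers the normal-vs-tier-3 case.

```python
def target(race, opp, tier3):
    """Ideal win %: 50 within the same group, 60/40 between normal and tier 3."""
    if (race in tier3) == (opp in tier3):
        return 50.0
    return 40.0 if race in tier3 else 60.0

def imbalance(race, matchup, tier3):
    """Sum of absolute distances from the target over all non-mirror match-ups."""
    return sum(abs(pct - target(race, opp, tier3))
               for opp, pct in matchup[race].items() if opp != race)

def improved_worsened(race, before, after, tier3):
    """Count match-ups that moved closer to / further from their target."""
    better = worse = 0
    for opp in before[race]:
        if opp == race:
            continue
        goal = target(race, opp, tier3)
        old = abs(before[race][opp] - goal)
        new = abs(after[race][opp] - goal)
        better += new < old
        worse += new > old
    return better, worse

# Toy data, NOT real NAF numbers: bonuses of Top -4, Mid 0, Stunty +8 applied
tier3 = {"Stunty"}
before = {"Top": {"Mid": 58, "Stunty": 75},
          "Mid": {"Top": 42, "Stunty": 62},
          "Stunty": {"Top": 25, "Mid": 38}}
after = {"Top": {"Mid": 54, "Stunty": 63},
         "Mid": {"Top": 46, "Stunty": 54},
         "Stunty": {"Top": 37, "Mid": 46}}

for race in before:
    print(race, imbalance(race, before, tier3), imbalance(race, after, tier3),
          improved_worsened(race, before, after, tier3))
```

In this toy case "Mid" shows the worry in miniature: one match-up improves, one gets worse, and its total imbalance does not move, while the other two races end up clearly closer to their targets.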
Cheers
Martin