Hi,
Thanks dode. I will have to think a bit harder about how to stratify the FUMBBL data by TV (I am currently looking at ranked data), as low numbers will mess with the standardisation. I was thinking of the following TV bands: 900-1200, 1200-1500, 1500-1800, 1800-2100, 2100-2400. As you say, above 2000 the numbers start dropping off rapidly, but with TV bands this wide, small numbers shouldn't be an issue as long as I exclude Underworld, Goblin and Halfling from the analysis.
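To make the banding concrete, here is a rough sketch of how a TV could be assigned to one of those bands. The band edges are the ones listed above; treating each band as half-open (so a TV of exactly 1200 falls in 1200-1500, not 900-1200) is my own assumption, as is the function name `tv_band`.

```python
from bisect import bisect_right

# Band edges taken from the list above. Treating each band as the
# half-open range [low, high) is an assumption made for this sketch.
BAND_EDGES = [900, 1200, 1500, 1800, 2100, 2400]


def tv_band(tv):
    """Return the band label for a TV, or None if it lies outside 900-2400."""
    if tv < BAND_EDGES[0] or tv >= BAND_EDGES[-1]:
        return None
    # Index of the last edge that is <= tv.
    i = bisect_right(BAND_EDGES, tv) - 1
    return f"{BAND_EDGES[i]}-{BAND_EDGES[i + 1]}"
```

Games outside the 900-2400 range come back as None, so they can be dropped before standardising.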
I did some work on standardising the NAF Win% data (a draw counting as half a win) by TR (TR100, TR110, TR120), as it is presented broken down at
http://naf.talkfantasyfootball.org/. The concern was that differing TR values may confound comparisons between races if races have different ratios of games played at TR100:TR110:TR120. Standardising by TR lets you compare races even when the relative numbers of games played at each TR level differ. Please see the table below. As you can see, taking TR out of the equation did not appreciably change Win%: the largest shift is 0.3 percentage points (Ogres). This suggests that most races have very similar ratios of games played at TR100:TR110:TR120 in the NAF, so TR shouldn't be much of a confounding factor when comparing races in NAF data. Of course, we can only standardise by what is in the data; as dode mentioned, skill pack upgrades could also be confounding, but I can't standardise for these as they are not recorded in the data.
RACE, WIN%, STANDARDISED, DIFFERENCE
UNDEAD, 56.3, 56.3, 0.0
WOOD ELF, 55.5, 55.4, -0.1
LIZARDMEN, 53.8, 53.8, 0.0
AMAZON, 53.6, 53.6, 0.0
DARK ELVES, 53.1, 53.0, -0.1
CHAOS DWARVES, 52.5, 52.5, 0.0
DWARVES, 52.1, 52.1, 0.0
NORSE, 51.9, 51.8, -0.1
SKAVEN, 51.3, 51.4, 0.1
NECROMANTIC, 51.1, 51.2, 0.1
ELVES, 50.5, 50.6, 0.1
ORC, 48.1, 48.1, 0.0
HIGH ELVES, 48.1, 48.0, -0.1
KHEMRI, 48.0, 48.1, 0.1
CHAOS PACT, 47.6, 47.6, 0.0
SLANN, 46.4, 46.2, -0.2
HUMAN, 46.0, 46.0, 0.0
UNDERWORLD, 45.1, 45.1, 0.0
NURGLE, 44.5, 44.4, -0.1
CHAOS, 43.7, 43.7, 0.0
VAMPIRES, 43.3, 43.4, 0.1
HALFLINGS, 34.5, 34.3, -0.2
GOBLINS, 32.4, 32.4, 0.0
OGRES, 31.7, 31.4, -0.3
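For anyone who wants to see the idea behind the STANDARDISED column, here is a minimal sketch of direct standardisation by TR. It weights a race's Win% at each TR level by the share of all games (every race combined) played at that level. The function names and the numbers in the usage note are made up for illustration; they are not the NAF figures, and this is not necessarily the exact calculation used for the table.

```python
def win_pct(wins, draws, games):
    """Win% with a draw counted as half a win."""
    return 100.0 * (wins + draws / 2) / games


def standardised_win_pct(race_strata, standard_games):
    """Directly standardised Win% for one race.

    race_strata:    {tr_level: (wins, draws, games)} for the race
    standard_games: {tr_level: games played by ALL races at that level}
    Both structures are hypothetical layouts chosen for this sketch.
    """
    total = sum(standard_games.values())
    return sum(
        win_pct(*race_strata[tr]) * standard_games[tr] / total
        for tr in standard_games
    )


def crude_win_pct(race_strata):
    """Unstandardised Win% pooled over all TR levels."""
    wins = sum(w for w, _, _ in race_strata.values())
    draws = sum(d for _, d, _ in race_strata.values())
    games = sum(g for _, _, g in race_strata.values())
    return win_pct(wins, draws, games)
```

With an invented race that has (wins, draws, games) of (50, 20, 100) at TR100, (30, 10, 50) at TR110 and (5, 0, 10) at TR120, and a standard population of 1000, 1000 and 2000 games, the crude figure is 62.5 and the standardised figure is 57.5. A gap like that only appears when a race's TR mix differs a lot from the overall mix, which is not what the table above shows.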
With this in mind, I produced a 'dode graph' of the standardised data. Please see below:
TR.png
For details on standardisation, please see page 9 of the following PDF:
http://www.apho.org.uk/resource/view.aspx?RID=48457
I am also working on something similar, standardising the NAF data across LRB4/5/6.
Please let me know your thoughts.
EDIT - forgot to say, the lines on the graph represent the 95% confidence interval for each race's win% (calculated using the method on page 9 of the PDF linked above).
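As a rough illustration of what a 95% interval on a win% looks like, here is a sketch using the Wilson score interval for a proportion. This is a common choice that I am swapping in for illustration; I am not reproducing the formula from the PDF here, and the two may differ. Treating wins + draws/2 as the number of "successes" is also a simplification, since draws make the outcome not strictly binomial.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%).

    `successes` may be fractional, e.g. wins + draws / 2, which is an
    approximation made for this sketch rather than a strict binomial count.
    Returns (lower, upper) as proportions between 0 and 1.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width
```

For example, `wilson_interval(50, 100)` gives an interval of roughly 0.40 to 0.60, and the interval narrows as the number of games goes up, which is why the low-volume races have the longest lines on the graph.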
Best Wishes,
AndeeT