Re: Examining the NAF meta
Posted: Thu Nov 30, 2017 1:19 pm
This thread very quickly spiralled into a good example of other elements of the NAF meta, ironically!
https://www.talkfantasyfootball.org/viewtopic.php?f=81&t=44538
dode74 wrote: 3. I think your argument about mirror matches is flawed. While it's probably easier to do today, "back then" wasn't exactly the stone ages, and removal of mirror matches absolutely could be done by choice. I also think your choice to normalise (step 2, if that's what you're doing) and *then* remove the mirrors will create a multiplicative error: WE play WE a lot (they are their own 2nd most played opponent) so you start by scaling down the factor then removing it, rather than removing it then normalising the other factors.
I disagree with you here. But I think it may be a miscommunication... I think what Plasmoid has done is this:
nightwing wrote: You need to take a deep breath. You're coming off as someone who has an axe to grind here. You didn't agree with his post? Fine. Drop the pathetic insults and make a reasoned counter argument about WHY the methodology is flawed, rather than spiteful bitching.
I always have an axe to grind with people who try to pass off their teenage bra-fumbling as statistics. As for needing a counter argument about the methodology... he flat out states that he knows inferential statistics needs distributions, not descriptives... but then uses descriptives exclusively. Nobody needs to argue beyond that point - it's done wrong from step 1, so steps 2 through infinity are inherently wrong. There's really no methodology beyond that - no work is shown, just vaguely referenced, and done knowingly wrong.
Rolex wrote: Facts never convinced anyone.
Facts convince the people who are worth convincing, for everyone else you just need to distract them with something shiny.
Rolex wrote: While advocating for science he ignores the sciences of the mind.... how ironic.
What you're really talking about is peripheral route persuasion, which is the domain of propaganda and marketing. Intelligent people don't need to be wooed into caring what the truth is, and everyone else didn't really care to begin, so they're easy to reroute.
sann0638 wrote: (and yes, I appreciate the irony of me asking. We're working on a way of making the raw data quickly available directly from the NAF site)
Hopefully more data than just roster and score, too. I'm not sure what useful information can really be gleaned from those fields alone.
mubo wrote: Which seems pretty reasonable to me.
Of course it does - because you don't know what you're doing either. Same way the vaccination/autism link seems totally reasonable to people who don't understand (or bother to read) the science. If, on any given topic, you don't really know what you're talking about, then you're not in a position to judge whether or not someone else does.
mubo wrote: I think you absolutely have to remove mirror matches to get meaningful WRs, they will all necessarily be 50%, and will thus drag all results towards 50%.
Given that the tier definitions set out by the BBRC - which is what the numbers were (improperly) being compared to - are based on the inclusion of mirror matches, the exclusion of them is either disingenuous or dishonest. Yes, mirror matches will push the value toward 50%, but that was true when the tiers were created, too. There are 24 rosters, and unless you're allowing compositional imbalances to heavily influence your numbers, that shouldn't have so much of an effect as to warrant diverging from the original choice by the BBRC to include them.
mubo wrote: I disagree with you here. But I think it may be a miscommunication... I think what Plasmoid has done is this:
1. Compute win ratios (column A)
2. Remove swiss pairings, and do step 1 again (column B, expressed as a difference relative to A)
3. Remove mirror matches, and do step 1 again (column C, expressed as a difference relative to A (or maybe B, not clear actually))
Which seems pretty reasonable to me.
Given he doesn't have match-level data (to the best of my knowledge it's not at match level on his stated source) I don't think that's the case. Again, I'd like to see the spreadsheet.
Quote: I think you absolutely have to remove mirror matches to get meaningful WRs, they will all necessarily be 50%, and will thus drag all results towards 50%.
As mentioned above, if this is to have any meaning when comparing with BBRC tiering then mirror matches need to be included. Fag-packet maths suggests the difference is ~0.2% for a team which is running at 55% without mirror-matches.
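For anyone wanting to check the ~0.2% figure, here is a minimal sketch. It assumes mirror matches average exactly 50% and that a team meets itself in 1 game out of 24 (24 rosters, all equally popular). That uniform share is an assumption made here, not something stated in the thread, and `overall_win_rate` is just an illustrative name.

```python
def overall_win_rate(non_mirror_rate: float, mirror_share: float) -> float:
    """Blend a non-mirror win rate with mirror matches, which average 50%."""
    return (1.0 - mirror_share) * non_mirror_rate + mirror_share * 0.5


# Assumption: 24 equally popular rosters, so 1 game in 24 is a mirror match.
mirror_share = 1 / 24
with_mirrors = overall_win_rate(0.55, mirror_share)

# How far the mirrors pull a 55% team towards 50%, in percentage points.
shift_points = (0.55 - with_mirrors) * 100
print(f"with mirrors: {with_mirrors:.4f}, shift: {shift_points:.2f} points")
```

A roster that plays itself more often than 1 game in 24, as is claimed for Wood Elves earlier in the thread, would see a larger shift than this.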
VoodooMike wrote:Unless, of course, the NAF is looking to break from the BBRC's definitions now?
dode74 wrote: As mentioned above, if this is to have any meaning when comparing with BBRC tiering then mirror matches need to be included. Fag-packet maths suggests the difference is ~0.2% for a team which is running at 55% without mirror-matches.
I don't think Plasmoid mentioned the BBRC figures in his original post. He's looking at how much races/tiers need to be modified to push them all towards a 50% win ratio to get closer to a level playing field in tournaments. The BBRC tiering criteria were for bringing races into suitable performance margins under league play, it's not really relevant here.
Quote: Unless, of course, the NAF is looking to break from the BBRC's definitions now?
I'm not suggesting that the NAF should.
Quote: Given that the tier definitions set out by the BBRC - which is what the numbers were (improperly) being compared to - are based on the inclusion of mirror matches, the exclusion of them is either disingenuous or dishonest. Yes, mirror matches will push the value toward 50%, but that was true when the tiers were created, too. There are 24 rosters, and unless you're allowing compositional imbalances heavily influence your numbers, that shouldn't have so much of an effect as to warrant diverging from the original choice by the BBRC to include them.
I'll be taking this in little bits. So:
Quote: Given that the tier definitions set out by the BBRC - which is what the numbers were (improperly) being compared to
No. I'm not doing anything with the stats that has to do with the BBRC tiers. This is not NTBB. I'm not talking about which ranges each tier covers or ought to cover. I'm not doing anything in relation to the magic number 45 or 55. I'm trying to work out what would move all teams closer to 50% - not as a number generated by the BBRC, but as a number indicating that you win about as much as you lose. And I'm doing it with the perspective of a local tournament. Yes, I'm using the word Tier. And I'm using it in exactly the same way as a lot of other tiered-bonus tournaments. Like the EuroBowl, referenced by Rolo on page one, which refers to Tier 2 teams as: Chaos, Chaos Pact, Human, Khemri, Slann, Necromantic, High Elf, Elf, Nurgle - i.e. not a claim that these teams are significantly below 45% wins.
Quote: are based on the inclusion of mirror matches, the exclusion of them is either disingenuous or dishonest.
I'd have thought that explicitly stating that I'm excluding mirror matches, and explicitly stating why would mean that no-one would accuse me of being secretly manipulative. And I'm not comparing my numbers to BBRC numbers. I'm arguing that this would be a better (non-BBRC) measure of a balanced performance.
Quote: Yes, mirror matches will push the value toward 50%, but that was true when the tiers were created, too.
In the past you have said yourself that including mirror matches in performance/balance would be crazy, and that if the BBRC did so knowingly then they were stupid (or some other derogatory term, I don't remember). Do you not still think so? Again, I'm not saying that the NAF or anyone should break with the BBRC. I'm saying that if I want to host a tournament with teams on more equal footing, as many tournaments try to, using gut feeling, then results from mirror matches will not tell me anything.
Quote: I ran a random forest initialisation point, based on 5 clusters, and the best fit was the following (with centroids in brackets):
T1 (55.26) : Wood Elves, Undead, Dark Elves, Lizardmen
T2 (51.87) : Amazons, Elves, Norse, Dwarves, Chaos Dwarves, Necromantic, Skaven
T3 (48.93) : High Elves, Slann, Orc, Humans
T4 (45.54) : Khemri, Chaos Pact, Nurgle, Vampires, Underworld, Chaos
T5 (33.91) : Halflings, Goblins, Ogres
Notes:
All teams have the centroid within their CI95.
The T1 4 teams came out in every initialisation point I did using 5 groups, so I'm very confident they are in the top tier.
High Elves have the T2 centroid within their CI95, but the T3 centroid is closer so are a T3 team. The T3 tier is also quite small at only 4 teams.
Chaos Pact and Khemri often moved between T3 and T4, but generally always ended in T4.
He tried 4 tiers (as per the norm) but 5 seemed to work better. (I don't know if he carried out any removal of mirror matches or normalisation for composition, but I don't think he did.)
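To make the High Elves note concrete, here is a minimal sketch of the assignment rule it describes: a team goes to whichever centroid is closest, even when another centroid also sits inside its CI95. The centroids are the ones quoted above. The win percentage and CI half-width in the example are hypothetical placeholders, not the real High Elves figures, and `nearest_tier` and `centroid_in_ci` are illustrative names rather than anything from the original analysis.

```python
# Cluster centroids as quoted in the post (win %).
CENTROIDS = {"T1": 55.26, "T2": 51.87, "T3": 48.93, "T4": 45.54, "T5": 33.91}


def nearest_tier(win_pct: float) -> str:
    """Return the tier whose centroid is closest to the given win percentage."""
    return min(CENTROIDS, key=lambda tier: abs(CENTROIDS[tier] - win_pct))


def centroid_in_ci(win_pct: float, half_width: float, tier: str) -> bool:
    """Check whether a tier's centroid falls inside win_pct +/- half_width."""
    return abs(CENTROIDS[tier] - win_pct) <= half_width


# Hypothetical team: 50.0% wins with a CI95 of +/- 2.0 points.
team_pct, team_half_width = 50.0, 2.0
print(nearest_tier(team_pct))
print(centroid_in_ci(team_pct, team_half_width, "T2"))
```

With these placeholder numbers the T2 centroid is inside the interval, but the T3 centroid is closer, so the team lands in T3. This only covers the final assignment step, not the clustering run that produced the centroids.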
Quote: Tier 1: Chaos Dwarf, Dwarf, Wood Elves, Skaven, Norse, Lizardmen, Orc, Undead, Amazon, Dark Elves
Tier 2: Chaos, Human, Khemri, Pact, Slann, High Elves, Nurgle, Necromantic, Pro Elves, Vampires, Underworld
Tier 3: Halflings, Goblins, Ogres (collectively known as the “stunty” teams, who often get awarded a special “stunty cup” for the highest placed in a tournament)
You can see TourneyT3 equates to ClusterT5. TourneyT1 largely equates to ClusterT1&2, and TourneyT2 largely equates to ClusterT3&4. With the exceptions of Orcs, Pro Elves and Necro: Orcs are ClusterT3 and TourneyT1, while Pro Elves and Necro are ClusterT2 and TourneyT2.
Quote: Perhaps you'd care to share the spreadsheet itself if it is even vaguely transparent?
Sure thing. And as this was my first venture into Excel, ever, I think it is set up fairly transparently, so as not to confuse myself any further. Let me know if you want to see it (or if my replies below make that irrelevant). Just PM me if you want it.
Quote: 4. What "feature"? I know excel pretty well and what you describe doesn't look like a simple button press to me.
No no - it was a feature I built into my spreadsheet, allowing the performance numbers for each race individually to be modified by a specific number.
Quote: 1. First off, you *do* know how to provide the ranges for 95CI for each of the win%s by calculating a margin of error of a proportion.
I do. I didn't because I don't really use that first column for very much.
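For readers who haven't met it, the margin of error of a proportion mentioned in point 1 can be sketched as below. This uses the usual normal approximation with z = 1.96 for 95%. The 55% win rate and the 1000 games are hypothetical inputs, not figures from the spreadsheet, and `proportion_moe95` is an illustrative name.

```python
import math


def proportion_moe95(win_rate: float, games: int) -> float:
    """95% margin of error for a proportion (normal approximation)."""
    z = 1.96  # two-sided 95% critical value of the standard normal
    return z * math.sqrt(win_rate * (1.0 - win_rate) / games)


# Hypothetical example: a race winning 55% over 1000 games.
rate, games = 0.55, 1000
moe = proportion_moe95(rate, games)
low, high = rate - moe, rate + moe
print(f"{rate:.1%} +/- {moe:.1%} -> [{low:.1%}, {high:.1%}]")
```

The interval narrows with the square root of the number of games, so rarely played races get much wider ranges. Treating a win ratio that counts draws as half a win as a simple proportion is itself an approximation.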
Quote: if "people play teams which win more" is the hypothesis, then the result they see, the mean, is the relevant statistic. You could do that fairly simply in Excel.
That would be my hypothesis, yes. But not one that I set out to investigate. I appreciate that you tell me how to do it, but it would take a lot longer for me to figure out and then do, than I'm willing to invest. I'm sure someone could do the math very quickly if they wanted to, and the result would no doubt be very interesting.
Quote: 2. What does "I set up the excel sheet to simply reflect the total number of opposing teams (and the average win-percentage against each race) - rather than the games that have actually been played" actually mean? Is it normalisation for composition (because that's what it sounds like you are trying to say)? How did you do it, exactly?
What I'm trying to say is that I have the numbers for the games that were actually played. And that this is not random due to Swiss.
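One possible reading of the distinction in point 2, sketched with made-up numbers: a win rate averaged over the games actually played (where Swiss pairing decides who you meet) versus a win rate weighted by how common each opponent is in the field. The races "A", "B", "C" and every figure below are hypothetical, and this is not claimed to be what the spreadsheet does.

```python
def as_played(win_vs: dict, games: dict) -> float:
    """Win rate weighted by the games actually played against each opponent."""
    total = sum(games.values())
    return sum(win_vs[opp] * games[opp] for opp in games) / total


def field_weighted(win_vs: dict, field_share: dict) -> float:
    """Win rate weighted by each opponent's share of the tournament field."""
    return sum(win_vs[opp] * field_share[opp] for opp in field_share)


# Hypothetical figures for one team against three opposing races.
win_vs = {"A": 0.60, "B": 0.40, "C": 0.50}       # win rate against each race
games = {"A": 10, "B": 30, "C": 10}              # games actually played
field_share = {"A": 0.50, "B": 0.25, "C": 0.25}  # share of opposing teams

print(f"as played:      {as_played(win_vs, games):.3f}")
print(f"field weighted: {field_weighted(win_vs, field_share):.3f}")
```

In this toy case the team met its worst matchup far more often than the field composition alone would suggest, so the two averages diverge. That is the kind of gap non-random pairing can open up.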
Quote: 3. I think your argument about mirror matches is flawed. While it's probably easier to do today, "back then" wasn't exactly the stone ages, and removal of mirror matches absolutely could be done by choice.
I didn't mean to make this about the BBRC anyway. But I've seen the data collected back when the BBRC decided on the tiers, and it didn't include information regarding opponents. It was just a string of WDL for a ton of teams. So we didn't have the data. And we couldn't realistically have gotten it.
Quote: 5. This looks like a rerun of step 2 to me. You normalised for composition there, and you seem to be doing the same again here.
Not quite.
Quote: Teams aren't cheap, either in cost of minis or time to paint them, so a coach is likely to be limited in his choices anyway, and, if he is a regular league player, he may be more influenced by league performance than tournament performance for example.
While this cannot be quantified, so surely resides in hell with all other non-math arguments, this is indeed one of the points in the theory that Orcs' numbers get dragged down. Because Orcs are in all the boxed sets, so "you" or "someone you know" most likely has an orc team, and most people have tried them out as part of learning the game, so most people feel like they know them a little. Another argument (while anecdotal) is just how many times tournament organizers have to help the new guy build an orc roster for the tournament. It happens. A lot. IMO,