Reconsidering the ruling on Khorne?

News and announcements from the worldwide Blood Bowl players' association

Moderator: TFF Mods

Wulfyn
Emerging Star
Posts: 323
Joined: Mon May 19, 2014 9:33 pm

Re: Reconsidering the ruling on Khorne?

Post by Wulfyn »

Digger Goreman wrote:
Wulfyn wrote:There is also a precedent in Skaven and Slann.... Neither of those teams are unbalanced, and neither feel like elves.
BULLSPIT! Slann are nothing but "green elves and spam", while four furry elves on the field are four furry elves on the field.... Elves in degrees....

No one is a paragon of virtue... not me, Plasmoid, the Khorne-Bretts, sNAFu, you, the past mortem BBRC (which people hold up as shining examples just like partisans in my country fictionalize "the founding fathers"), even Galak (whom I believe most capable and tries to be fair minded) has a point of reference.... On the rare occasions I used the phrase, "best for the game", he would become flustered; a triad on the BBRC kept elves sacrosanct; Jervis Johnson toppled the apple cart because he thought his madcap attitude ruled; Plas continues to tinker (his spurious "statistics" (to which I unwittingly contributed) were used to ream the Necro team); people even invoke TT, Cyanide, GW, FUMMBL, NAF, and Nuffle depending on what they selfishly want, not realizing that "I know best!" is in everyone else's mind (even those who think all of this is a non issue) and certainly in GW's as they have divorced themselves from all but the money making aspect of a failed I.P. (stock holders definitely "know best")....

It's all just house rules at this point....
So you seem to be defining anything AG4 as elves, and the rest I don't see the point of because we're just having a discussion around the balance of the human team. I put forward a proposition and instead of saying why, specifically, you think AG4 human catchers won't work you seem to have launched into a rant regarding... well I'm honestly not sure what you are ranting against but I'm pretty sure I have not invoked any of the things you are saying.

Are you trying to shut down any discussion?

VoodooMike
Emerging Star
Posts: 434
Joined: Thu Oct 07, 2010 8:03 am

Re: Reconsidering the ruling on Khorne?

Post by VoodooMike »

Here's an advertisement for a universal truth:

Image

Now, other than foolishly questioning the legitimacy of the text, what are the first things someone might ask...

Q: That's VoodooMike? He looks a lot like Beyonce.
A: I never claimed it was.

Q: Did Beyonce say that about VoodooMike?
A: I never claimed she did.

Q: Does Beyonce endorse this message in some way?
A: I never claimed she does.

It ultimately begs the question: if I'm not claiming that Beyonce is directly and validly relevant to the statement that I'm sexy, why the hell is she in the picture at all? Regardless of what I may claim or not claim, I'm pretty obviously implying one or more of the things mentioned above. If I wasn't, she wouldn't be present.

The same is obviously true of statistics, or numbers in general. People see numbers and they think "oh! there's math involved and math is like... precise and smart and stuff, so this is more than hand waving and make believe!"... which is where we get the classic saying about "there are three types of lies: lies, damned lies, and statistics". Numbers, used inappropriately but liberally, are among the most egregious forms of dishonesty available.

The use of appropriate and consistent metrics (which is what was mentioned above) is paramount in the use of any sort of measurement to justify engaging in change (or not) and in determining if a change is successful. NTBB says it's about narrowing the blood bowl tiers... then proceeds to say it uses a different method for measuring a team's win percentage, and a different definition for tiers.....

....now imagine we set out to improve standardized test scores in middle school. We notice that the average on these tests is 65%, and that the previously stated goal was to have the average be 55%. So we declare that the testing system is failing to challenge students and must be fixed! To this end, we decide that we will no longer base test scores on "number of answers correct out of total number of questions" and instead we'll base it on "number of answers correct MINUS number of incorrect answers, out of total questions"... and we will no longer compare the scores against all other students, only students of the same gender. Then we change the tests themselves.

Did we fix the testing system? Who the Duck knows! In fact, we can't tell if any changes we make to the test have any effect at all because we're not even measuring the same thing anymore... did we accomplish what we set out to do? Well hell, we don't have a clear, metric-driven goal in the first place so anything we do can be considered a success or failure depending on how our self-esteem is doing that day, right?

Digger Goreman
Legend
Posts: 5000
Joined: Sun Jun 25, 2006 3:30 am
Location: Atlanta, GA., USA: Recruiting the Walking Dead for the Blood Bowl Zombie Nation

Re: Reconsidering the ruling on Khorne?

Post by Digger Goreman »

For you Wulfyn: viewtopic.php?f=20&t=41027

Hope you enjoy the read.... Lets you know, after 39 years of hard core tabletop gaming, the considered position I adopt in relation to the problems and non-problems of Blood Bowl....

LRB6/Icepelt Edition: Ah!, when Blood Bowl made sense....
"1 in 36, my Nuffled arse!"
spubbbba
Legend
Posts: 2267
Joined: Fri Feb 01, 2008 12:42 pm
Location: York

Re: Reconsidering the ruling on Khorne?

Post by spubbbba »

VoodooMike wrote: It ultimately begs the question: if I'm not claiming that Beyonce is directly and validly relevant to the statement that I'm sexy, why the hell is she in the picture at all?
I don't think you ever need to give justification for including a picture of Beyonce. :D

My past and current modelling projects showcased on Facebook, Instagram and Twitter.
Wulfyn
Emerging Star
Posts: 323
Joined: Mon May 19, 2014 9:33 pm

Re: Reconsidering the ruling on Khorne?

Post by Wulfyn »

Digger Goreman wrote:For you Wulfyn: viewtopic.php?f=20&t=41027

Hope you enjoy the read.... Lets you know, after 39 years of hard core tabletop gaming, the considered position I adopt in relation to the problems and non-problems of Blood Bowl....
It was a very good read thanks. I was going to reply in more detail, but it was your point about undead that got me thinking. I agree with you that they are still too good. So after everything else you have said regarding team design, what GW did and what the BBRC did, after all that you are suggesting a change to one of the teams.

Well isn't that what we are also doing? Aren't we sharing a common vision of wanting to improve the game? We might disagree as to how to go about this, but as long as we are respectful and civil aren't we allies in purpose?

legowarrior
Emerging Star
Posts: 355
Joined: Wed Sep 15, 2010 4:14 pm

Re: Reconsidering the ruling on Khorne?

Post by legowarrior »

Well, until Blood Bowl 2 comes out, and Cyanide decides to start patching the game and rolling out buffs and nerfs, like any other competitive video game in the world. And they'll have reams of data eventually, like Fumbbl, but unlike Fumbbl, they'll feel obliged to do something with it. Tweaking teams, updating skills so that some are worth taking (and making a much more interesting game, as Block doesn't become the go-to skill for every piece on the board). Maybe even making passing just a little easier (as Jervis once wanted in his interview with 3 Die Block).

The question is, will they want to keep the teams at different win rates and bands (and so keep the tiers) or will they argue that Blood Bowl is a competitive game, and that every team should have a fair shot at winning?

If they want to sell teams to non-hardcore Blood Bowl players, I can see them making all the teams competitive, including Goblins and Halflings. You won't convince many people to pick up the Halflings as a DLC if the Halflings are going to be steamrolled in every game by a new player. So, if you have somebody new to Blood Bowl, and they are purchasing the game on Steam, you'll have to entice them with shinies when they consider buying the DLCs.

Looking at the list, almost all the starting teams are pretty competitive: no joke teams or underperformers, outside of the Humans, and they are receiving a boost. So, it's not like they have to make the adjustments anytime soon. The first few DLCs (Lizardmen and Wood Elves, since they are pre-order bonuses) will probably also be rather competitive. You won't see the joke teams till the end of the run, so maybe we'll be safe for a while.

Well, here's to the future!

Darkson
Da Spammer
Posts: 24047
Joined: Mon Aug 12, 2002 9:04 pm
Location: The frozen ruins of Felstad

Re: Reconsidering the ruling on Khorne?

Post by Darkson »

legowarrior wrote:and Cyanide decides to start patching the game and rolling out buffs and nerfs,
Or as is more likely, start screwing around without any idea what they're doing.

Currently an ex-Blood Bowl coach, most likely to be found dying to Armoured Skeletons in the frozen ruins of Felstad, or bleeding into the arena sands of Rome or burning rubber for Mars' entertainment.
plasmoid
Legend
Posts: 5334
Joined: Sun May 05, 2002 8:55 am
Location: Copenhagen

Re: Reconsidering the ruling on Khorne?

Post by plasmoid »

Just in case you all missed it:
AV+ for human catchers was on Galak's original 2011 list, as were 130K ogres.
So seeing them on the human roster is not an indication that Cyanide are going with NTBB. I'd be surprised if they did.
Cheers
Martin

Narrow Tier BB? http://www.plasmoids.dk/bbowl/NTBB.htm
Or just visit http://www.plasmoids.dk instead
plasmoid
Legend
Posts: 5334
Joined: Sun May 05, 2002 8:55 am
Location: Copenhagen

Re: Reconsidering the ruling on Khorne?

Post by plasmoid »

Hi Dode,
None taken. You either see the problem or you don't, and I think you're honest enough to look at it although perhaps so close to the problem as to not be able to see it.
I probably am. I'm just pointing out that there is also a risk of being overly paranoid/pedantic.
Then clarify your intent. Is it testing? If so then what for? How are you testing? What are your criteria?
You're straying here, and I'm not looking for a 30 pager.
My point was merely that here was an example of one internet guy's change now being changed by another internet guy.
But if you want to know - and we'll return to that point - I'm testing in the sense of playing and getting player feedback. As was done from LRB1 through PBBL12. Which BTW is the only kind that the BBRC members could have condoned back then, because this was before the second coming of Mike.
Thanks. However, you're straying again.
You said that some of the stats I presented were inferential.
I asked how that is possible, since I did all of the stats in exactly the same way.
You have made changes. We know this because the list is not the same as Galak's list upon which it is based.
True. I meant I'm not anymore though :)
But as stated changes were based on feedback from the players/testers. I've never claimed that these changes were based on game data. No way I could have done inferential statistics on such a minute sample.
The other portion was in 2013, where I made several roster-only changes because I was (rather aggressively) prompted to use the existing CRP data (NAF and Box) in my identification of problem teams.
Just to be clear, there have been very few changes to CRP+. But we've been over those.
Is that the alternative? Seems to me Koadah might have a fair volume of data for you.
Not pure CRP+, and certainly nowhere near enough for inferential stats. Nor do we have that for CRP for comparison.
Maybe it is only 20 years. In that case I still prefer the alternative.
Besides, you've made a rod for your own back with respect to the volume of data required. By splitting things down into TV bandings you've massively increased the amount of data required.
True. That would be my integrity.
Even Mike has said that stats for lifetime performance are a poor measure of balance. So is including mirror matches.
So I'll stick with that.
You're assuming they are actually buffs, for starters.

Not a dead end I'll be going down. Since I can't collect the data anyway, nor do I pretend to.
But I will say, that if we accept that there may be a huge difference between how good a team really is and how well it is played, then we can know very little about team balance - even for straight CRP. While I agree that initially and locally coaches could get it completely wrong, I believe that given time and communication, the majority will eventually get it right.
And yet you pretend to be using data to justify the changes in the first place!
Maybe this misunderstanding is the root of your beef with NTBB?
1. I use CRP data (even though there is not enough to do inferential statistics) as my most accurate available way of identifying the problem teams.
2. I don't use match data to pick the specific changes to those problem teams. How could I?
3. I don't use match data to make changes between editions of NTBB. How could I?
4. The changes I have made between editions to the rosters have been based on player feedback - the exception being in 2013 when I started doing (1.)

Cheers
Martin

[Edit - pseudo crash again, meaning the full text was posted about 30 minutes after a half post appeared]

dode74
Ex-Cyanide/Focus toadie
Posts: 2565
Joined: Fri Jul 24, 2009 4:55 pm
Location: Near Reading, UK

Re: Reconsidering the ruling on Khorne?

Post by dode74 »

legowarrior wrote:And they'll have reams of data eventually, like Fumbbl, but unlike Fumbbl, they'll feel obliged to do something with it.
You're assuming they have the first clue as to how to analyse the data, and the vaguest idea of what to do with the game once they have analysed the data. I saw their original Khorne team (see how I got this back on topic!) before Galak and I got to it, and if you think what we have now is bad then you'd have wet yourself at their roster. If they do decide to make "tweaks" I would put money on them breaking the game.

Martin
I'm testing in the sense of playing and getting player feedback. As was done from LRB1 through PBBL12. Which BTW is the only kind that the BBRC members could have condoned back then
Actually it's the volume of data which has changed. Doubleskulls has said before that if they had access to the data then he would have liked to have done things very differently. However, invoking statistics to "test" the CRP data then simply using player feedback to test CRP+ is comparing apples and oranges.
I asked how that is possible, since I did all of the stats in exactly the same way.
Your FUMBBL data is only means, which cannot be inferential. I think the more important point is that you thought all of it was descriptive, which points to you still not knowing the principle behind the numbers you're producing.
I meant I'm not anymore though
I'll believe that when I see it! :lol:
No way I could have done inferential statistics on such a minute sample.
Of course you could! It just might not have shown what you wanted it to show, and therein lies part of the problem of "lies, damned lies and statistics" to which Mike referred.
Not pure CRP+, and certainly nowhere near enough for inferential stats. Nor do we have that for CRP for comparison.
If some of the rules are CRP+ then so much the better: they can be tested in isolation. And I don't think you know how much data Koadah has, so dismissing it on the basis of volume is nonsensical. Not sure what you mean by "CRP for comparison" either, since we have reams of that data.
Even Mike has said that stats for lifetime performance are a poor measure of balance. So is including mirror matches.
So I'll stick with that.
That doesn't mean you can compare apples and oranges, though!
if we accept that there may be a huge difference between how good a team really is and how well it is played, then we can know very little about team balance - even for straight CRP.
Not true. How a team is played is integral to that team. Attempting to split the team away from the way it is played makes no sense at all. That's why we use lots of data from many coaches to try to control for how a team is played, but you can't separate it from how it is played entirely. By controlling you limit the variance created by that one variable, you don't filter it out entirely.
Maybe this misunderstanding is the root of your beef with NTBB?
What misunderstanding? You said it yourself:
  • "I use CRP data (even though there is not enough to do inferential statistics) as my most accurate available way of identifying the problem teams."

harvestmouse
Star Player
Posts: 510
Joined: Thu Jan 05, 2012 10:21 pm

Re: Reconsidering the ruling on Khorne?

Post by harvestmouse »

dode74 wrote:
legowarrior wrote:And they'll have reams of data eventually, like Fumbbl, but unlike Fumbbl, they'll feel obliged to do something with it.
You're assuming they have the first clue as to how to analyse the data, and the vaguest idea of what to do with the game once they have analysed the data. I saw their original Khorne team (see how I got this back on topic!) before Galak and I got to it, and if you think what we have now is bad then you'd have wet yourself at their roster. If they do decide to make "tweaks" I would put money on them breaking the game.
I think trying to move the game forward is a good thing, I just don't think it should be done independently. And like Dode points out, not by those who have no idea about what they are doing.

As for FUMBBL making changes; well in League (their unranked division) they have changed things and are using experimental rules. Maybe Koadah can help out with data there. For the main divisions, as LW knows, it's a different story. Christer has expressed the idea of making changes to ranked divisions and the idea went down badly.

The bottom line being that the TT element of FUMBBL does not feel it's FUMBBL's place to change the rules. So in this case, FUMBBL would need to work with a BBRC and/or the NAF to change anything in the ranked divisions.

plasmoid
Legend
Posts: 5334
Joined: Sun May 05, 2002 8:55 am
Location: Copenhagen

Re: Reconsidering the ruling on Khorne?

Post by plasmoid »

Hi Dode,
Your FUMBBL data is only means, which cannot be inferential. I think the more important point is that you thought all of it was descriptive, which points to you still not knowing the principle behind the numbers you're producing.
Then you haven't really looked at the Box table or read the accompanying text.
There is plenty of data in the Box table done with CI95. It's in all the bands that are more than a single cell.
And the way Mike has been after me for using CI95 data for anything suggests to me that you're the one not knowing what is going on here. I'll trust Mike over you with statistics. All of the CI95 data - both NAF and Box - which I present on the site relate to the sample that I have. That makes it descriptive, unless I'm mistaken. In order to make it inferential, the sample would have to be big enough to be used to make inferences about the total (in this case yet unplayed) pool of games.

This last bit may well be off, but I believe that the rule of thumb for inferential stats is 20.000 data points.
So, roughly, if I'm right, we'd need 24 (teams) times 20.000 = roughly 500.000 if it was distributed right to do inferential data on lifetime performance.
And maybe 4 times as much if we use the TV-bands that I use for NTBB.

Cheers
Martin

dode74
Ex-Cyanide/Focus toadie
Posts: 2565
Joined: Fri Jul 24, 2009 4:55 pm
Location: Near Reading, UK

Re: Reconsidering the ruling on Khorne?

Post by dode74 »

Martin

I read the text, but much of the table is presented as means. It is therefore descriptive data. I know why you did what you did but it is not inferential data until the calculation is actually done.
And the way Mike has been after me for using CI95 data for anything suggests to me that you're the one not knowing what is going on here. I'll trust Mike over you with statistics.
So would I, but I don't think Mike and I are at odds over what the meaning of "inferential data" is in this case.
All of the CI95 data - both NAF and Box - which I present on the site relate to the sample that I have. That makes it descriptive, unless I'm mistaken.
You are mistaken. When you calculate a CI95 range then you are saying that you have looked at a sample of size n and are inferring, from that sample, that the mean of the population of size N is somewhere in the CI95 range you've calculated. The CI around a sample statistic is calculated in such a way that it has a specified chance of surrounding (or containing) the value of the corresponding population parameter. A CI95 range is an inferential statistic because it infers something about the population from which you have taken the sample.
Out of curiosity, how would you present data which relates to a sample you don't have? Facetious question, I know, but the point is you can only present data about a sample you do have, which makes your apparent definition of "descriptive" rather nonsensical.
In order to make it inferential, the sample would have to be big enough to be used to make inferences about the total (in this case yet unplayed) pool of games.
The size of the sample (relative to the size of the population, which is large enough to not worry about for our purposes) limits the size of the range - it varies inversely with the square root of the sample size: quadruple the sample size and you get half the standard error and half the "width" of the CI range. A proportion sample can be considered big enough for inferential use with as few as 10 positive and 10 negative results, as a rule of thumb, meaning a sample size of 20 is enough to make an inference, but the range will be HUGE and probably of no use. Certainly a sample that small would give a range too wide for us to make use of it here.
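As a rough illustration of the square-root relationship described above, here is a minimal Python sketch. It assumes the simple normal-approximation (Wald) interval for a proportion, and the win counts are invented for the example; neither the function names nor the numbers come from NTBB, NAF or Box data.

```python
import math

Z95 = 1.96  # approximate two-sided 95% critical value of the standard normal


def proportion_ci95(successes, n):
    """Return (low, high) for a normal-approximation 95% CI of a proportion."""
    p = successes / n
    standard_error = math.sqrt(p * (1 - p) / n)
    return p - Z95 * standard_error, p + Z95 * standard_error


def ci_width(successes, n):
    """Width of the interval returned by proportion_ci95."""
    low, high = proportion_ci95(successes, n)
    return high - low


# Hypothetical samples with the same 50% win rate but different sizes.
small = ci_width(10, 20)   # 10 wins in 20 games
large = ci_width(40, 80)   # same rate, four times as many games

print(f"n=20 width: {small:.3f}")
print(f"n=80 width: {large:.3f}")
print(f"ratio: {small / large:.2f}")
```

With the same win rate, the n=80 interval comes out half as wide as the n=20 one, and the n=20 interval spans more than 40 percentage points, which is why so small a sample is of little practical use.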
So, roughly, if I'm right, we'd need 24 (teams) times 20.000 = roughly 500.000 if it was distributed right to do inferential data on lifetime performance.
Well, you're wrong. If I were in your shoes I'd run a t-test between equivalent samples from CRP and CRP+ (e.g. low TV Zons CRP vs low TV Zons CRP+) to see if you've actually made a difference, although Mike might have a better method.

legowarrior
Emerging Star
Posts: 355
Joined: Wed Sep 15, 2010 4:14 pm

Re: Reconsidering the ruling on Khorne?

Post by legowarrior »

Can't we just do a giant ANOVA and get it over with?

VoodooMike
Emerging Star
Posts: 434
Joined: Thu Oct 07, 2010 8:03 am

Re: Reconsidering the ruling on Khorne?

Post by VoodooMike »

plasmoid wrote:1. I use CRP data (even though there is not enough to do inferential statistics) as my most accurate available way of identifying the problem teams.
The issue with this statement is not that you used CRP data, it's that you're implying you used descriptive statistics because you didn't have (or know how to use/obtain) inferential statistics, and that you think that's second best. It's not.. no no no... not at all. It's similar to saying that because you didn't have a physics book handy that explained how the universe began you defaulted to the only other book you had nearby that deals with the topic: The Bible. It's a pretty big conceptual shift, and in truth, it's better to say "I don't know" than it is to just make up an answer.
plasmoid wrote:In order to make it inferential, the sample would have to be big enough to be used to make inferences about the total (in this case yet unplayed) pool of games.
The difference isn't large sample sizes, it's measures that treat a set of data as a sample. In particular, you need measurements that give as clear a picture as possible about the distribution of the data, not simply its central tendency (such as the mean or the median). You can look at my post in the thread about NAF win%s in this same forum if you need a visual representation of what I'm talking about.
plasmoid wrote:This last bit may well be off, but I believe that the rule of thumb for inferential stats is 20.000 data points
Dear god... 20,000? That's way, way off. You can do inferential statistics with ANY number of datapoints, the only problem being that with too few data points your reliable distribution is going to be so wide that finding any effect will be next to impossible. The higher the number of data points, the more reliably your sample distribution will reflect the actual population distribution, which should be pretty easy to understand: as the size of our sample approaches the total number of the population, the distribution of our sample approaches perfect similarity to population distribution.

Imagine we have a handful of data points... we can certainly take the average, but that doesn't tell us much about how the various datapoints fall relative to that average... that's what standard deviation is, more or less, the average distance each datapoint is from the mean. Standard deviation is the key measure that will turn your descriptive means into potentially inferential distributions.
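To make the point about central tendency versus spread concrete, here is a small Python sketch. The two lists are made-up win percentages chosen only so that their means match; they are not real Blood Bowl data.

```python
import statistics

# Two hypothetical sets of win percentages with identical means.
tight = [48, 49, 50, 51, 52]
wide = [20, 35, 50, 65, 80]

mean_tight = statistics.mean(tight)
mean_wide = statistics.mean(wide)

# Sample standard deviation (n - 1 in the denominator).
sd_tight = statistics.stdev(tight)
sd_wide = statistics.stdev(wide)

print(f"tight: mean={mean_tight}, sd={sd_tight:.2f}")
print(f"wide:  mean={mean_wide}, sd={sd_wide:.2f}")
```

Both sets average 50, so the mean alone cannot tell them apart; the standard deviation is what shows that one set is clustered and the other is scattered.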
dode74 wrote:Well, you're wrong. If I were in your shoes I'd run a t-test between equivalent samples from CRP and CRP+ (e.g. low TV Zons CRP vs low TV Zons CRP+) to see if you've actually made a difference, although Mike might have a better method.
That would work just fine. If there's data from normal play, and data from play using alternate rules, the two sets of data can be compared with a t-test to see if they're significantly different. Low sample sizes will simply make differences harder to find, assuming they reliably exist at all.
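A minimal sketch of such a comparison, using only the Python standard library. It assumes Welch's variant of the t-test (which does not require equal variances), and the per-match results are invented: 1 for a win, 0.5 for a draw, 0 for a loss. The function name `welch_t` and both data lists are hypothetical.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # Squared standard error of each sample mean.
    se2_a = statistics.variance(sample_a) / len(sample_a)
    se2_b = statistics.variance(sample_b) / len(sample_b)
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(sample_a) - 1) + se2_b ** 2 / (len(sample_b) - 1)
    )
    return t, df


# Hypothetical match results for one team in one TV band under each ruleset.
crp = [1, 0, 0.5, 0, 1, 0, 0.5, 0, 0, 1]
crp_plus = [1, 1, 0.5, 0, 1, 0.5, 1, 0, 1, 1]

t_stat, dof = welch_t(crp_plus, crp)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

Turning the statistic into a p-value needs the t distribution, which the standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_ind` with `equal_var=False`) does the whole calculation in one call. With only ten games per group, even a sizeable gap in means may not reach significance, which matches the caveat about low sample sizes.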
