Talk Fantasy Football

Posted: **Wed Jun 24, 2015 3:20 am**

I've created a couple of Tableau dashboards that use the NAF data to show how effective each race is 'home' and 'away' (higher and lower seed). I'm still working on a way where I can get a home & away aggregate WITHOUT splitting the match data, but I'll figure it out eventually (I'm still a Tableau newbie). There are some intriguing insights, check them out...

https://public.tableau.com/views/NAFRac ... _count=yes

https://public.tableau.com/views/NAFAwa ... _count=yes

Posted: **Wed Jun 24, 2015 6:34 am**

I always really like the statistics you create with the NAF data!
But unless I've misunderstood something, I really don't get the home vs away statistics.
Is this really just about the order in which the teams were entered in the result sheet?
If so, I really don't see the point of it, unless one has far too much time on their hands

No offence intended

Posted: **Wed Jun 24, 2015 7:14 am**

Oventa wrote:I always really like the statistics you create with the NAF data!
But unless I've misunderstood something, I really don't get the home vs away statistics.
Is this really just about the order in which the teams were entered in the result sheet?
If so, I really don't see the point of it, unless one has far too much time on their hands
No offence intended

The NAF data has one team classified as Home and the other as Away. I'm really only guessing that the 'higher seed' is the Home team because they win more - there's a clear pattern across all teams. If that is indeed the case, you can see why good teams have more Home data than Away data - because they win more!

Posted: **Wed Jun 24, 2015 7:21 am**

Could it not just be that when logging the results the person entering the results tends towards entering the winner first...?

Posted: **Wed Jun 24, 2015 7:33 am**

Fassbinder75 wrote:...there's a clear pattern....

I swear we JUST went over the difference between descriptive and inferential statistics in another thread. You're taking descriptive stats and eyeballing inferences from them. Don't do that.

Posted: **Wed Jun 24, 2015 7:36 am**

Shteve0 wrote:Could it not just be that when logging the results the person entering the results tends towards entering the winner first...?

Yep.

Posted: **Wed Jun 24, 2015 8:26 am**

VoodooMike wrote:
Fassbinder75 wrote:...there's a clear pattern....
I swear we JUST went over the difference between descriptive and inferential statistics in another thread. You're taking descriptive stats and eyeballing inferences from them. Don't do that.

Hey man I'm on your side with that stuff. I'm inferring an outcome based on a sample size of 32k. There are clearly more winners in the home column - and now that has been explained.

Posted: **Wed Jun 24, 2015 10:16 am**

That only works if all the results are from Swiss pairings (so you need to remove all round 1 matches), you remove any fixed draw results (for example, swapping tables for players that can't win anything going into round 6 so they can play someone new), you remove team events (for example, the ARBBL Pick'n'Mix that seeds the teams then the players, but the higher team will always be home, even if one or two of the players are lower seeded than their opponent), and the TO didn't type the results in by hand and reverse the "seeding" (which you can't tell from the NAF results).
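As a rough sketch of that filtering, assuming each match record carried a round number and flags for team events and fixed draws (all field names here are hypothetical, and the NAF data may not expose them at all):

```python
from dataclasses import dataclass


@dataclass
class Match:
    # Every field name is an assumption made for illustration only.
    round_no: int
    team_event: bool
    fixed_draw: bool
    home_score: int
    away_score: int


def seeding_is_meaningful(m: Match) -> bool:
    """Keep only matches where Home/Away could plausibly reflect Swiss seeding."""
    return m.round_no > 1 and not m.team_event and not m.fixed_draw


# Made-up records, not real NAF results.
matches = [
    Match(1, False, False, 2, 1),  # round 1: not Swiss-paired, dropped
    Match(3, False, False, 1, 1),  # kept
    Match(4, True, False, 0, 2),   # team event, dropped
    Match(6, False, True, 2, 0),   # fixed draw, dropped
]

usable = [m for m in matches if seeding_is_meaningful(m)]
print(len(usable), "of", len(matches), "matches kept")
```

Nothing in a filter like this can catch the last case: results typed in by hand with the order reversed look identical to any other record.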

Posted: **Wed Jun 24, 2015 3:56 pm**

Fassbinder75 wrote:Hey man I'm on your side with that stuff. I'm inferring an outcome based on a sample size of 32k.

It's not about sides and I have no personal issues with you, it's about statistics. Sample size is a nonsensical concept in relation to descriptive statistics because descriptive statistics only tell you about your body of data - it is the population, not a sample. Samples are something you use in inferential statistics when you're attempting to draw conclusions about a larger population by using those samples.

Fassbinder75 wrote:There are clearly more winners in the home column..

The number 2 is clearly larger than the number 1, but whether that has any meaning depends on the context. Descriptive statistics such as the mean have only one context, and that's "this is a summary of the data we have", and that's insufficient to draw any useful conclusions from.

For example, we can say "there were more winners in the home column than the away column" and that's true... but we can't say that it represents a pattern, as that implies we'd see the same sort of results if we looked forward or backward in time... we can't say it represents a genuine difference in the environment, either, just that there was a difference in the data that was collected. That's the ultimate point: descriptive statistics tell you what happened and nothing else. They do not support any suppositions or inferences you might make and they provide no legitimate insight into anything but the individual dataset.

Posted: **Thu Jun 25, 2015 3:12 am**

VoodooMike wrote:
Fassbinder75 wrote:Hey man I'm on your side with that stuff. I'm inferring an outcome based on a sample size of 32k.
It's not about sides and I have no personal issues with you, it's about statistics. Sample size is a nonsensical concept in relation to descriptive statistics because descriptive statistics only tell you about your body of data - it is the population, not a sample. Samples are something you use in inferential statistics when you're attempting to draw conclusions about a larger population by using those samples.

Fassbinder75 wrote:There are clearly more winners in the home column..
The number 2 is clearly larger than the number 1, but whether that has any meaning depends on the context. Descriptive statistics such as the mean have only one context, and that's "this is a summary of the data we have", and that's insufficient to draw any useful conclusions from.

For example, we can say "there were more winners in the home column than the away column" and that's true... but we can't say that it represents a pattern, as that implies we'd see the same sort of results if we looked forward or backward in time... we can't say it represents a genuine difference in the environment, either, just that there was a difference in the data that was collected. That's the ultimate point: descriptive statistics tell you what happened and nothing else. They do not support any suppositions or inferences you might make and they provide no legitimate insight into anything but the individual dataset.

I didn’t say I had a conclusion – I suspected it might be seeding, but I wasn’t proposing that Humans get AV8 Catchers or anything like it. When I said a clear pattern, I meant a clear pattern of results – poor choice of language on my part.

That aside, given the variability of coaches, team build, format – do you think it is possible to use data available in an inferential way? From my limited understanding of inferential statistics, a null hypothesis would have to have an extremely narrow focus to be measurable, and I suspect the data won’t be rich enough to draw a reliable conclusion from.

Posted: **Fri Jun 26, 2015 5:12 am**

Fassbinder75 wrote:I didn’t say I had a conclusion – I suspected it might be seeding, but I wasn’t proposing that Humans get AV8 Catchers or anything like it. When I said a clear pattern, I meant a clear pattern of results – poor choice of language on my part.

That's just semantics, really... suspicions are just conclusions you're not 100% certain of. For statistics to support a conclusion, theory, suspicion, etc... they need to be of the sort we can use in inferential calculations. We can say that the means of the home and away groups are different, but we cannot use them to support any assumptions about the meaning of that difference. We can't say that the difference is definitely real rather than just being a fluke, because the mean doesn't give us the information to base that on.

In order to infer a legitimate difference, or to generalize a group of data to a wider inaccessible group of data (eg, this year's results to all results), we need to use distributions, not simple measures of central tendency. There are an infinite number of different distributions that can have the same mean, which is why the mean is insufficient data. Some examples:
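A minimal sketch of the same point in numbers, using three made-up lists (not NAF data) that were built to share a mean of exactly 50 while having very different shapes:

```python
from statistics import mean, pstdev

# Three made-up score lists (not NAF data), all constructed to average 50.
tight = [49, 50, 50, 50, 51]     # everything clustered at the mean
spread = [10, 30, 50, 70, 90]    # evenly spread out
skewed = [40, 40, 40, 40, 90]    # one outlier pulling the mean up

for name, data in [("tight", tight), ("spread", spread), ("skewed", skewed)]:
    print(
        f"{name:>6}: mean={mean(data):.1f}  "
        f"stdev={pstdev(data):.2f}  min={min(data)}  max={max(data)}"
    )
```

Reporting only the mean would make these three lists indistinguishable, even though the spread differs by a factor of more than forty between the first and second.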

The shape of our distributions matters because, to determine whether the difference between two distributions is significant (meaning it is unlikely to be the result of random chance alone rather than a real effect), we examine the extent of their overlap.

We need to find significance in the difference before we can reliably say there is a difference worth talking about, at which point we can start wondering why that difference exists.
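To make the overlap idea concrete, here is a rough sketch using made-up numbers. It uses a normal approximation to Welch's t-test (an assumption chosen to keep the example standard-library only, and reasonable only for largish samples). Both pairs of groups differ in mean by exactly 5, but only the pair with little overlap comes out as significant:

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def welch_z(a, b):
    """Return (statistic, two-sided p-value) for the difference in means.

    Normal approximation to Welch's t-test; only sensible for large samples.
    """
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = (mean(a) - mean(b)) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Made-up groups (not NAF data). Means are 55 vs 50 in both pairs.
narrow_a, narrow_b = [54, 55, 56] * 20, [49, 50, 51] * 20   # barely overlap
wide_a, wide_b = [5, 55, 105] * 20, [0, 50, 100] * 20       # overlap heavily

z_narrow, p_narrow = welch_z(narrow_a, narrow_b)
z_wide, p_wide = welch_z(wide_a, wide_b)
print(f"narrow: z={z_narrow:.2f}, p={p_narrow:.4f}")
print(f"wide:   z={z_wide:.2f}, p={p_wide:.4f}")
```

The difference in means is identical in both cases; what changes the verdict is the spread around those means, which is exactly the information a bare mean throws away.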

The issue with using descriptive statistics for inferences is that doing so involves inventing the curve around the mean... we're just pulling the curves out of our asses and bending them so they say whatever we've decided to believe. That's not the numbers supporting supposition, it's a work of artistic fiction.

Fassbinder75 wrote:That aside, given the variability of coaches, team build, format – do you think it is possible to use data available in an inferential way? From my limited understanding of inferential statistics, a null hypothesis would have to have an extremely narrow focus to be measurable, and I suspect the data won’t be rich enough to draw a reliable conclusion from.

Of course it's possible, you just need to use the original match data to find not just a mean, but the measures of the distributions, which can then be used to calculate significance levels. It is not possible to determine the distribution simply from the mean, however (as the first little picture shows)... so if all you have is the mean, it isn't going to be useful for making any legitimate inferences.
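One way that could look for the home/away question, sketched under stated assumptions: the match records below are invented, the `(home_score, away_score)` layout is hypothetical, and the test is a simple sign test with a normal approximation rather than anything the NAF or this thread prescribes. Because the home and away results of a match are two sides of the same game, the sketch counts decisive matches and asks whether the home column wins more often than a 50/50 split would produce:

```python
from math import sqrt
from statistics import NormalDist


def sign_test(home_wins, away_wins):
    """Two-sided sign test via normal approximation; draws are excluded.

    Null hypothesis: a decisive match is equally likely to be won by the
    home-column team or the away-column team.
    """
    n = home_wins + away_wins
    z = (home_wins - n / 2) / sqrt(n / 4)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Hypothetical records as (home_score, away_score); not real NAF data.
matches = [(2, 1)] * 130 + [(1, 2)] * 100 + [(1, 1)] * 70

home_wins = sum(h > a for h, a in matches)
away_wins = sum(a > h for h, a in matches)
z, p = sign_test(home_wins, away_wins)
print(f"home wins={home_wins}, away wins={away_wins}, z={z:.2f}, p={p:.3f}")
```

Even a small p-value here would only say the imbalance is unlikely to be chance. It would not say whether the cause is seeding or simply people entering the winner first.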

NAF Win% Stats (2013 & 2014 data)
