Fassbinder75 wrote:I didn’t say I had a conclusion – I suspected it might be seeding, but I wasn’t proposing that Humans get AV8 Catchers or anything like it. When I said a clear pattern, I meant a clear pattern of results – poor choice of language on my part.
That's just semantics, really... suspicions are just conclusions you're not 100% certain of. For statistics to support a conclusion, theory, suspicion, etc., they need to be of the sort we can use in inferential calculations. We can say that the means of the home and away groups are different, but we cannot use them to support any assumptions about what that difference means. We can't say that the difference is definitely real rather than just a fluke, because the mean alone doesn't give us the information to base that on.
In order to infer a legitimate difference, or to generalize a group of data to a wider inaccessible group of data (e.g., this year's results to all results), we need to use distributions, not simple measures of central tendency. There are an infinite number of different distributions that can have the same mean, which is why the mean is insufficient data. Some examples:
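To put rough numbers on that, here is a minimal Python sketch. The three lists are made up purely for illustration (they are not real match results), and the names `tight`, `spread` and `skewed` are just labels I picked. All three have a mean of exactly 1, yet they look nothing alike once you check the spread:

```python
from statistics import mean, stdev

# Hypothetical score margins, invented for illustration only.
tight = [0, 1, 1, 1, 1, 2]      # clustered closely around the mean
spread = [-3, 5, -2, 4, 0, 2]   # same mean, much wider scatter
skewed = [0, 0, 0, 0, 0, 6]     # same mean, one outlier doing all the work

for name, data in [("tight", tight), ("spread", spread), ("skewed", skewed)]:
    # The mean is identical in every case; the sample standard deviation is not.
    print(f"{name:>6}: mean={mean(data):.2f}  sd={stdev(data):.2f}")
```

If all you were handed was "mean = 1", you would have no way of telling which of these three you were looking at.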
The shape of our distributions matters because, in order to determine whether two distributions differ significantly (meaning the difference is likely the result of a real effect rather than just random chance), we examine the extent of their overlap:
We need to find significance in the difference before we can reliably say there is a difference worth talking about, at which point we can start wondering why that difference exists.
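One way to see how overlap drives significance is an exact permutation test, sketched below in Python. To be clear, this is a technique I'm swapping in for illustration, not the only way to test this, and every number here is invented. Both pairs of groups have the same difference in means (3 versus 0); the only thing that changes is how much the groups overlap. The function name `permutation_p_value` is my own.

```python
from itertools import combinations

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in means.

    Relabels the pooled data in every possible way and reports the share of
    relabelings whose mean difference is at least as large as the observed one.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    extreme = 0
    count = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        count += 1
    return extreme / count

# Hypothetical data, invented for illustration: same means, different overlap.
home_tight = [2, 3, 3, 3, 3, 4]      # mean 3, no overlap with away_tight
away_tight = [-1, 0, 0, 0, 0, 1]     # mean 0
home_wide = [-6, 9, 0, 8, -3, 10]    # mean 3, heavy overlap with away_wide
away_wide = [-9, 7, -4, 6, 3, -3]    # mean 0

p_tight = permutation_p_value(home_tight, away_tight)
p_wide = permutation_p_value(home_wide, away_wide)
print(f"tight groups: p = {p_tight:.4f}")
print(f"wide groups:  p = {p_wide:.4f}")
```

The tight pair comes out with a tiny p-value because almost no random relabeling can reproduce a gap that large, while the heavily overlapping pair produces the same gap by chance quite often. Same means, completely different conclusions.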
The issue with using descriptive statistics for inferences is that doing so involves inventing the curve around the mean... we're just pulling the curves out of our asses and bending them so they say whatever we've decided to believe. That's not the numbers supporting supposition, it's a work of artistic fiction.
Fassbinder75 wrote:That aside, given the variability of coaches, team build, format – do you think it is possible to use data available in an inferential way? From my limited understanding of inferential statistics, a null hypothesis would have to have an extremely narrow focus to be measurable, and I suspect the data won’t be rich enough to draw a reliable conclusion from.
Of course it's possible; you just need to use the original match data to find not just a mean, but the measures of the distributions, which can then be used to calculate significance levels. It is not possible to determine the distribution simply from the mean, however (as the first little picture shows)... so if all you have is the mean, it isn't going to be useful for making any legitimate inferences.
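As a rough sketch of what "the measures of the distributions" buys you, here is Welch's t statistic computed from raw data in Python. I'm assuming Welch's unequal-variance t-test as the example; other tests may suit the real data better, and the match numbers below are invented. Note that the formula needs each group's variance and sample size as well as its mean, which is exactly what a bare mean can't supply.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard error of each group's mean: sample variance / n.
    se2_a = variance(group_a) / n_a
    se2_b = variance(group_b) / n_b
    t = (mean(group_a) - mean(group_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df

# Hypothetical raw match data, invented for illustration only.
home = [2, 3, 3, 3, 3, 4]
away = [-1, 0, 0, 0, 0, 1]

t_stat, df = welch_t(home, away)
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```

Turning the statistic into a significance level means looking it up against a t distribution with `df` degrees of freedom; if SciPy is available, `scipy.stats.ttest_ind(home, away, equal_var=False)` does the whole calculation in one call.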