The one-way analysis of variance (ANOVA for independent groups), proposed by Ronald Fisher, is used to verify the hypothesis of equality of means of an analysed variable in several ($k\geq2$) populations.
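Basic assumptions: measurement on an interval scale, normal distribution of the analysed variable in each population, an independent model (independent groups), and equality of variances of the analysed variable in all populations.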
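Hypotheses:
\begin{displaymath}
\begin{array}{cl}
H_0: & \mu_1=\mu_2=\ldots=\mu_k,\\
H_1: & \text{not all } \mu_j \text{ are equal } (j=1,2,\ldots,k),
\end{array}
\end{displaymath}
where:
$\mu_1$, $\mu_2$, …, $\mu_k$ – means of an analysed variable of each population.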
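The test statistic is defined by:
\begin{displaymath}
F=\frac{MS_{BG}}{MS_{WG}},
\end{displaymath}
where:
$MS_{BG}=\frac{SS_{BG}}{df_{BG}}$ – mean square between-groups,
$MS_{WG}=\frac{SS_{WG}}{df_{WG}}$ – mean square within-groups,
$SS_{BG}=\sum_{j=1}^k\frac{\left(\sum_{i=1}^{n_j}x_{ij}\right)^2}{n_j}-\frac{\left(\sum_{j=1}^k\sum_{i=1}^{n_j}x_{ij}\right)^2}{N}$ – between-groups sum of squares,
$SS_{WG}=SS_{T}-SS_{BG}$ – within-groups sum of squares,
$SS_{T}=\left(\sum_{j=1}^k\sum_{i=1}^{n_j}x_{ij}^2\right)-\frac{\left(\sum_{j=1}^k\sum_{i=1}^{n_j}x_{ij}\right)^2}{N}$ – total sum of squares,
$df_{BG}=k-1$ – between-groups degrees of freedom,
$df_{WG}=df_{T}-df_{BG}$ – within-groups degrees of freedom,
$df_{T}=N-1$ – total degrees of freedom,
$N=\sum_{j=1}^k n_j$,
$n_j$ – sample sizes for $j=1,2,\ldots,k$,
$x_{ij}$ – values of a variable taken from a sample for $i=1,2,\ldots,n_j$, $j=1,2,\ldots,k$.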
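The computational formulas for the sums of squares translate almost line by line into code. The following is a minimal sketch in Python, not the program's own implementation; the function name `one_way_anova_f` and the toy data are assumptions made only for illustration.

```python
def one_way_anova_f(groups):
    """Return (F, df_bg, df_wg) for a list of samples, one list per group."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)            # N
    grand_sum = sum(sum(g) for g in groups)          # sum of all x_ij
    correction = grand_sum ** 2 / n_total            # (sum of all x_ij)^2 / N

    # Between-groups, total and within-groups sums of squares.
    ss_bg = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_t = sum(x * x for g in groups for x in g) - correction
    ss_wg = ss_t - ss_bg

    # Degrees of freedom.
    df_bg = k - 1
    df_t = n_total - 1
    df_wg = df_t - df_bg

    # Mean squares and the F statistic.
    ms_bg = ss_bg / df_bg
    ms_wg = ss_wg / df_wg
    return ms_bg / ms_wg, df_bg, df_wg


# Hypothetical toy data: three groups of three observations each.
f_stat, df_bg, df_wg = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat, df_bg, df_wg)
```

For this toy input the between-groups sum of squares is 26 and the within-groups sum of squares is 6, so F = (26/2)/(6/6) = 13 with 2 and 6 degrees of freedom.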
The $F$ statistic has the F Snedecor distribution with $df_{BG}$ (between-groups) and $df_{WG}$ (within-groups) degrees of freedom.
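The p-value, designated on the basis of the test statistic, is compared with the significance level $\alpha$:
\begin{displaymath}
\begin{array}{ccl}
\text{if } p \le \alpha & \Longrightarrow & \text{reject } H_0 \text{ and accept } H_1,\\
\text{if } p > \alpha & \Longrightarrow & \text{there is no reason to reject } H_0.
\end{array}
\end{displaymath}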
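The p-value is the upper-tail probability of the F distribution at the observed statistic. The sketch below evaluates it with the Python standard library only, using a continued-fraction expansion of the regularized incomplete beta function. This is a substitute technique chosen for illustration, not necessarily how the program computes it, and all function names as well as the sample F value are assumptions.

```python
import math


def _beta_cf(a, b, x, max_iter=300, eps=1e-14, tiny=1e-300):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def regularized_incomplete_beta(a, b, x):
    """I_x(a, b) for 0 <= x <= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    # Use the symmetry relation where the continued fraction converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_test_p_value(f_stat, df_bg, df_wg):
    """Upper-tail probability P(F >= f_stat) for the F(df_bg, df_wg) distribution."""
    if f_stat <= 0.0:
        return 1.0
    x = df_wg / (df_wg + df_bg * f_stat)
    return regularized_incomplete_beta(df_wg / 2.0, df_bg / 2.0, x)


alpha = 0.05                      # significance level
p = f_test_p_value(13.0, 2, 6)    # hypothetical F statistic and degrees of freedom
print(p, "reject H0" if p <= alpha else "no reason to reject H0")
```

For F = 13 with 2 and 6 degrees of freedom the exact upper tail is 27/4096 (about 0.0066), so the null hypothesis would be rejected at the 0.05 level. If SciPy is available, `scipy.stats.f.sf(f_stat, df_bg, df_wg)` returns the same probability.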
Effect size - partial $\eta^2$
This quantity indicates the proportion of explained variance to total variance associated with a factor. Thus, in a one-factor ANOVA model for independent groups, it indicates what proportion of the variability in outcomes can be attributed to the factor under study, that is, to the factor that determines the independent groups.
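As an illustration, the effect size can be computed from the same sums of squares as the test statistic. The sketch assumes the usual definition, partial eta squared = SS_BG / (SS_BG + SS_WG), which is not spelled out in this section; the function name and the toy data are hypothetical.

```python
def partial_eta_squared(groups):
    """Partial eta squared for a one-way layout: SS_BG / (SS_BG + SS_WG)."""
    n_total = sum(len(g) for g in groups)
    correction = sum(sum(g) for g in groups) ** 2 / n_total
    ss_bg = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_t = sum(x * x for g in groups for x in g) - correction
    ss_wg = ss_t - ss_bg
    return ss_bg / (ss_bg + ss_wg)


# Hypothetical toy data: SS_BG = 26 and SS_WG = 6, so the result is 26/32.
print(partial_eta_squared([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```

In a one-factor model SS_BG + SS_WG equals the total sum of squares, so here the partial and the ordinary eta squared coincide.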
POST-HOC tests
Introduction to contrast and POST-HOC testing
The settings window with the One-way ANOVA for independent groups can be opened in Statistics menu→Parametric tests→ANOVA for independent groups or in the Wizard.
A sample of 150 workers was drawn at random from the population of workers of 3 different transport companies, 50 persons from each company. Before the experiment begins, you should check whether the average age of the workers is similar across these companies, because the next step of the experiment depends on it. The age of each participant is recorded in years.
Age (company 1): 27, 33, 25, 32, 34, 38, 31, 34, 20, 30, 30, 27, 34, 32, 33, 25, 40, 35, 29, 20, 18, 28, 26, 22, 24, 24, 25, 28, 32, 32, 33, 32, 34, 27, 34, 27, 35, 28, 35, 34, 28, 29, 38, 26, 36, 31, 25, 35, 41, 37

Age (company 2): 38, 34, 33, 27, 36, 20, 37, 40, 27, 26, 40, 44, 36, 32, 26, 34, 27, 31, 36, 36, 25, 40, 27, 30, 36, 29, 32, 41, 49, 24, 36, 38, 18, 33, 30, 28, 27, 26, 42, 34, 24, 32, 36, 30, 37, 34, 33, 30, 44, 29

Age (company 3): 34, 36, 31, 37, 45, 39, 36, 34, 39, 27, 35, 33, 36, 28, 38, 25, 29, 26, 45, 28, 27, 32, 33, 30, 39, 40, 36, 33, 28, 32, 36, 39, 32, 39, 37, 35, 44, 34, 21, 42, 40, 32, 30, 23, 32, 34, 27, 39, 37, 35
Before proceeding with the ANOVA analysis, the normality of the data distribution was confirmed.
The assumption of equality of variances was tested in the analysis window; both tests gave p>0.05, so there is no reason to reject it.
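Hypotheses:
$H_0$: the average age of the workers is the same in all three transport companies,
$H_1$: at least two of the means differ.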
Comparing the p-value = 0.005147 of the one-way analysis of variance with the significance level $\alpha=0.05$, you can conclude that the average ages of the workers of these transport companies are not all the same. The ANOVA result alone does not tell you which groups differ from the others in terms of age. To find out, one of the POST-HOC tests must be used, for example the Tukey test. To do this, resume the analysis and then, in the options window for the test, select Tukey HSD and Add graph.
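The ANOVA result for this example can be checked outside the program. The sketch below recomputes the F statistic from the ages listed above and obtains the p-value from a closed form of the F upper tail that holds only when the between-groups degrees of freedom equal 2 (three groups). The variable names are assumptions, and this is an independent check rather than the program's own procedure.

```python
ages = {
    "company 1": [27, 33, 25, 32, 34, 38, 31, 34, 20, 30,
                  30, 27, 34, 32, 33, 25, 40, 35, 29, 20,
                  18, 28, 26, 22, 24, 24, 25, 28, 32, 32,
                  33, 32, 34, 27, 34, 27, 35, 28, 35, 34,
                  28, 29, 38, 26, 36, 31, 25, 35, 41, 37],
    "company 2": [38, 34, 33, 27, 36, 20, 37, 40, 27, 26,
                  40, 44, 36, 32, 26, 34, 27, 31, 36, 36,
                  25, 40, 27, 30, 36, 29, 32, 41, 49, 24,
                  36, 38, 18, 33, 30, 28, 27, 26, 42, 34,
                  24, 32, 36, 30, 37, 34, 33, 30, 44, 29],
    "company 3": [34, 36, 31, 37, 45, 39, 36, 34, 39, 27,
                  35, 33, 36, 28, 38, 25, 29, 26, 45, 28,
                  27, 32, 33, 30, 39, 40, 36, 33, 28, 32,
                  36, 39, 32, 39, 37, 35, 44, 34, 21, 42,
                  40, 32, 30, 23, 32, 34, 27, 39, 37, 35],
}

groups = list(ages.values())
k = len(groups)
n_total = sum(len(g) for g in groups)

# Sums of squares from the computational formulas.
correction = sum(sum(g) for g in groups) ** 2 / n_total
ss_bg = sum(sum(g) ** 2 / len(g) for g in groups) - correction
ss_t = sum(x * x for g in groups for x in g) - correction
ss_wg = ss_t - ss_bg

df_bg, df_wg = k - 1, n_total - k
f_stat = (ss_bg / df_bg) / (ss_wg / df_wg)

# Closed form of the F upper tail, valid ONLY for df_bg == 2:
# P(F >= f) = (1 + 2 f / df_wg) ** (-df_wg / 2)
assert df_bg == 2
p_value = (1.0 + 2.0 * f_stat / df_wg) ** (-df_wg / 2.0)

means = {name: sum(g) / len(g) for name, g in ages.items()}
print(means)
print(f_stat, p_value)
```

The group means come out as 30.26, 32.68 and 33.98 years, and the p-value is close to the reported 0.005147. For a different number of groups the closed form no longer applies and a general F distribution routine is needed.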
The critical difference (CD) calculated for each pair of comparisons is the same (because the group sizes are equal) and equals 2.730855. Comparing the CD with the mean differences shows that significant differences occur only between the mean ages of the workers from the first and the third transport company (only for this pair is the CD smaller than the difference of the means). The same conclusion follows from comparing the p-values of the POST-HOC test with the significance level $\alpha=0.05$. The workers of the first transport company are on average about 3.7 years younger than the workers of the third transport company. Two interlocking (overlapping) homogeneous groups were obtained, which are also marked on the graph.
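The comparison with the critical difference can be sketched as follows. The CD value is taken from the text as given (it is not recomputed here), the group means are those obtained from the listed ages, and the helper `homogeneous_groups` is a hypothetical function that assumes a single common CD, which holds when the group sizes are equal.

```python
from itertools import combinations

# Group means (in years) computed from the ages listed in this example.
means = {"company 1": 30.26, "company 2": 32.68, "company 3": 33.98}
cd = 2.730855  # critical difference reported in the text

# A pair differs significantly when the mean difference exceeds the CD.
significant = [
    (a, b) for a, b in combinations(means, 2)
    if abs(means[a] - means[b]) > cd
]


def homogeneous_groups(means, cd):
    """Maximal runs of groups, ordered by mean, whose range stays below the CD."""
    order = sorted(means, key=means.get)
    result = []
    for i, start in enumerate(order):
        j = i
        while j + 1 < len(order) and means[order[j + 1]] - means[start] < cd:
            j += 1
        run = order[i:j + 1]
        # Skip runs already contained in the previously found group.
        if not result or not set(run) <= set(result[-1]):
            result.append(run)
    return result


print(significant)
print(homogeneous_groups(means, cd))
```

Only the pair (company 1, company 3) exceeds the CD, and the two homogeneous groups are {company 1, company 2} and {company 2, company 3}, which overlap in company 2.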
A detailed description of the data can be obtained by selecting Descriptive statistics in the analysis window.