Statistical Power

The use of a significance level of 5% controls the probability of erroneously rejecting the null hypothesis when it is, in fact, true. Rejecting the null hypothesis when it is true is called a Type I error. However, there is another error that can be made - that is, failing to reject the null hypothesis when it is, in fact, not true. This is called a Type II error.

The probability of rejecting the null hypothesis when the null hypothesis is false is called the power of a test. A good experiment is therefore one with high statistical power. The power of a statistical test can be calculated at the planning stage of an experiment - unfortunately this is not always done, and such experiments are consequently often inadequate.
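
As an illustration of a planning-stage power calculation, the following sketch computes the power of a two-sample t-test for a given design. It assumes Python with the statsmodels package; the effect size, sample size, and significance level are illustrative values, not figures from this text.

```python
# A minimal sketch of a planning-stage power calculation, assuming
# Python with statsmodels installed. All numbers are illustrative.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Power of a two-sample t-test with 30 observations per group,
# a standardised difference (Cohen's d) of 0.5, and alpha = 0.05.
power = analysis.power(effect_size=0.5, nobs1=30, alpha=0.05, ratio=1.0)
print(f"Power: {power:.2f}")  # roughly 0.47 - an underpowered design
```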

The statistical power of an experiment is determined by the following:

(a) The level of significance to be used;
(b) The variability of the data (as measured, for example, by its standard deviation);
(c) The size of the difference in the population that is to be detected;
(d) The size of the samples.

By setting the power (often 80%) and any three of these four values, the remaining one can be calculated. However, since a 5% level of significance is usually used, (a) is fixed, and we need only set three out of (b), (c), (d) and the power to determine the fourth. The variability of the data (b) needs to be assessed approximately, usually from previous studies or from the literature; the sample sizes (d) can then be determined for a given difference (c) or, alternatively, for a specific sample size, the difference likely to be detected can be calculated.
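
As a sketch of both calculations (again assuming Python with statsmodels, with illustrative numbers), note that statsmodels works with the standardised effect size, i.e. the difference (c) divided by the standard deviation (b):

```python
# A minimal sketch: solve for the sample size given a difference,
# or for the detectable difference given a sample size.
# All values (effect size, alpha, power, n) are illustrative.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# (1) Sample size per group needed to detect a standardised
#     difference (Cohen's d) of 0.5 with 80% power at alpha = 0.05.
n = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.8)
print(f"Required sample size per group: {n:.0f}")  # approx 64

# (2) Detectable standardised difference for a fixed sample size
#     of 25 per group, again with 80% power at alpha = 0.05.
d = analysis.solve_power(nobs1=25, alpha=0.05, power=0.8)
print(f"Detectable effect size: {d:.2f}")  # approx 0.81
```

In each call, the quantity left unspecified is the one solved for, mirroring the principle above: fix all but one of the variability, the difference, the sample size and the power, and the remaining one is determined.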

Computer packages such as Minitab (click on Stat > Power and Sample Size) and G*Power allow the user to perform a statistical power analysis to determine an appropriate sample size for an experiment, or to check whether a given sample size is sufficient to provide reasonable power.