One-way ANOVA (ANalysis Of VAriance) with post-hoc Tukey HSD (Honestly Significant Difference) Test Calculator for comparing multiple treatments
This calculator has now moved to the domain name astatsa.com!
- One-way ANOVA with post-hoc Tukey HSD Calculator, with Scheffé, Bonferroni and Holm multiple comparison results also provided.
- - Tukey HSD uses the Tukey-Kramer formula when treatments (sample groups) have unequal numbers of observations (i.e. unbalanced observations)
- - Select the number of treatments, then enter your observation data by typing or copy-paste, then proceed to the results
Select the number of independent treatments below:
Select \(k\), the number of independent treatments, sometimes also called samples. Since these are independent and not paired or correlated, the number of observations of each treatment may be different.
This would lead to an input screen with \(k\) columns to paste your observation data on various treatments. This calculator is hard-coded for a maximum of 10 treatments, which is more than adequate for most researchers.
*Note that when \(k=2\) there is only one pair of (independent) treatments/samples to be compared, so the Tukey HSD Test for pairwise comparison of multiple treatments/samples is not conducted. In this case, the one-way ANOVA is equivalent to a two-sample t-test (with pooled variance), with the \(F\) ratio such that \(F=t^2\).
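The \(F=t^2\) relation for \(k=2\) can be checked numerically. The following is a minimal sketch in plain Python, not the calculator's own code; the function names and the observation data are made up for illustration, and the t statistic is the pooled-variance (equal-variance) version.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F ratio for a list of observation lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-treatment and within-treatment sums of squares
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


def pooled_t(a, b):
    """Return the two-sample t statistic using the pooled variance."""
    na, nb = len(a), len(b)
    ss = sum((x - mean(a)) ** 2 for x in a) + sum((x - mean(b)) ** 2 for x in b)
    pooled_var = ss / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled_var * (1 / na + 1 / nb)) ** 0.5


# Hypothetical data: two treatments with unequal numbers of observations
a = [4.1, 5.0, 5.6, 4.8]
b = [6.2, 5.9, 7.1, 6.5, 6.8]

f_ratio = one_way_anova_f([a, b])
t_stat = pooled_t(a, b)
print(f_ratio, t_stat ** 2)  # the two numbers agree up to rounding error
```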
What this calculator does:
Microsoft Excel can do one-way ANOVA of multiple treatments (columns) nicely.
But it stops there in its tracks. Within Excel, follow-up of a successful ANOVA
with post-hoc Tukey HSD has to be done manually, if you know how to! This
self-contained calculator, with flexibility to vary the number of treatments
(columns) to be compared, starts with one-way ANOVA. If ANOVA
indicates statistical significance, this calculator automatically
performs pairwise post-hoc Tukey HSD, Scheffé, Bonferroni and Holm multiple
comparison of all treatments (columns). Excel has the
necessary built-in statistical functions to conduct Scheffé, Bonferroni and
Holm multiple comparison from first principles. However, it lacks the key
built-in statistical function (the studentized range distribution) needed for
conducting Excel-contained Tukey HSD.
Continuing education in Statistics 101:
The hard-core statistical packages demand a certain expertise to format
the input data, write code to implement the procedures and then decipher their
1970s Old School Mainframe Era output. In contrast, when spouting out Tukey
HSD, Scheffé, Bonferroni and Holm multiple comparison results, this calculator
also tells you how to verify and reproduce those results manually in Excel. It
teaches you how to take the output of ANOVA (from Excel or another package) and
conduct post-hoc Tukey HSD, Scheffé, Bonferroni and Holm multiple comparison
by hand in Excel. Your automatic A grade results from
wizardry in producing post-hoc Tukey HSD, Scheffé, Bonferroni and Holm
pairwise multiple comparison yourself manually in Excel, in which case you
would no longer need this calculator, nor have to struggle with harnessing the
old school statistical packages.
After providing guidelines on how to conduct Tukey HSD, Scheffé, Bonferroni and Holm pairwise multiple comparison by hand in Excel, this site provides R code with a tutorial on how to repeat and reproduce the results provided in this calculator using R. Users unfamiliar with the R statistical package are encouraged to follow this tutorial and not only learn some basic R, but also become grandmasters of harnessing a complex modern statistical package to conduct Tukey HSD, Scheffé, Bonferroni and Holm pairwise multiple comparison.
This calculator is designed to relieve biomedical scientists from the
travails of coding heavy-duty statistical packages:
Are you a biomedical or social scientist with a narrow interest in one-way
ANOVA followed automatically by post-hoc Tukey HSD, Scheffé, Bonferroni and
Holm methods, but without the patience and perseverance to hack code to
harness R, Stata, SPSS, SAS or Matlab? This is the right tool for you! It was
inspired by the frustration of several biomedical scientists with learning the
software setup and coding of these serious statistical packages, almost like
operating heavy bulldozer machinery to swat an irritating mosquito. For code
grandmasters, fully working code and setup instructions are provided for
replication of the results in the serious academic-research-grade open-source
(and hence free) R statistical package.
Formulae and Methodology:
The one-way ANOVA starting point of this calculator reproduces
the output of Microsoft Excel's built-in ANOVA feature. The follow-up post-hoc Tukey HSD multiple comparison
part of this calculator is based on the formulae and procedures at the NIST
Engineering Statistics Handbook page on Tukey's method. Tukey originated
his HSD test, constructed for pairs with an equal number of observations in each treatment, way back in 1949. When
the sample sizes are unequal, the calculator automatically applies the Tukey-Kramer method that Kramer
originated in 1956. A decent writeup of the relevant formulae appears in
the Tukey range test
Wiki entry. The NIST Handbook page mentions this modification but does
not provide the formula, while the Wiki entry adequately specifies it.
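As a rough illustration of the Tukey-Kramer modification, the sketch below computes the studentized range statistic \(q_{ij} = |\bar{y}_i - \bar{y}_j| / \sqrt{(MSE/2)(1/n_i + 1/n_j)}\) for every pair of treatments. It is plain Python written for this page, not the calculator's own code; the function name and the data are hypothetical. When all \(n_i\) are equal, the formula reduces to the original Tukey HSD statistic.

```python
from itertools import combinations
from statistics import mean


def tukey_kramer_q(groups):
    """Return {(i, j): q} for every pair of treatments (0-based indices)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [mean(g) for g in groups]
    # Mean square error (within-treatment mean square) from the one-way ANOVA
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    mse = ss_within / (n_total - k)
    q = {}
    for i, j in combinations(range(k), 2):
        # Tukey-Kramer standard error allows unequal n_i and n_j
        se = (mse / 2 * (1 / len(groups[i]) + 1 / len(groups[j]))) ** 0.5
        q[(i, j)] = abs(means[i] - means[j]) / se
    return q


# Hypothetical unbalanced data: three treatments with 4, 5 and 3 observations
data = [
    [4.1, 5.0, 5.6, 4.8],
    [6.2, 5.9, 7.1, 6.5, 6.8],
    [5.1, 5.5, 4.9],
]
for pair, q_stat in tukey_kramer_q(data).items():
    print(pair, round(q_stat, 4))
```

Each \(q\) is then compared against the studentized range distribution with \(k\) treatments and \(N-k\) degrees of freedom. That distribution is not in the Python standard library, so the sketch stops at the statistic; a package such as SciPy (`scipy.stats.studentized_range`) or R would be needed for the p-value.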
The Scheffé, Bonferroni and Holm methods of multiple comparison apply to contrasts, of which pairs are a subset. The NIST Engineering Statistics Handbook page defines contrasts. However, this calculator is hard-coded for contrasts that are pairs, and hence does not pester the user for additional input that defines generalized contrast structures. The post-hoc Scheffé multiple comparison of treatment pairs by this calculator is based on the formulae and procedures at the NIST Engineering Statistics Handbook page on Scheffé's method, which Scheffé published in 1953.
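For a pair of treatments, Scheffé's method can be sketched as follows. This is an illustrative plain-Python version, not the calculator's code; the function name and data are hypothetical, and the critical value \(F_{\alpha;\,k-1,\,N-k}\) is assumed to be supplied by the user from an F table or a statistics package.

```python
from statistics import mean


def scheffe_pair(groups, i, j, f_crit):
    """Scheffé comparison of treatments i and j (0-based indices).

    f_crit is the upper-alpha critical value of F with (k-1, N-k)
    degrees of freedom, looked up by the caller.
    Returns (difference of means, confidence half-width, Scheffé F statistic).
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [mean(g) for g in groups]
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    mse = ss_within / (n_total - k)
    diff = means[i] - means[j]
    # Standard error of the pairwise contrast
    se = (mse * (1 / len(groups[i]) + 1 / len(groups[j]))) ** 0.5
    half_width = ((k - 1) * f_crit) ** 0.5 * se
    f_s = (diff / se) ** 2 / (k - 1)
    return diff, half_width, f_s


# Hypothetical data: k = 3 treatments, N = 9 observations, so df = (2, 6).
# 5.14 is the tabulated 5% critical value of F with (2, 6) degrees of freedom.
data = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
diff, half_width, f_s = scheffe_pair(data, 0, 2, f_crit=5.14)
print(diff, half_width, f_s)
print("significant" if abs(diff) > half_width else "not significant")
```

The two decision rules are equivalent: the pair is declared different when the absolute difference exceeds the half-width, or equally when the Scheffé F statistic exceeds `f_crit`.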
The Bonferroni and Holm methods of multiple comparison depend on the number of relevant pairs being compared simultaneously. This calculator is hard-coded for Bonferroni and Holm simultaneous multiple comparison of (1) all pairs and (2) only a subset of pairs relative to one treatment, the first column, deemed to be the control. On the other hand, Scheffé's method is independent of the number of contrasts under consideration. The post-hoc Bonferroni simultaneous multiple comparison of treatment pairs by this calculator is based on the formulae and procedures at the NIST Engineering Statistics Handbook page on Bonferroni's method. The original paper by Bonferroni, published in Italian in 1936, is hard to find on the web.
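The dependence on the number of simultaneous comparisons can be made concrete with a small sketch. It assumes the unadjusted pairwise p-values are already available (for example from pairwise t-tests using the ANOVA error term); the function names and the p-value are hypothetical and this is not the calculator's own code.

```python
def n_comparisons(k, versus_control=False):
    """Number of pairs compared simultaneously among k treatments."""
    # All pairs: k(k-1)/2. Pairs relative to one control treatment: k-1.
    return k - 1 if versus_control else k * (k - 1) // 2


def bonferroni(p_value, m):
    """Bonferroni-adjusted p-value for one of m simultaneous comparisons."""
    return min(1.0, m * p_value)


k = 4
p_unadjusted = 0.012  # hypothetical unadjusted p-value for one pair

m_all = n_comparisons(k)                           # 6 pairs
m_control = n_comparisons(k, versus_control=True)  # 3 pairs

print(bonferroni(p_unadjusted, m_all))      # roughly 0.072, above 0.05
print(bonferroni(p_unadjusted, m_control))  # roughly 0.036, below 0.05
```

The same unadjusted p-value can therefore fail the 0.05 threshold when all pairs are compared, yet pass it when only comparisons against the control are of interest.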
A significant improvement over the Bonferroni method was proposed by Holm (1979). Among the many reviews of the merits of the Holm method and its uniform superiority over the Bonferroni method, that of Aickin and Gensler (1996) is notable. This paper is also the source of our algorithm to make comparisons according to the Holm method. Most major statistical packages today incorporate the Holm method.
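The Holm step-down adjustment is short enough to sketch in a few lines. The version below is the standard textbook form of the procedure in plain Python, offered as an illustration rather than as the calculator's exact implementation; the function name and the p-values are hypothetical.

```python
def holm(p_values):
    """Return Holm-adjusted p-values, in the same order as the input."""
    m = len(p_values)
    # Visit the p-values from smallest to largest
    order = sorted(range(m), key=lambda idx: p_values[idx])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m-1, and so on
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted p-values are never allowed to decrease along the ordering
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


p = [0.01, 0.04, 0.03]  # hypothetical unadjusted pairwise p-values
holm_p = holm(p)
bonferroni_p = [min(1.0, len(p) * x) for x in p]

print(holm_p)
print(bonferroni_p)
# Holm never gives a larger adjusted p-value than Bonferroni
print(all(h <= b for h, b in zip(holm_p, bonferroni_p)))
```

Only the smallest p-value receives the full Bonferroni multiplier \(m\); every later one is multiplied by a smaller factor, which is why the Holm-adjusted values can never exceed the Bonferroni-adjusted ones.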
Relative merits of Tukey, Scheffé, Bonferroni and Holm Methods:
There is wide agreement that each of the Tukey, Scheffé and Bonferroni methods has its merits. The recommendations on the relative merits and advantages of each of these methods in the NIST Engineering Statistics Handbook page on comparison of these methods are reproduced below:
"No one comparison method is uniformly best - each has its uses
- If all pairwise comparisons are of interest, Tukey has the edge. If only a subset of pairwise comparisons are required, Bonferroni may sometimes be better.
- When the number of contrasts to be estimated is small, (about as many as there are factors) Bonferroni is better than Scheffé. Actually, unless the number of desired contrasts is at least twice the number of factors, Scheffé will always show wider confidence bands than Bonferroni.
- Many computer packages include all three methods. So, study the output and select the method with the smallest confidence band.
- No single method of multiple comparisons is uniformly best among all the methods."
Uniform superiority of the Holm Method over the Bonferroni method:
The following excerpt from Aickin and Gensler (1996) makes it clear that the Holm method is uniformly superior to the Bonferroni method:
- "Public health researchers are sometimes required to make adjustments for multiple testing in reporting their results, which reduces the apparent significance of effects and thus reduces statistical power. The Bonferroni procedure is the most widely recommended way of doing this, but another procedure, that of Holm, is uniformly better....... As we have shown, Holm(ed) P values are easy to compute. Consequently, there does not appear to be any valid reason to continue using the Bonferroni procedure."
In addition to the wisdom of the NIST scientists as above, we have observed rare situations where one-way ANOVA produces a p-value above 0.05, producing human (though not computer) disappointment, but Bonferroni comparison of fewer contrasts (pairs) discerns a subset of contrasts (pairs) that are significantly different.
