# Post Hoc Tests – Tukey HSD

Tukey’s Honest Significant Difference (HSD) test is a post hoc test commonly used to assess the significance of differences between pairs of group means. It applies, for example, after a one-way ANOVA or a factorial design ANOVA.

As usual, the null hypothesis states that the means of the tested groups are equal; here it is tested separately for each pair of groups. The assumptions are:

• independence of the observations,
• normality of distribution,
• equality of variance.

NB: these are the same assumptions as for one-way ANOVA or multi-way ANOVA; in other words, if you have been allowed (or if you allowed yourself) to conduct an ANOVA test, then it is OK to run Tukey’s test.

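The function is `TukeyHSD()` and the syntax is `TukeyHSD(x, which, ordered, conf.level)` where `x` is the object that stores the results of the function `aov()` that you have previously run when performing the ANOVA, `which` optionally names the factor(s) whose levels should be compared (all of them by default), `ordered` indicates whether the levels should be sorted by increasing mean before the differences are computed (`FALSE` by default) and `conf.level` specifies the confidence level (0.95 by default).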

Let’s take the example from one-way ANOVA. The code for the dataframe and the analysis (with ANOVA results) is:

```
size <- c(25,22,28,24,26,24,22,21,23,25,26,30,25,24,21,27,28,23,25,24,20,22,24,23,22,24,20,19,21,22)
location <- as.factor(c(rep("ForestA",10), rep("ForestB",10), rep("ForestC",10)))
my.dataframe <- data.frame(size,location)
results <- aov(size~location, data=my.dataframe)
summary(results)
```
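Since Tukey’s test rests on the same assumptions as the ANOVA, it can be worth checking them on the fitted model before going further. The sketch below is one possible way to do so, not the only one: it assumes that a Shapiro-Wilk test on the residuals and a Bartlett test are acceptable checks for normality and equality of variance. The object names `normality.check` and `variance.check` are made up for this illustration.

```r
# Same data and model as in the example above
size <- c(25,22,28,24,26,24,22,21,23,25,26,30,25,24,21,27,28,23,25,24,20,22,24,23,22,24,20,19,21,22)
location <- as.factor(c(rep("ForestA",10), rep("ForestB",10), rep("ForestC",10)))
my.dataframe <- data.frame(size, location)
results <- aov(size ~ location, data = my.dataframe)

# Normality: Shapiro-Wilk test on the model residuals (assumed choice of test)
normality.check <- shapiro.test(residuals(results))

# Equality of variance: Bartlett test across the three locations (assumed choice of test)
variance.check <- bartlett.test(size ~ location, data = my.dataframe)

# Display the two p-values
normality.check$p.value
variance.check$p.value
```

A p-value above the chosen threshold (for example 0.05) in both tests gives no evidence against the corresponding assumption. Independence of the observations cannot be checked this way; it depends on how the data were collected.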

Now we just have to pass the object `results` returned by the function `aov()` to `TukeyHSD()`:

```
results.post.hoc <- TukeyHSD(results, conf.level=0.95)
results.post.hoc
```

The output displays the results of all pairwise comparisons among the tested groups (here three groups, `ForestA`, `ForestB` and `ForestC`, thus 3 comparisons): you’ll find the actual difference between the means under `diff`, the bounds of its confidence interval under `lwr` and `upr`, and the adjusted p-value (`p adj`) for each pairwise comparison. As denoted by the yellow frame, the only significant difference to be reported in this test is between the means of the groups `ForestB` and `ForestC`.
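To see where an adjusted p-value comes from, the value for one comparison can be recomputed by hand. The sketch below assumes a balanced design (10 observations per group, as here) and relies on the studentized range distribution through `ptukey()`. Names such as `tukey.table`, `q.stat` and `p.manual` are introduced only for this illustration.

```r
# Same data and model as in the example above
size <- c(25,22,28,24,26,24,22,21,23,25,26,30,25,24,21,27,28,23,25,24,20,22,24,23,22,24,20,19,21,22)
location <- as.factor(c(rep("ForestA",10), rep("ForestB",10), rep("ForestC",10)))
my.dataframe <- data.frame(size, location)
results <- aov(size ~ location, data = my.dataframe)

# The comparisons are stored as a matrix under the name of the factor
tukey.table <- TukeyHSD(results, conf.level = 0.95)$location

# Ingredients of the test: group means, residual mean square, group size
group.means <- tapply(my.dataframe$size, my.dataframe$location, mean)
mse <- deviance(results) / df.residual(results)
n <- 10  # observations per group (balanced design assumed)

# Studentized range statistic for ForestC vs ForestB
diff.CB <- group.means["ForestC"] - group.means["ForestB"]
q.stat <- abs(diff.CB) / sqrt(mse / n)

# Adjusted p-value: upper tail of the studentized range distribution
p.manual <- ptukey(q.stat, nmeans = 3, df = df.residual(results), lower.tail = FALSE)

# Compare with the value reported by TukeyHSD()
unname(p.manual)
tukey.table["ForestC-ForestB", "p adj"]
```

The two printed values should agree, which shows that `p adj` already accounts for the number of groups being compared (`nmeans = 3`) and needs no further correction.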

Note that there exists a graphical output of the Tukey HSD test and the function `plot(TukeyHSD())` displays it:

```
par(mar=c(8,8,8,8))
plot(TukeyHSD(results), las=1)
```

Shown above are the differences between the group means and their confidence intervals. Thus, you may easily visualize the differences between groups and gauge the level of significance: a comparison is significant when its interval does not cross the dashed vertical line at zero. For example, ForestC-ForestA has a p-value greater than, but close to, 0.05 (`p adj`=0.0619), which matches its right whisker ending very close to the dashed line.
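The link between the confidence intervals and the adjusted p-values can also be checked numerically. The following sketch flags the comparisons whose interval excludes zero; the column name `excludes.zero` is made up for this illustration.

```r
# Same data and model as in the example above
size <- c(25,22,28,24,26,24,22,21,23,25,26,30,25,24,21,27,28,23,25,24,20,22,24,23,22,24,20,19,21,22)
location <- as.factor(c(rep("ForestA",10), rep("ForestB",10), rep("ForestC",10)))
my.dataframe <- data.frame(size, location)
results <- aov(size ~ location, data = my.dataframe)

# Matrix with the columns diff, lwr, upr and p adj
tukey.table <- TukeyHSD(results, conf.level = 0.95)$location

# An interval lying entirely above or entirely below zero
# corresponds to an adjusted p-value below 1 - conf.level
excludes.zero <- tukey.table[, "lwr"] > 0 | tukey.table[, "upr"] < 0

# Show the table with the extra flag (check.names = FALSE keeps "p adj" as is)
data.frame(tukey.table, excludes.zero, check.names = FALSE)
```

With these data, only the ForestC-ForestB row should be flagged, in line with the single significant difference reported above.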

Alternative function from the package `multcomp`:

The package `multcomp` offers various functions and functionalities which are sometimes more practical or better adapted to our problems. Among them is the function `glht()` which can be used to perform Tukey’s test. Note that it is applied to a fitted model object, here the output of the function `lm()`, and that the argument of `mcp()` must be the name of the factor whose levels are compared. Reusing our previous example, here is the code to run the test:

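```
library(multcomp)
lm.results <- lm(size~location, data=my.dataframe)
# mcp() takes the name of the factor in the model (location), not the response (size)
Tukey.HSD.results <- glht(lm.results, linfct=mcp(location='Tukey'))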
summary(Tukey.HSD.results)
```

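And here is the output with all three comparisons, p-values and corresponding significance stars. Note that the p-values are very close to those printed by `TukeyHSD()`, but they are not strictly identical. This is expected, since `glht()` derives its adjusted p-values numerically from a multivariate t distribution rather than from the studentized range distribution used by `TukeyHSD()`.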