Eduard Szöcs

Data in Environmental Science and Eco(toxico-)logy

Quantitative Ecotoxicology, Page 223, Example 5.1, Analysis of Variance

This is example 5.1 on page 223 of Quantitative Ecotoxicology - reproduced with R.

Next we do some house-keeping: convert the data to long format and the concentration to factor.

Let’s have first look at the data:

Transform response

Next we apply the arcsine transformation:

This adds the transformed values as column y_asin to our data.frame. Survivals of 1 (100%) are transformed

, where n = number of animals per replicate.

All other values are transformed using

If we would have had observations with 0% survival, these zero values should be transformed using

by adding an extra ifelse().

Let’s look at the transformed values:

Doesn’t look that different…

ANOVA

To fit a ANOVA to his data we use the aov() function:

And summary() gives the anova table:

The within-treatment variance is termed Residuals and the between-treatment variance is named according to the predictor conc. The total variance is simply the sum of those and not displayed.

R already performed an F-test for us, indicated by the F value (=ratio of the Mean Sq) and Pr (>F) columns.

Multiple Comparisons

Now we know that there is a statistically significant treatment effect, we might be interested which treatments differ from the control group.

The in the book mentioned Tukey contrasts (comparing each level with each other) can be easily done with the multcomp package:

However, this leads to 15 comparisons (and tests) and we may not be interested in all. Note that we are wrong in 1 out of 20 tests ($\alpha = 0.05$) (if we do not apply a correction for multiple testing).

An alternative would be just to compare the control group to the treatments. This is called Dunnett contrasts and leads to only 5 comparisons.

The syntax is the same, just change Tukey to Dunnett:

The column Estimate gives use the difference in means between the control and the respective treatments and Std. Error the standard error from these estimates. Both are combined to a t value (=Estimate / Std. Error), from which we can get a p-value (P(>t)).

Note, that the p-values are already corrected for multiple testing, as indicated at the bottom of the output. If you want to change the correction method you can use:

This applies Bonferroni-correction, see ?p.adjust and ?adjusted for other methods.

Outlook

Warton & Hui (2011) demonstrated that the arcsine transform should not be used in either circumstance. Similarly as O’Hara & Kotze (2010) showed that count data should not be log-transformed.

I a future post I will show how to analyse this data without transformation using Generalized Linear Models (GLM) and perhaps some simulations showing that using GLM can lead to an increased statistical power for ecotoxicological data sets.

Note, that I couldn’t find any reference to Generalized Linear Models in Newman (2012) and EPA (2002), although they have been around for 30 years now (Nelder & Wedderburn, 1972).

References

• Warton, D. I., & Hui, F. K. (2011). The arcsine is asinine: the analysis of proportions in ecology. Ecology, 92(1), 3-10.

• O’Hara, R. B., & Kotze, D. J. (2010). Do not log‐transform count data. Methods in Ecology and Evolution, 1(2), 118-122.

• Newman, M. C. (2012). Quantitative ecotoxicology. Taylor & Francis, Boca Raton, FL.

• EPA (2002). Methods for Measuring the Acute Toxicity of Effluents and Receiving Waters to Freshwater and Marine Organisms.

• Nelder J.A., & Wedderburn R.W.M. (1972). Generalized Linear Models. Journal of the Royal Statistical Society Series A (General) 135:370–384.

Written on September 5, 2015