This is example 5.1 on page 223 of Quantitative Ecotoxicology - reproduced with R.
Load and clean the data
Read the data into R:
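The data file itself is not reproduced here, so the following is only a sketch of the reading step. It parses a small inline table with read.table(); the wide layout (one row per concentration, one column per replicate), the column names and all values are placeholders and not the data from the book.

```r
# Placeholder survival proportions (NOT the book's data).
# The wide layout and the column names are assumptions for illustration.
raw <- "conc;rep1;rep2;rep3
0;1.000;0.975;1.000
1;0.975;0.950;1.000
2;0.950;0.925;0.975"

# Parse the semicolon-separated text into a data.frame
df <- read.table(text = raw, header = TRUE, sep = ";")
str(df)
```

With a real file, the text argument would be replaced by the path to that file.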
Next we do some housekeeping: convert the data to long format and the concentration to a factor.
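One way to do this housekeeping with base R is sketched below. It assumes a wide table with one column per replicate; the column names (rep1 to rep3), the name of the long data.frame (dfl) and the values are made up for illustration.

```r
# Placeholder data in wide format (assumed layout, made-up values)
df <- data.frame(conc = c(0, 1, 2),
                 rep1 = c(1.000, 0.975, 0.950),
                 rep2 = c(0.975, 0.950, 0.925),
                 rep3 = c(1.000, 1.000, 0.975))

# Wide -> long: one row per concentration x replicate
reps <- c("rep1", "rep2", "rep3")
dfl <- reshape(df, direction = "long",
               varying = reps, v.names = "y",
               timevar = "replicate", times = reps,
               idvar = "conc")
rownames(dfl) <- NULL

# Treat concentration as a grouping factor, not as a number
dfl$conc <- factor(dfl$conc)
head(dfl)
```

Converting conc to a factor matters later on: with a numeric predictor R would fit a regression line instead of comparing group means.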
Let’s have a first look at the data:
Next we apply the arcsine transformation:
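A minimal sketch of this step, assuming the survival proportions are stored in a column y. The values are made up and the replicate size n_animals = 40 is an assumption chosen for illustration; the adjustment for 100% survival is the one described in EPA (2002).

```r
# Made-up survival proportions; n_animals = 40 is an assumed replicate size
df <- data.frame(conc = factor(rep(c(0, 1, 2), each = 2)),
                 y = c(1, 0.975, 0.95, 0.9, 0.7, 0.65))
n_animals <- 40

# 100% survival gets the adjusted value, everything else asin(sqrt(p))
df$y_asin <- ifelse(df$y == 1,
                    asin(1) - asin(sqrt(1 / (4 * n_animals))),
                    asin(sqrt(df$y)))
df
```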
This adds the transformed values as the column y_asin to our data.frame.
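Survivals of 1 (100%) are transformed using $\arcsin(1) - \arcsin\left(\sqrt{\frac{1}{4n}}\right)$, where n = number of animals per replicate.
All other values are transformed using $\arcsin(\sqrt{p})$, where p is the proportion surviving.
If we had observations with 0% survival, these zero values should be transformed using $\arcsin\left(\sqrt{\frac{1}{4n}}\right)$, by adding an extra case to the transformation.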
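Such an extra case for 0% survival could look like the sketch below. The helper name asin_transform and its arguments are hypothetical, and n = 40 in the call is only an example value.

```r
# Hypothetical helper: p = survival proportion, n = animals per replicate
asin_transform <- function(p, n) {
  adj <- asin(sqrt(1 / (4 * n)))      # value used for 0% survival
  ifelse(p == 0, adj,
         ifelse(p == 1, asin(1) - adj,  # 100% survival
                asin(sqrt(p))))         # all other proportions
}

asin_transform(c(0, 0.5, 1), n = 40)
```

The values for 0% and 100% survival are symmetric around asin(sqrt(0.5)), so they always add up to asin(1).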
Let’s look at the transformed values:
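One simple way to do this is to put boxplots of the raw and the transformed values side by side. The sketch below uses base graphics and made-up values; the replicate size of 40 is again an assumption.

```r
# Made-up survival proportions (NOT the book's data)
df <- data.frame(
  conc = factor(rep(c(0, 1, 2, 4, 8, 16), each = 3)),
  y = c(1.000, 0.975, 1.000, 0.975, 0.950, 1.000, 0.950, 0.925, 0.975,
        0.850, 0.900, 0.825, 0.700, 0.750, 0.650, 0.500, 0.450, 0.550))
df$y_asin <- ifelse(df$y == 1,
                    asin(1) - asin(sqrt(1 / (4 * 40))),  # assumed n = 40
                    asin(sqrt(df$y)))

# Raw proportions on the left, transformed values on the right
op <- par(mfrow = c(1, 2))
boxplot(y ~ conc, data = df, xlab = "conc", ylab = "survival proportion")
boxplot(y_asin ~ conc, data = df, xlab = "conc", ylab = "y_asin")
par(op)
```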
Doesn’t look that different…
To fit an ANOVA to this data we use the aov() function.
summary() gives the ANOVA table:
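A minimal sketch of the fit and its table, using aov() from base R. The data are made up (six concentrations with three replicates each, assumed replicate size 40), so the numbers in the output will not match the book.

```r
# Made-up survival proportions (NOT the book's data)
df <- data.frame(
  conc = factor(rep(c(0, 1, 2, 4, 8, 16), each = 3)),
  y = c(1.000, 0.975, 1.000, 0.975, 0.950, 1.000, 0.950, 0.925, 0.975,
        0.850, 0.900, 0.825, 0.700, 0.750, 0.650, 0.500, 0.450, 0.550))
df$y_asin <- ifelse(df$y == 1,
                    asin(1) - asin(sqrt(1 / (4 * 40))),  # assumed n = 40
                    asin(sqrt(df$y)))

# One-way ANOVA of the transformed survival on concentration
mod <- aov(y_asin ~ conc, data = df)
summary(mod)
```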
The within-treatment variation is labelled Residuals and the between-treatment variation is named after the predictor, conc. The total sum of squares is simply the sum of the two Sum Sq entries and is not displayed.
R already performed an F-test for us, indicated by the F value (= ratio of the two Mean Sq) and Pr(>F) columns.
Now that we know there is a statistically significant treatment effect, we might be interested in which treatments differ from the control group.
The Tukey contrasts mentioned in the book (comparing each level with every other level) can easily be computed with the multcomp package:
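A sketch of such a call with glht() and mcp() from multcomp, which must be installed. The data are made up and the replicate size of 40 is an assumption, so the estimates are for illustration only.

```r
library(multcomp)

# Made-up survival proportions (NOT the book's data)
df <- data.frame(
  conc = factor(rep(c(0, 1, 2, 4, 8, 16), each = 3)),
  y = c(1.000, 0.975, 1.000, 0.975, 0.950, 1.000, 0.950, 0.925, 0.975,
        0.850, 0.900, 0.825, 0.700, 0.750, 0.650, 0.500, 0.450, 0.550))
df$y_asin <- ifelse(df$y == 1,
                    asin(1) - asin(sqrt(1 / (4 * 40))),  # assumed n = 40
                    asin(sqrt(df$y)))
mod <- aov(y_asin ~ conc, data = df)

# All pairwise comparisons between the levels of conc
tuk <- glht(mod, linfct = mcp(conc = "Tukey"))
summary(tuk)
```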
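However, this leads to 15 comparisons (and tests), and we may not be interested in all of them. Note that at $\alpha = 0.05$ each test of a true null hypothesis gives a false positive in 1 out of 20 cases, so the more tests we run without correcting for multiple testing, the more likely we are to be wrong at least once.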
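To get a feeling for the size of the problem, we can compute the chance of at least one false positive among m uncorrected tests. The sketch below assumes that the tests are independent, which is not true for pairwise comparisons that share groups, so the numbers are only a rough guide. The function name fwer is hypothetical.

```r
alpha <- 0.05

# Chance of at least one false positive among m independent tests
fwer <- function(m, alpha) 1 - (1 - alpha)^m

fwer(15, alpha)  # 15 pairwise comparisons: about 0.54
fwer(5, alpha)   # 5 comparisons: about 0.23
```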
An alternative is to compare only the control group with each of the treatments. These are called Dunnett contrasts and lead to only 5 comparisons.
The syntax is the same, just change Tukey to Dunnett:
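A sketch of the Dunnett version, again with glht() and mcp() from multcomp and with made-up data (assumed replicate size 40). The first factor level, here concentration 0, is used as the control.

```r
library(multcomp)

# Made-up survival proportions (NOT the book's data)
df <- data.frame(
  conc = factor(rep(c(0, 1, 2, 4, 8, 16), each = 3)),
  y = c(1.000, 0.975, 1.000, 0.975, 0.950, 1.000, 0.950, 0.925, 0.975,
        0.850, 0.900, 0.825, 0.700, 0.750, 0.650, 0.500, 0.450, 0.550))
df$y_asin <- ifelse(df$y == 1,
                    asin(1) - asin(sqrt(1 / (4 * 40))),  # assumed n = 40
                    asin(sqrt(df$y)))
mod <- aov(y_asin ~ conc, data = df)

# Each treatment against the control (first level of conc)
dun <- glht(mod, linfct = mcp(conc = "Dunnett"))
s <- summary(dun)
s
```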
The column Estimate gives us the difference in means between each treatment and the control, and Std. Error the standard error of these estimates. Both are combined into a t value (= Estimate / Std. Error), from which we get a p-value (Pr(>|t|)).
Note that the p-values are already corrected for multiple testing, as indicated at the bottom of the output. If you want to change the correction method you can use:
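A sketch of such a call, passing adjusted() from multcomp as the test argument of summary(). The data are made up and the replicate size of 40 is an assumption.

```r
library(multcomp)

# Made-up survival proportions (NOT the book's data)
df <- data.frame(
  conc = factor(rep(c(0, 1, 2, 4, 8, 16), each = 3)),
  y = c(1.000, 0.975, 1.000, 0.975, 0.950, 1.000, 0.950, 0.925, 0.975,
        0.850, 0.900, 0.825, 0.700, 0.750, 0.650, 0.500, 0.450, 0.550))
df$y_asin <- ifelse(df$y == 1,
                    asin(1) - asin(sqrt(1 / (4 * 40))),  # assumed n = 40
                    asin(sqrt(df$y)))
mod <- aov(y_asin ~ conc, data = df)
dun <- glht(mod, linfct = mcp(conc = "Dunnett"))

# Same contrasts, but with a different p-value adjustment
summary(dun, test = adjusted("bonferroni"))
```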
This applies the Bonferroni correction; see ?adjusted for other methods.
Warton & Hui (2011) demonstrated that the arcsine transform should not be used for either binomial or non-binomial proportion data. Similarly, O’Hara & Kotze (2010) showed that count data should not be log-transformed.
In a future post I will show how to analyse this data without transformation using Generalized Linear Models (GLM), and perhaps run some simulations showing that GLMs can lead to increased statistical power for ecotoxicological data sets.
Note that I couldn’t find any reference to Generalized Linear Models in Newman (2012) or EPA (2002), although they have been around for 40 years now (Nelder & Wedderburn, 1972).
Warton, D. I., & Hui, F. K. (2011). The arcsine is asinine: the analysis of proportions in ecology. Ecology, 92(1), 3-10.
O’Hara, R. B., & Kotze, D. J. (2010). Do not log‐transform count data. Methods in Ecology and Evolution, 1(2), 118-122.
Newman, M. C. (2012). Quantitative ecotoxicology. Taylor & Francis, Boca Raton, FL.
EPA (2002). Methods for Measuring the Acute Toxicity of Effluents and Receiving Waters to Freshwater and Marine Organisms.
Nelder, J. A., & Wedderburn, R. W. M. (1972). Generalized linear models. Journal of the Royal Statistical Society, Series A (General), 135, 370-384.