Fork me on GitHub

ed=r

R in Ecotoxicology and Environmental Sciences

Quantitative Ecotoxicology, Page 178, Example 4.9, Time-to-death

This is example 4.9 on page 178 of Quantitative Ecotoxicology - time-to-death data.

Thankfully, Prof. Newman provided me the data for this example. You can get it from the github-repo (TOXICTY.csv).

1
2
3
4
5
require(RCurl)
url <- getURL("https://raw.github.com/EDiLD/r-ed/master/quantitative_ecotoxicology/data/TOXICITY.csv",
    ssl.verifypeer = FALSE)
TOXICITY <- read.table(text = url, header = TRUE)
head(TOXICITY)
1
2
3
4
5
6
7
##   TTD TANK  PPT WETWT STDLGTH
## 1   8    1 15.8 0.112     1.9
## 2   8    1 15.8 0.050     1.5
## 3   8    1 15.8 0.029     1.2
## 4   8    1 15.8 0.045     1.4
## 5   8    2 15.8 0.097     1.8
## 6   8    2 15.8 0.048     1.4
1
summary(TOXICITY)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
##       TTD            TANK           PPT           WETWT      
##  Min.   : 8.0   Min.   : 1.0   Min.   : 0.0   Min.   :0.024  
##  1st Qu.:24.0   1st Qu.: 4.0   1st Qu.:10.3   1st Qu.:0.068  
##  Median :72.0   Median : 7.0   Median :11.6   Median :0.092  
##  Mean   :63.3   Mean   : 7.5   Mean   :11.7   Mean   :0.135  
##  3rd Qu.:97.0   3rd Qu.:11.0   3rd Qu.:15.8   3rd Qu.:0.129  
##  Max.   :97.0   Max.   :14.0   Max.   :20.1   Max.   :1.489  
##                                               NA's   :70     
##     STDLGTH    
##  Min.   :1.20  
##  1st Qu.:1.60  
##  Median :1.70  
##  Mean   :1.87  
##  3rd Qu.:2.00  
##  Max.   :4.90  
##  NA's   :70

The data consists of 5 columns:

  • TTD : Time to death
  • TANK : Tank
  • PPT : NaCl Concentration
  • WETWT : wet weight
  • STDLGTH : Standard length

Columns 4 and 5 have 70 NA’s (no data available due to measurement error), but we won’t use these in this example. The observations with TTD = 97 are ‘survivors’, since the experiment run only 96 hours.

First we need to create a column FLAG for the status of the animal (dead/alive):

1
TOXICITY$FLAG <- ifelse(TOXICITY$TTD > 96, 1, 2)

So 1 denotes alive and 2 dead.

Then we can plot the data. Each line is a tank and colors denote the NaCl concentrations.

1
2
3
4
require(survival)
mod <- survfit(Surv(TTD, FLAG) ~ PPT + strata(TANK), data = TOXICITY)
plot(mod, col = rep(1:7, each = 2), mark.time = FALSE, xlab = "Hours", ylab = "% Survival")
legend("bottomleft", legend = sort(unique(TOXICITY$PPT)), col = 1:7, lty = 1)

plot of chunk plot_surv

We see a clear relationship between concentration and the survival curves. In this example we are interested in differences between the duplicates. We see that the two curves for the 11.6 g/L concentration are quite similar, while there is more divergence between tanks in the 13.2 g/L treatment.

We can test for differences using the survdiff function. With the rho argument we can specify the type of test: rho = 0 is a log-rank test and rho = 1 is equivalent to the Peto & Peto modification of the Gehan-Wilcoxon test.

First the log-rank test for each concentration:

1
survdiff(Surv(TTD, FLAG) ~ TANK, data = TOXICITY[TOXICITY$PPT == 10.3, ], rho = 0)
1
2
3
4
5
6
7
8
9
## Call:
## survdiff(formula = Surv(TTD, FLAG) ~ TANK, data = TOXICITY[TOXICITY$PPT == 
##     10.3, ], rho = 0)
## 
##          N Observed Expected (O-E)^2/E (O-E)^2/V
## TANK=9  40        6     9.02      1.01      2.22
## TANK=10 38       11     7.98      1.14      2.22
## 
##  Chisq= 2.2  on 1 degrees of freedom, p= 0.136
1
survdiff(Surv(TTD, FLAG) ~ TANK, data = TOXICITY[TOXICITY$PPT == 10.8, ], rho = 0)
1
2
3
4
5
6
7
8
9
## Call:
## survdiff(formula = Surv(TTD, FLAG) ~ TANK, data = TOXICITY[TOXICITY$PPT == 
##     10.8, ], rho = 0)
## 
##         N Observed Expected (O-E)^2/E (O-E)^2/V
## TANK=7 41       10     11.6     0.226     0.503
## TANK=8 39       12     10.4     0.253     0.503
## 
##  Chisq= 0.5  on 1 degrees of freedom, p= 0.478
1
survdiff(Surv(TTD, FLAG) ~ TANK, data = TOXICITY[TOXICITY$PPT == 11.6, ], rho = 0)
1
2
3
4
5
6
7
8
9
## Call:
## survdiff(formula = Surv(TTD, FLAG) ~ TANK, data = TOXICITY[TOXICITY$PPT == 
##     11.6, ], rho = 0)
## 
##         N Observed Expected (O-E)^2/E (O-E)^2/V
## TANK=5 37       20     18.9    0.0623     0.129
## TANK=6 41       20     21.1    0.0559     0.129
## 
##  Chisq= 0.1  on 1 degrees of freedom, p= 0.719
1
survdiff(Surv(TTD, FLAG) ~ TANK, data = TOXICITY[TOXICITY$PPT == 13.2, ], rho = 0)
1
2
3
4
5
6
7
8
9
## Call:
## survdiff(formula = Surv(TTD, FLAG) ~ TANK, data = TOXICITY[TOXICITY$PPT == 
##     13.2, ], rho = 0)
## 
##         N Observed Expected (O-E)^2/E (O-E)^2/V
## TANK=3 38       34     40.6      1.06       3.1
## TANK=4 40       37     30.4      1.41       3.1
## 
##  Chisq= 3.1  on 1 degrees of freedom, p= 0.0781
1
survdiff(Surv(TTD, FLAG) ~ TANK, data = TOXICITY[TOXICITY$PPT == 15.8, ], rho = 0)
1
2
3
4
5
6
7
8
9
## Call:
## survdiff(formula = Surv(TTD, FLAG) ~ TANK, data = TOXICITY[TOXICITY$PPT == 
##     15.8, ], rho = 0)
## 
##         N Observed Expected (O-E)^2/E (O-E)^2/V
## TANK=1 39       39     44.5     0.674      3.09
## TANK=2 40       40     34.5     0.868      3.09
## 
##  Chisq= 3.1  on 1 degrees of freedom, p= 0.0789

We could also run this in a for loop (here the Wilcoxon test):

1
2
3
4
5
for (i in sort(unique(TOXICITY$PPT)[-c(2, 7)])) {
    cat("\n", i, "\n")
    print(survdiff(Surv(TTD, FLAG) ~ TANK, data = TOXICITY[TOXICITY$PPT == i,
        ], rho = 1))
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
## 
##  10.3 
## Call:
## survdiff(formula = Surv(TTD, FLAG) ~ TANK, data = TOXICITY[TOXICITY$PPT == 
##     i, ], rho = 1)
## 
##          N Observed Expected (O-E)^2/E (O-E)^2/V
## TANK=9  40     5.41     8.21     0.952      2.28
## TANK=10 38    10.09     7.29     1.071      2.28
## 
##  Chisq= 2.3  on 1 degrees of freedom, p= 0.131 
## 
##  10.8 
## Call:
## survdiff(formula = Surv(TTD, FLAG) ~ TANK, data = TOXICITY[TOXICITY$PPT == 
##     i, ], rho = 1)
## 
##         N Observed Expected (O-E)^2/E (O-E)^2/V
## TANK=7 41     8.75    10.33     0.240     0.596
## TANK=8 39    10.81     9.24     0.269     0.596
## 
##  Chisq= 0.6  on 1 degrees of freedom, p= 0.44 
## 
##  11.6 
## Call:
## survdiff(formula = Surv(TTD, FLAG) ~ TANK, data = TOXICITY[TOXICITY$PPT == 
##     i, ], rho = 1)
## 
##         N Observed Expected (O-E)^2/E (O-E)^2/V
## TANK=5 37     15.4     14.8    0.0205    0.0524
## TANK=6 41     15.9     16.5    0.0184    0.0524
## 
##  Chisq= 0.1  on 1 degrees of freedom, p= 0.819 
## 
##  13.2 
## Call:
## survdiff(formula = Surv(TTD, FLAG) ~ TANK, data = TOXICITY[TOXICITY$PPT == 
##     i, ], rho = 1)
## 
##         N Observed Expected (O-E)^2/E (O-E)^2/V
## TANK=3 38     17.8     23.5      1.37      5.02
## TANK=4 40     24.9     19.2      1.68      5.02
## 
##  Chisq= 5  on 1 degrees of freedom, p= 0.0251 
## 
##  15.8 
## Call:
## survdiff(formula = Surv(TTD, FLAG) ~ TANK, data = TOXICITY[TOXICITY$PPT == 
##     i, ], rho = 1)
## 
##         N Observed Expected (O-E)^2/E (O-E)^2/V
## TANK=1 39     22.6     27.0     0.707      3.15
## TANK=2 40     27.8     23.4     0.814      3.15
## 
##  Chisq= 3.1  on 1 degrees of freedom, p= 0.076

Basically we get the same results as in the book:

None of the log-rank tests is statistically significant (at the 0.05 level). The wilcoxon test for the 13.2 g/L treatment shows a p < 0.05. This is also in agreement with the plot.

The $\chi^2$ values differ slightly but share the same trend - I suspect this is due to different data used.

With this dataset we can do much more. We already saw that there might be a relationship between survival time and concentration, but more on this later (example 4.10).

Code and data are available on my github-repo under file name ‘p176’.

Quantitative Ecotoxicology, Page 162, Example 4.7, Duplicate Treatments

This is a short one (example 4.7 on page 1621 of Quantitative Ecotoxicology).

First we create the data as matrix:

1
2
3
TEST <- matrix(c(1, 19, 6, 14), byrow = TRUE, ncol = 2, dimnames = list(c("Tank_A",
    "Tank_B"), c("Number_Dead", "Number_Surviving")))
TEST
1
2
3
##        Number_Dead Number_Surviving
## Tank_A           1               19
## Tank_B           6               14

The we can easily run fisher’s Exact test for this 2x2 table using the function fisher.test():

1
fisher.test(TEST)
1
2
3
4
5
6
7
8
9
10
11
## 
## 	Fisher's Exact Test for Count Data
## 
## data:  TEST 
## p-value = 0.09148
## alternative hypothesis: true odds ratio is not equal to 1 
## 95 percent confidence interval:
##  0.0025451 1.2461419 
## sample estimates:
## odds ratio 
##    0.12883
1
fisher.test(TEST, alternative = "greater")
1
2
3
4
5
6
7
8
9
10
11
## 
## 	Fisher's Exact Test for Count Data
## 
## data:  TEST 
## p-value = 0.9958
## alternative hypothesis: true odds ratio is greater than 1 
## 95 percent confidence interval:
##  0.0051465       Inf 
## sample estimates:
## odds ratio 
##    0.12883
1
fisher.test(TEST, alternative = "less")
1
2
3
4
5
6
7
8
9
10
11
## 
## 	Fisher's Exact Test for Count Data
## 
## data:  TEST 
## p-value = 0.04574
## alternative hypothesis: true odds ratio is less than 1 
## 95 percent confidence interval:
##  0.00000 0.96589 
## sample estimates:
## odds ratio 
##    0.12883

The results are identical to the one in the book.

Quantitative Ecotoxicology, Page 159, Example 4.6, Control Mortality

This is example 4.6 on page 159 of Quantitative Ecotoxicology. It is about how to deal with control mortalities.

First we need the data:

1
2
3
4
require(RCurl)
url <- getURL("https://raw.github.com/EDiLD/r-ed/master/quantitative_ecotoxicology/data/p160.csv",
    ssl.verifypeer = FALSE)
NAP <- read.table(text = url, header = TRUE, sep = ";")
1
head(NAP)
1
2
3
4
5
6
7
##   CONC DEAD TOTAL
## 1    0    1    26
## 2    0    0    26
## 3    0    0    26
## 4    0    0    26
## 5    0    1    26
## 6    0    0    26

The data consists of number of dead animals (DEAD) from all animals (TOTAL) exposed to different concentrations (CONC). First we create a new column with the proportion of dead animals:

1
NAP$PROP <- NAP$DEAD/NAP$TOTAL

Here is a plot of the data. Note the use of expression() (greek letters in the axis labels).

1
2
plot(NAP$CONC, NAP$PROP, pch = 16, xlab = expression(paste("Concentration (",
    mu, "g/L)")), ylab = "Proportion Dead", main = "Raw data")

plot of chunk plot_raw

Control mortality

We can estimate the mean control mortality and the confidence interval for the mean using the t.test function:

1
2
contr_m <- t.test(NAP$PROP[NAP$CONC == 0])
contr_m
1
2
3
4
5
6
7
8
9
10
11
## 
## 	One Sample t-test
## 
## data:  NAP$PROP[NAP$CONC == 0] 
## t = 1.5811, df = 5, p-value = 0.1747
## alternative hypothesis: true mean is not equal to 0 
## 95 percent confidence interval:
##  -0.0080228  0.0336638 
## sample estimates:
## mean of x 
##  0.012821

These can be also easily extracted from the t.test object:

1
2
## extract the values from t.test-object mean
contr_m$estimate
1
2
## mean of x 
##  0.012821
1
2
# CI
contr_m$conf.int
1
2
3
## [1] -0.0080228  0.0336638
## attr(,"conf.level")
## [1] 0.95

This gives nearly the same values as in the book. I don’t know what the SAS option OPTC is doing or computing, however it seems like it is the mean +- CI for the control group.

Abbott’s formula

We could adjust for control mortality using Abbott’s formula:

with $p_c$ = the corrected, p = original and $p_0$ = control mortality.

The mean control mortality can be calculated as:

1
2
d_control <- mean(NAP$PROP[NAP$CONC == 0])
d_control
1
## [1] 0.012821

And the corrected mortalities using Abbotts formula as:

1
2
NAP$PROP_c <- (NAP$PROP - d_control)/(1 - d_control)
NAP$PROP_c
1
2
3
4
5
##  [1]  0.025974 -0.012987 -0.012987 -0.012987  0.025974 -0.012987  0.064935
##  [8]  0.181818 -0.012987  0.311169  0.259740  0.181818  0.512266  0.454545
## [15]  0.610390  0.649351  0.688312  0.766234  0.774892  0.805195  0.805195
## [22]  1.000000  1.000000  0.922078  1.000000  0.961039  1.000000  1.000000
## [29]  1.000000  1.000000

Dose-Response-Models

Ignoring control mortality

As in the previous example we can fit a dose-response-model to this data using the drc package:

1
2
require(drc)
mod1 <- drm(PROP ~ CONC, data = NAP, fct = LL.2())

Comparing with other model this models performs quite good.

1
mselect(mod1, fctList = list(LL.3(), LL.4(), LL.5(), W1.2(), W1.3(), W1.4()))
1
2
3
4
5
6
7
8
##      logLik      IC Lack of fit   Res var
## LL.2 47.803 -89.607    0.015649 0.0025908
## W1.2 47.495 -88.990    0.014927 0.0026447
## LL.3 47.977 -87.954    0.014617 0.0026559
## W1.3 47.632 -87.264    0.013863 0.0027177
## W1.4 48.454 -86.908    0.014244 0.0026717
## LL.4 48.075 -86.150    0.013437 0.0027400
## LL.5 48.933 -85.866    0.013822 0.0026912
1
2
plot(mod1, broken = TRUE, type = "all", bp = 500, xt = seq(500, 3000, 500))
mtext("Dose-Response-Model - LL2.2", 3)

plot of chunk plot_mod1

Using the corrected mortalities

We can also fit a model to the corrected mortalities PROP_c.

Abbotts correction resulted to some negative mortalities, therefore I set the control and all negative mortalities to zero:

1
NAP$PROP_c[NAP$PROP_c < 0 | NAP$CONC == 0] <- 0

Then we fit a dose-response model:

1
mod2 <- drm(PROP_c ~ CONC, data = NAP, fct = LL.2())

However a Weibull model fits slightly better the data, so I change to a two-parameter Weibull model (using the update function).

1
mselect(mod2, fctList = list(LL.3(), LL.4(), LL.5(), W1.2(), W1.3(), W1.4()))
1
2
3
4
5
6
7
8
##      logLik      IC Lack of fit   Res var
## W1.2 48.574 -91.148  4.6619e-69 0.0024611
## LL.2 48.451 -90.902  4.5674e-69 0.0024813
## W1.3 48.680 -89.361  4.2744e-69 0.0025342
## LL.3 48.667 -89.333  4.2646e-69 0.0025365
## LL.4 48.690 -87.381  3.8391e-69 0.0026299
## W1.4 48.638 -87.277  3.8060e-69 0.0026390
## LL.5 49.423 -86.846  3.8700e-69 0.0026047
1
2
plot(mod2, broken = TRUE, type = "all", bp = 500, xt = seq(500, 3000, 500))
mtext("Corrected mortalities - W1.2", 3)

plot of chunk plot_mod2

A model without fixed lower limit

The two-parameter log-logistic model from above (mod1) performs quite good. However its lower limit is fixed to 0 and the upper limit to 1. Since we have a small amount of control mortality we could check if a model with varying lower limit (will be estimated) makes sense.

Let’s fit a three parameter log-logistic function, where the lower limit is an additional parameter:

1
2
3
mod3 <- drm(PROP ~ CONC, data = NAP, fct = LL.3u())
plot(mod3, broken = TRUE, type = "all", bp = 500, xt = seq(500, 3000, 500))
mtext("Free (estimated) lower limit - LL3.u", 3)

plot of chunk plot_mod3

However looking at the summary we see that the lower limit is estimated as $0.007 \pm 0.02$ and is statistically not significant.

1
summary(mod3)
1
2
3
4
5
6
7
8
9
10
11
12
13
## 
## Model fitted: Log-logistic (ED50 as parameter) with upper limit at 1 (3 parms)
## 
## Parameter estimates:
## 
##                 Estimate Std. Error    t-value p-value
## b:(Intercept)  -13.71026    1.11140  -12.33603    0.00
## c:(Intercept)    0.00686    0.01958    0.35045    0.73
## e:(Intercept) 2099.76948   13.38536  156.87056    0.00
## 
## Residual standard error:
## 
##  0.051716 (27 degrees of freedom)

Since the lower limit (=control mortality) is so low we could also stick with mod1 since it brings no improvement.

1
mselect(mod3, fctList = list(LL.2(), LL2.3u()))
1
2
3
4
##        logLik      IC Lack of fit   Res var
## LL.2   47.803 -89.607    0.015649 0.0025908
## LL.3u  47.872 -87.744    0.014383 0.0026746
## LL2.3u 47.794 -87.587    0.014212 0.0026885

All three considered models give nearly the same $LC_{50}$ around 2100:

1
ED(mod1, 50, interval = "delta")
1
2
3
4
5
6
## 
## Estimated effective doses
## (Delta method-based confidence interval(s))
## 
##      Estimate Std. Error  Lower Upper
## 1:50   2097.1       10.9 2074.9  2119
1
ED(mod2, 50, interval = "delta")
1
2
3
4
5
6
## 
## Estimated effective doses
## (Delta method-based confidence interval(s))
## 
##      Estimate Std. Error  Lower Upper
## 1:50   2088.2       10.2 2067.2  2109
1
ED(mod3, 50, interval = "delta")
1
2
3
4
5
6
## 
## Estimated effective doses
## (Delta method-based confidence interval(s))
## 
##      Estimate Std. Error  Lower Upper
## 1:50   2099.8       13.4 2072.3  2127

Code and data is also available on my github-repo under file name ‘p159’.

R, Knitr and Octopress

Recently I moved this blog from tumblr to github since I feeled I need more liberty. I have no experience with web programming, so I chose Octopress as blogging framework since it nicely integrates with github.

However after relocation my workflow for generating posts was quite laborious - it was similar to the tumblr-workflow:

  • Create an .Rmd file
  • knit the file
  • move the file to octopress/source/_posts/
  • move the figures to octopress/source/figure/
  • and so on…

This was non-satisfying and I thought that this must be automated into a smooth workflow!

Last week I took a look at the new Rcpp Gallery and noticed that this is also hosted on github. I skimmed through their code and found a lot of useful stuff. Shamelessly I copied some files from the and modified them for my octopress needs.

Basically I have now a folder for my .Rmd files (octopress/source/src). Invoking make in this folder creates the markdown-files in _posts and sets up the right figure paths (in _posts/figure). My workflow now:

  • rake new_rmd[test], this sets up an empty .Rmd file in /src, according to the octopress naming-scheme and header.
  • add my post to this .Rmd file
  • invoke make in /src, this sets up the .markdown file in _posts and figures
  • the usual rake gen_deploy

I am quite happy at the moment with this workflow. I can also use R-Studio.

Here are the changes I made to my octopress:

1) I added a new function rake new_rmd to my Rakefile. This mimics the behavior of rake new_post and creates a .Rmd-file in source/src. Simply add these lines to your Rakefile:

Rakefile
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
# usage rake new_rmd[my-new-rmd] or rake new_post['my new rmd'] or rake new_rmd (defaults to "new-rmd")
desc "Begin a new post in #{source_dir}/#{src_dir}"
task :new_rmd, :title do |t, args|
  raise "### You haven't set anything up yet. First run `rake install` to set up an Octopress theme." unless File.directory?(source_dir)
  mkdir_p "#{source_dir}/#{src_dir}"
  args.with_defaults(:title => 'new-rmd')
  title = args.title
  filename = "#{source_dir}/#{src_dir}/#{Time.now.strftime('%Y-%m-%d')}-#{title.to_url}.#{new_src_ext}"
  if File.exist?(filename)
    abort("rake aborted!") if ask("#{filename} already exists. Do you want to overwrite?", ['y', 'n']) == 'n'
  end
  puts "Creating new post: #{filename}"
  open(filename, 'w') do |post|
    post.puts "---"
    post.puts "layout: post"
    post.puts "title: \"#{title.gsub(/&/,'&amp;')}\""
    post.puts "date: #{Time.now.strftime('%Y-%m-%d %H:%M')}"
    post.puts "comments: true"
    post.puts "categories: "
    post.puts "published: false"
    post.puts "---"
  end
end

2) I modified and cleaned knit.sh from the Rcpp Gallery for my needs. This runs knitr on the .Rmd files and saves the output to _posts. Put this file into source/_scripts!

knit.sh
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
#! /usr/bin/Rscript

knit <- function (inputFile, outputFile) {

  # per-document figure paths
  stem <- tools::file_path_sans_ext(inputFile)
  prefix <- paste(stem, "-", sep="")
  knitr::opts_chunk$set(fig.path=file.path('../figure', prefix))

  # configure output options
  knitr::pat_md()
  knitr::opts_knit$set(out.format = 'markdown')
  renderOcto()

  # do the knit
  knitr::knit(input = inputFile, output = outputFile)
}

# adaption of knitr::render_jekyll
renderOcto <- function(extra = '') {
  knitr::render_markdown(TRUE)
  # code
  hook.c = function(x, options) {
	  prefix <- sprintf("\n\n```r", options$label)
	  suffix <- "```\n\n"
	  paste(prefix, x, suffix,sep="\n")
	}
  # output
  hook.o = function(x, options) {
	if (knitr:::output_asis(x, options))
		x
	else
		stringr::str_c('\n\n```\n',
					   x,
					   '```\n\n')
  }

  knitr::knit_hooks$set(source = hook.c, output = hook.o, warning = hook.o,
                        error = hook.o, message = hook.o)
}

# get arguments and call knit
args <- commandArgs(TRUE)
inputFile <- args[1]
outputFile <- args[2]
knit(inputFile, outputFile)

3) Modified and cleaned the Makefile from the Rcpp Gallery for my needs. This runs knit.sh on the /src folder, returns the .markdown files to _posts/ and cleans up the .md and .hmtl files created by R-Studio. Put this file into source/src!

Makefile
1
2
3
4
5
6
7
8
9
10
KNIT = ../_scripts/knit.sh
POSTS_DIR = ../_posts
MD_FILES := $(patsubst %.Rmd, $(POSTS_DIR)/%.markdown, $(wildcard *.Rmd))

all: $(MD_FILES)

$(POSTS_DIR)/%.markdown: %.Rmd
  $(KNIT) $< $@
	$(RM) *.md
	$(RM) *.html

You can skim through my github-repo for this site and copy the files from there. I have not tested this very extensively - in case you find any bugs or have improvements, please let me know.

Edit 01.03.2013

I had to add this line of code

to source/_includes/custom/head.html so that the figures are displayed properly, see here.

Quantitative Ecotoxicology, Page 147, Example 4.3, LC50

This is about example 4.3 on page 147 of Quantitative Ecotoxicology. This example is about Calculations of $LC_{50}$ values. In this post I won’t reproduce the SAS-Code since I do not have any experience with SAS PROC PROBIT and I do not fully understand whats happening there.

Instead I will fit Dose-Response-Models using the drc-package to this data.

Get the data from here and read it into R:

1
2
3
4
require(RCurl)
url <- getURL("https://raw.github.com/EDiLD/r-ed/master/quantitative_ecotoxicology/data/p147.csv",
    ssl.verifypeer = FALSE)
salt <- read.table(text = url, header = TRUE, sep = ";")
1
head(salt)
1
2
3
4
5
6
7
##   DEAD TOTAL CONC
## 1   16    76 10.3
## 2   22    79 10.8
## 3   40    77 11.6
## 4   69    76 13.2
## 5   78    78 15.8
## 6   77    77 20.1

So the data consists of number of dead animals (DEAD) from all animals (TOTAL) exposed to a concentration (CONC). First we create a new column with the proportion of dead animals:

1
salt$prop <- salt$DEAD/salt$TOTAL

Lets have a look at the raw data (note that I use a logarithmic scale for the x-axis):

1
2
plot(salt$CONC, salt$prop, xlab = "Concentration", ylab = "Proportion dead",
    log = "x")

plot of chunk p147_plot_raw

I will use the drc-package of Christian Ritz and Jens Strebig to fit dose-response-curves to this data. The main function of this package is drm:

Here I fit a two-parameter log-logistic model to the data (see Ritz (2010) for a review of dose-response-models used in ecotoxicology):

1
2
require(drc)
mod <- drm(prop ~ CONC, data = salt, fct = LL.2())

So the usage is similar to lm() or nls(), except the fct argument. This argument defines the model that is fitted to the data.

We can compare this model with other models using the AIC (the smaller the better).

Here I compare the 2-parameter log-logistic model with a two-parameter Weibull and a 2-parameter Gompertz model.

drc has for this purpose the mselect() function that shows some diagnostics for every model:

  • likelihood (the higher the better; however use this only when your models are nested)
  • AIC (the lower the better)
  • residual variance (the lower the better)
1
mselect(mod, fctList = list(W1.2(), G.2()))
1
2
3
4
##      logLik      IC Lack of fit    Res var
## LL.2 14.613 -23.226          NA 0.00067322
## W1.2     NA      NA          NA         NA
## G.2      NA      NA          NA         NA

The LL.2-model has the lowest AIC so I will keep this. Lets see how the model looks like:

1
2
3
4
5
6
# raw data
plot(prop ~ CONC, data = salt, xlim = c(9, 21), ylim = c(0, 1), ylab = "Proportion dead",
    xlab = "Concentration")

conc_pred <- seq(9, 21, 0.1)
lines(conc_pred, predict(mod, newdata = data.frame(CONC = conc_pred)))

plot of chunk p147_plot_mod

We can get the $LC_{50}$ with confidence interval from the model using the ED() function:

1
ED(mod, 50, interval = "delta")
1
2
3
4
5
6
## 
## Estimated effective doses
## (Delta method-based confidence interval(s))
## 
##      Estimate Std. Error   Lower Upper
## 1:50  11.4768     0.0575 11.3171  11.6

Code and data is available on my github-repo under file name ‘p147’.

References

Ritz C (2010). “Toward A Unified Approach to Dose-Response Modeling in Ecotoxicology.” Environmental Toxicology And Chemistry, 29. ISSN 07307268, http://dx.doi.org/10.1002/etc.7.