A user of my Introductory Fisheries Analyses with R book recently asked me how to compute standard errors (SE) for the various PSD indices by using the usual equation for the SE of a proportion (square root of p times (1-p) divided by n). Below is my response using the Inch Lake data from the book.
- Load required packages.
library(FSA)
library(magrittr)
library(dplyr)
- Load (and prepare) the Inch Lake data used in the Size Structure chapter of the book (this and next step is from this script).
inchAlls <- read.csv("http://derekogle.com/IFAR/scripts/InchLake1113.csv") %>%
mutate(gcat=psdAdd(tl,species)) %>%
filterD(gcat!="substock")
## No known Gabelhouse (PSD) lengths for: Iowa Darter
- Compute the PSD-X values.
( freq <- xtabs(~species+gcat,data=inchAlls) )
## gcat
## species stock quality preferred memorable
## Black Crappie 33 40 8 81
## Bluegill 571 199 223 43
## Largemouth Bass 26 36 0 0
## Pumpkinseed 5 2 0 0
## Yellow Perch 4 0 0 1
iPSDs <- prop.table(freq,margin=1)*100
( PSDs <- t(apply(iPSDs,MARGIN=1,FUN=rcumsum)) )
## gcat
## species stock quality preferred memorable
## Black Crappie 100 79.62963 54.93827 50.000000
## Bluegill 100 44.88417 25.67568 4.150579
## Largemouth Bass 100 58.06452 0.00000 0.000000
## Pumpkinseed 100 28.57143 0.00000 0.000000
## Yellow Perch 100 20.00000 20.00000 20.000000
- Compute the row sums for the
freq
table to get the total number of fish (that are stock size or greater). These will be n in the SE calculations.
( sums <- rowSums(freq) )
## Black Crappie Bluegill Largemouth Bass Pumpkinseed
## 162 1036 62 7
## Yellow Perch
## 5
- Compute the SE for PSD-Q.
SEs <- sqrt(PSDs[,"quality"]*(100-PSDs[,"quality"])/sums)
round(SEs,1)
## Black Crappie Bluegill Largemouth Bass Pumpkinseed
## 3.2 1.5 6.3 17.1
## Yellow Perch
## 17.9
- Repeat for other size indices. For example, for PSD-P.
SEs <- sqrt(PSDs[,"preferred"]*(100-PSDs[,"preferred"])/sums)
round(SEs,1)
## Black Crappie Bluegill Largemouth Bass Pumpkinseed
## 3.9 1.4 0.0 0.0
## Yellow Perch
## 17.9
Note that the SE formula is usually used for proportions (and then multiplied by 100 to get a SE for percentages). However, it can also be shown algebraically to be useful for percentages.
I would caution against using these standard errors to produce confidence intervals for the PSD-X values. If confidence intervals are the end-product for this analysis, then I suggest using the methods described in my book.