Packages and Data

setwd("~/Northland College/Great Graphs")
library(tidyverse)
library(patchwork)

be <- read.csv("BRWE.csv")
be$Season <- factor(be$Season, levels = c("S", "F"), labels = c("Spring", "Fall"))
be$Stream <- factor(be$Stream, levels = c("Head", "Mid", "Lower"))
str(be)

## Bad River Watershed
br <- read.csv("Bad River.csv")
br$Season <- factor(br$Season, levels = c("S", "F"), labels = c("Spring", "Fall"))
br$Stream <- factor(br$Stream, levels = c("Head", "Mid", "Lower"))
str(br)

cbPalette <- c("Head"="#000000", "Lower"="#56B4E9","Mid"="#CC79A7")

p <- ggplot(data = be, mapping = aes(x=Year, y=pEPT, color=Stream)) +
  # geom_jitter() draws the points itself; a separate geom_point() would plot every sample twice
  geom_jitter(width = 0.1, height = 0.02, size = 5, alpha = 0.25) +
  scale_x_continuous(breaks = seq(2004, 2020, 2), 
                     expand = expansion(mult = c(0.03, 0.05))) +
  scale_y_continuous(breaks = seq(20,100,20), limits = c(20,100),
                     expand = expansion(mult = c(0.07,0.05)),
                     labels = function(x) paste0(x, "%")) +
  scale_color_manual(values = cbPalette) +
  geom_smooth(method = "loess", se=FALSE) +
  theme_bw() +
  theme(panel.grid = element_blank(), 
        strip.text = element_text(size=12, face = "bold"), 
        panel.spacing=unit(4,unit="mm"), axis.text = element_text(size=10),
        plot.title = element_text(size = 16), legend.position = "none") +
  labs(title = "Macroinvertebrate % EPT for the Bad River Watershed, 2004-2018",
       caption = "Source: Superior Rivers Watershed Association") +
  facet_grid(vars(Season)) +
  labs(x = NULL,   # NULL removes the axis title
       y = NULL,
       color="Stream Reach")

The following graphs use a subset of the macroinvertebrate data collected from 2003 through 2018 by the Superior Rivers Watershed Association. The data were filtered to include only the major sampling rivers within the watershed (Bad River, Marengo River, Potato River, Tyler Forks River, and White River) and only samples with more than 100 specimens. R was used to calculate the Hilsenhoff Biotic Index (HBI) and the EPT Richness Index (%EPT). Both indexes estimate water quality from macroinvertebrate abundance and pollution tolerance.
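The index calculations themselves are not shown in this report. As a rough sketch only, HBI is commonly computed as the abundance-weighted mean of taxon tolerance values, and percent EPT as the share of individuals belonging to Ephemeroptera, Plecoptera, and Trichoptera. The data frame `sample_counts`, its column names, the counts, and the tolerance values below are all made up for illustration, and it is an assumption that %EPT here is based on individuals rather than on taxa richness.

```r
# Hypothetical sample: one row per taxon with a count and a tolerance value (0-10).
# All names and numbers are invented for illustration, not taken from the project data.
sample_counts <- data.frame(
  taxon     = c("Mayfly A", "Stonefly A", "Caddisfly A", "Midge A", "Worm A"),
  order     = c("Ephemeroptera", "Plecoptera", "Trichoptera", "Diptera", "Oligochaeta"),
  count     = c(40, 20, 30, 25, 10),
  tolerance = c(4, 1, 4, 6, 8)
)

# HBI: abundance-weighted mean tolerance value of the sample.
hbi <- function(count, tolerance) {
  sum(count * tolerance) / sum(count)
}

# Percent EPT: percentage of individuals in the three EPT orders.
pct_ept <- function(count, order) {
  is_ept <- order %in% c("Ephemeroptera", "Plecoptera", "Trichoptera")
  100 * sum(count[is_ept]) / sum(count)
}

hbi_value <- hbi(sample_counts$count, sample_counts$tolerance)
ept_value <- pct_ept(sample_counts$count, sample_counts$order)
```

With these invented numbers the sample has 125 specimens, so it would pass the more-than-100 filter described above.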

Bad River Watershed HBI

The goal of this graph is to show the HBI values for the different reaches of the Bad River watershed over the years, split by spring and fall, to give an idea of the water quality.

B <- ggplot(data = br, mapping = aes(x=Year, y=HBI, color=Stream)) +
  # geom_jitter() draws the points itself; a separate geom_point() would plot every sample twice
  geom_jitter(width = 0.1, height = 0.02, size = 5, alpha = 0.25) +
  scale_x_continuous(breaks = seq(2004, 2020, 2),
                     expand = expansion(mult = c(0.03, 0.05))) +
  scale_y_continuous(breaks = seq(0, 6, 0.75)) +
  scale_color_manual(values = cbPalette) +
  geom_smooth(method = "loess", se=FALSE) +
  theme_bw() +
  theme(panel.grid = element_blank(), 
        strip.text = element_text(size=12, face = "bold"), 
        panel.spacing=unit(4,unit="mm"), axis.text = element_text(size=10),
        plot.title = element_text(size = 16)) +
  labs(title = "Macroinvertebrate HBI Values for the Bad River Watershed, 2004-2018",
       caption = "Source: Superior Rivers Watershed Association") +
  facet_grid(vars(Season)) +
  labs(x = NULL,   # NULL removes the axis title
       y = NULL,
       color="Stream Reach")
B

#R> `geom_smooth()` using formula 'y ~ x'

This graph shows that the fall samples follow a more consistent HBI trend over the years than the spring samples. The fall values also tend to be lower, meaning the water quality is better than in spring, where the values are more variable. The fall panel shows the HBI of the headwaters beginning to increase in 2017, which could be a result of the hundred-year floods. The years between 2008 and 2013 had the lowest HBI values, indicating that the water quality was at its best during this period. One hypothesis for the greater variability in spring is that the spring melt brings more runoff into the watershed than occurs in fall. Overall, the water quality of the Bad River Watershed is between Good and Excellent.
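The report does not list the cutoffs behind the water quality categories. As a sketch, and assuming the family-level biotic index cutoffs published by Hilsenhoff (1988), HBI values can be turned into category labels with `cut()`. The cutoffs, the helper name `classify_hbi`, and the example values are assumptions for illustration and should be swapped for whatever scale the project actually used.

```r
# Assumed cutoffs: family-level biotic index categories (Hilsenhoff 1988).
hbi_breaks <- c(0, 3.75, 4.25, 5.00, 5.75, 6.50, 7.25, 10)
hbi_labels <- c("Excellent", "Very good", "Good", "Fair",
                "Fairly poor", "Poor", "Very poor")

# Hypothetical helper: label each HBI value with its water quality category.
# Intervals are closed on the right, so 3.75 is still "Excellent".
classify_hbi <- function(hbi) {
  cut(hbi, breaks = hbi_breaks, labels = hbi_labels,
      include.lowest = TRUE, right = TRUE)
}

# Made-up HBI values, not taken from the project data.
example_hbi <- c(3.2, 4.0, 4.8)
example_category <- as.character(classify_hbi(example_hbi))
```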

A scatter plot was chosen so that individual samples could be seen and a regression line could easily be added. I faceted the graph into spring and fall because different macroinvertebrates are present at different times of the year, and separating by season takes this into account. I used a grid facet so that the trends are easier to follow, since the x-axis is spread out and there is only one. I set the y-axis scale by 0.25 because that is the scale used to determine the water quality categories based on HBI values. The x-axis scale is every two years so that the axis is not overcrowded but individual years can still be identified. The axes are not labeled because I felt the title made it obvious which one is year and which is HBI. I chose colors that are different enough not to blend together but are not overpowering. I made the headwaters the darker color and the lower reaches lighter because it made sense from a gradient standpoint. I took the grid lines out so they would not take away from the data. I made the points larger so that they would stand out over the regression line, but not so big that they would appear to fall into a different water quality category. I added a loess regression line to show the trend of the HBI values over the years more clearly. The text size was changed so that it is easier to read.
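For readers unfamiliar with loess, the trend line can be approximated outside of ggplot2 with base R's `loess()`, which `geom_smooth(method = "loess")` relies on. This is only a sketch: the `toy` data frame and its HBI values are invented, and the real graph fits a separate curve for each stream reach and season.

```r
# Made-up HBI values for illustration: two samples per year, 2004-2018.
toy <- data.frame(Year = rep(2004:2018, each = 2))
toy$HBI <- 4 + 0.5 * sin((toy$Year - 2004) / 3) + rep(c(-0.1, 0.1), times = 15)

# Local regression of HBI on Year; span = 0.75 is the loess() default.
fit <- loess(HBI ~ Year, data = toy, span = 0.75)

# Smoothed HBI value for each year in the sampled range.
trend <- data.frame(Year = 2004:2018)
trend$HBI_smooth <- predict(fit, newdata = trend)
```

A smaller `span` follows year-to-year changes more closely, while a larger one gives a smoother long-term trend.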

Bad River Watershed HBI & %EPT

The goal of this graphic is to see whether there is a relationship between the HBI values and the percent EPT in the Bad River watershed.

B / p

#R> `geom_smooth()` using formula 'y ~ x'
#R> `geom_smooth()` using formula 'y ~ x'

This graphic illustrates that there is a relationship between HBI values and percent EPT: when the HBI values are low, the EPT percentages are high. This is expected, because a lower HBI value means better water quality, which in turn means the species present are more pollution-intolerant. These intolerant species are the ones used to calculate percent EPT, so when they are more abundant the percentage is high. The percent EPT values also appear to follow the same fluctuations as the HBI values; the %EPT graph is basically a mirror image of the HBI graph.
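The visual relationship could also be checked numerically. The sketch below uses a Spearman rank correlation on made-up paired values; the `paired` data frame is hypothetical, and applying this to the project data would first require matching each HBI value in `br` to the %EPT value from the same sample in `be`, which is not shown in this report.

```r
# Made-up paired values for illustration: one HBI and one %EPT per sample.
paired <- data.frame(
  HBI  = c(3.1, 3.4, 3.8, 4.2, 4.6, 5.0),
  pEPT = c(85, 80, 72, 60, 55, 41)
)

# Spearman's rho measures how consistently %EPT falls as HBI rises.
# Values near -1 indicate a strong inverse (mirror-image) relationship.
rho <- cor(paired$HBI, paired$pEPT, method = "spearman")
```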

The decisions made for the HBI graph are the same as stated above. For the percent EPT graph, a scatter plot was chosen for the same reasons. The x-axis scale matches the HBI graph so that the axis is not overcrowded but individual years are still identifiable. The y-axis is in percentages because the EPT values are percents, and the breaks are spaced by 20 because the axis would be overcrowded otherwise. I did not have the y-axis end right at 100% because of my point size; if I had, a few points would have been cut in half. The size, color, facet, and loess regression line choices are all the same as stated above. I decided not to show the legend because I was stacking the two graphs and wanted only one legend for both. I stacked the graphs on top of each other so that both could be spread out and the years lined up, making it easier to compare years between the two.

Project Part 1

Maria Lefevre

22 May, 2020
