In this activity, we will review how to create summary tables and explore the water quality data in more depth. We already created a few summaries in the previous summary table lesson.

First, read in your water quality data file from wherever you have it saved.

wq <- read.csv('data/raleigh_water_analysis.csv')

Understanding the data

First, we will work a little to understand what data we have. You will use this information to complete the methods section of your poster.

  1. How many streams are in the data set?
  2. How many characteristics did the City of Raleigh measure?
  3. How many years are included in the dataset?
  4. What months did the City of Raleigh collect in?
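
As a starting point, here is a minimal sketch of how questions like these can be answered with base R. It uses a small made-up data frame called toy instead of wq, and the column names Site and Date are assumptions, so check names(wq) and str(wq) to see what your file actually calls them. The numbers are invented for illustration only.

```r
# Toy stand-in for wq. Site and Date are ASSUMED column names and
# every value here is made up; your real data will differ.
toy <- data.frame(
  Site = c("A", "A", "B", "B", "C"),
  Date = as.Date(c("2019-03-15", "2019-06-20", "2019-03-15",
                   "2020-06-18", "2020-09-10")),
  Nitrogen_total_mg_L = c(0.8, 1.2, 0.5, 0.7, 1.6),
  Phosphorous_total_mg_L = c(0.05, 0.09, 0.03, 0.04, 0.12)
)

# How many distinct sites are there?
n_sites <- length(unique(toy$Site))

# Which years and months appear in the dates?
years <- sort(unique(format(toy$Date, "%Y")))
months <- sort(unique(format(toy$Date, "%m")))

# How many measured characteristics? Here: every column except Site and Date.
n_measured <- length(setdiff(names(toy), c("Site", "Date")))

n_sites
years
months
n_measured
```

If your dates were read in as text, you would need to convert them with as.Date() first, and if each characteristic is stored as a row rather than a column, unique() on that column is the better way to count them.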

Summary Activity

First, let’s practice by checking whether there is a relationship between the amounts of nitrogen and phosphorus in streams. To do this:

  1. Write down your hypothesis. Do you think there will be a relationship?
  2. Make a scatterplot with Phosphorous_total_mg_L and Nitrogen_total_mg_L.
  3. Create a linear model to analyze the relationship.
  4. Make a summary table with the average nitrogen and phosphorus for each site.
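
The steps above can be sketched roughly as follows. This is one possible approach in base R, not the only one, and it again uses a made-up toy data frame with an assumed Site column. Putting nitrogen on the y-axis is an arbitrary choice here; pick the direction that matches your hypothesis.

```r
# Toy stand-in for wq. Site is an ASSUMED column name; values are made up.
toy <- data.frame(
  Site = rep(c("A", "B", "C"), each = 3),
  Nitrogen_total_mg_L = c(0.8, 1.2, 1.0, 0.5, 0.7, 0.6, 1.6, 1.4, 1.8),
  Phosphorous_total_mg_L = c(0.05, 0.09, 0.07, 0.03, 0.04, 0.02,
                             0.12, 0.10, 0.15)
)

# Step 2: scatterplot of nitrogen against phosphorus
plot(Nitrogen_total_mg_L ~ Phosphorous_total_mg_L, data = toy,
     xlab = "Total phosphorus (mg/L)", ylab = "Total nitrogen (mg/L)")

# Step 3: linear model of the same relationship
fit <- lm(Nitrogen_total_mg_L ~ Phosphorous_total_mg_L, data = toy)
summary(fit)

# Step 4: average nitrogen and phosphorus for each site
site_means <- aggregate(
  cbind(Nitrogen_total_mg_L, Phosphorous_total_mg_L) ~ Site,
  data = toy, FUN = mean
)
site_means
```

Note that the formula form of aggregate() drops rows with missing values in any of the listed columns before averaging, which may or may not be what you want for the real data.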

Now, let’s investigate a few more challenging summaries.

  1. What site had the highest average Copper_mg_L?
  2. What three sites had the highest average E_coli_MPN_100mL?
  3. How many sites had an average Nitrogen_total_mg_L greater than 1.0?
  4. How many years had an average Nitrogen_total_mg_L greater than 1.0?
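
The general pattern for questions like these is to summarize by group, then sort or filter the summary. Here is a hedged sketch in base R using a made-up toy data frame; Site and Year are assumed column names, and in the real data you may need to build a year column from the sample dates first.

```r
# Toy stand-in for wq. Site and Year are ASSUMED column names;
# values are made up.
toy <- data.frame(
  Site = c("A", "A", "B", "B", "C", "C"),
  Year = c(2019, 2020, 2019, 2020, 2019, 2020),
  Copper_mg_L = c(0.002, 0.004, 0.010, 0.008, 0.001, NA),
  Nitrogen_total_mg_L = c(0.8, 1.4, 0.5, 0.7, 1.6, 1.8)
)

# Average copper per site (rows with missing copper are dropped),
# sorted from highest to lowest
copper_by_site <- aggregate(Copper_mg_L ~ Site, data = toy, FUN = mean)
copper_sorted <- copper_by_site[order(-copper_by_site$Copper_mg_L), ]
head(copper_sorted, 1)   # use head(..., 3) for a "top three" question

# Count the sites whose average nitrogen is above 1.0
n_by_site <- aggregate(Nitrogen_total_mg_L ~ Site, data = toy, FUN = mean)
n_sites_high <- sum(n_by_site$Nitrogen_total_mg_L > 1.0)

# Same idea, grouped by year instead of site
n_by_year <- aggregate(Nitrogen_total_mg_L ~ Year, data = toy, FUN = mean)
n_years_high <- sum(n_by_year$Nitrogen_total_mg_L > 1.0)

n_sites_high
n_years_high
```

The same sort-then-head() approach should work for E_coli_MPN_100mL by swapping in that column name.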