Tidy Data Identification

For each data frame below, identify whether the data are in tidy format. Provide reasoning for your answer.

Data frame A

dog  breed            height  weight
1    Sheltie          17      27
2    Black Lab        18      45
3    German Shepherd  29      58
4    Sheltie          14      19

Data frame B

student  name      year  pretest  posttest
1        Jamila    JR    84       91
2        Frank     SR    76       88
3        Rose      SR    65       77
4        Courtney  SO    85       86
5        Lavelle   JR    81       91

Data frame C

county    type         info
Ramsey    size         155.78
Hennepin  size         556.62
Dakota    size         569.58
Ramsey    population   508640.00
Hennepin  population   1152425.00
Dakota    population   398552.00
Ramsey    established  1849.00
Hennepin  established  1852.00
Dakota    established  1849.00

 

Pivoting I

For each situation below, identify the number of rows and columns in the new pivoted data frame. [Think deeply about your answers here, but note that you will not be graded on your answers to these questions.]

  1. Pivot data frame B (from the question above) so that the pre- and post-test scores are in one column (each test score in its own row of this column), with a separate column that indicates whether the score is from the pre- or post-test.
  2. Pivot data frame C (from the question above) so that the “items” listed under type are separated into new columns, with the corresponding values from info in each column.
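
As a reminder of how the two pivot directions change the shape of a data frame, here is a minimal sketch using tidyr. The data frame, its column names, and its values are hypothetical and are not taken from the exercises; the point is only that dim() lets you check a predicted number of rows and columns.

```r
library(tibble)
library(tidyr)

# Hypothetical toy data (not from the exercises): two counts per plot
wide <- tribble(
  ~plot, ~spring, ~fall,
  "P1",  10,      12,
  "P2",  8,       9,
  "P3",  11,      15
)

# Longer: the two count columns collapse into a name column and a value column
long <- pivot_longer(wide, cols = c(spring, fall),
                     names_to = "season", values_to = "count")
dim(long)        # 6 rows, 3 columns

# Wider: spreads the season labels back out into their own columns
wide_again <- pivot_wider(long, names_from = season, values_from = count)
dim(wide_again)  # 3 rows, 3 columns
```

Going longer multiplies the number of rows by the number of columns being stacked; going wider divides the rows by the number of distinct labels being spread.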

 

Pivoting II

For each situation in “Pivoting I,” enter the original data frame in R and pivot the data to a new data frame (show your R code and the final pivoted data frame). From these results, reflect on your answers in “Pivoting I.” Was each of your answers correct? Which ones (if any) were not correct, and where do you think your reasoning went awry?
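
One possible way to enter a small data frame by hand is tribble() from the tibble package. The sketch below types in data frame A (which is not one of the data frames to be pivoted) purely to show the pattern; the object name dfA is an assumption, and data.frame() or tibble() would work just as well.

```r
library(tibble)

# Data frame A, typed in row by row as printed in "Tidy Data Identification"
dfA <- tribble(
  ~dog, ~breed,            ~height, ~weight,
  1,    "Sheltie",         17,      27,
  2,    "Black Lab",       18,      45,
  3,    "German Shepherd", 29,      58,
  4,    "Sheltie",         14,      19
)

dfA       # print to confirm the entries match the original
dim(dfA)  # number of rows, then number of columns
```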

 

Wisconsin Deer Harvest

The number of antlered deer harvested per year in each of five counties in northern Wisconsin was recorded in WIDeer.xlsx. Examine these data and load them into R to answer the following questions.

  1. Are these data tidy? Explain.
  2. Pivot these data into a tidy format (one that makes it easy to plot the number harvested per year, separated by county). Show your R code and the resulting data frame.
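
The following is only a sketch of one possible workflow, not a description of what WIDeer.xlsx actually contains. It assumes the readxl package is installed for loading the file, and it assumes (without having seen the file) a layout with one year column and one column per county. The stand-in data frame, the county labels, and all values below are hypothetical placeholders.

```r
library(tibble)
library(tidyr)

# If the spreadsheet is in the working directory, load and inspect it first
if (file.exists("WIDeer.xlsx")) {
  deer <- readxl::read_excel("WIDeer.xlsx")
  str(deer)  # check the real column names and types before pivoting
}

# Hypothetical stand-in with an ASSUMED layout: one column per county
deer_wide <- tribble(
  ~year, ~CountyA, ~CountyB, ~CountyC, ~CountyD, ~CountyE,
  2001,  100,      200,      300,      400,      500,
  2002,  110,      210,      310,      410,      510,
  2003,  120,      220,      320,      420,      520
)

# Stack every column except year into a county label and a harvest value
deer_long <- pivot_longer(deer_wide, cols = -year,
                          names_to = "county", values_to = "harvest")
head(deer_long)
```

If the real file is laid out differently, the cols, names_to, and values_to arguments would need to change to match it.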

 

Fish Tagging Data

Northland students have individually tagged many fish on Inch Lake. Some of these fish are later recaptured. For each encounter, the fish species, unique tag number, capture date, total length (inches), and weight (g) are recorded. In addition, a variable called recap is created, which is 0 for the first encounter with the fish (the tagging time) and 1 for the second encounter (the recapture time). These data are recorded in InchLakeTags.xlsx.

Load these data into R and wrangle the data into a new data frame that will allow three new variables to be created: the amount of time between captures (i.e., between tagging and recapture), the change in total length between captures, and the change in weight between captures. You do not need to calculate these new variables (you will do this in a future exercise). Show all of your code and your final data frame. [Hint: Each variable that is a part of a change calculation should be its own variable.]
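
To illustrate the hint, here is a minimal sketch of widening several measurement columns at once with pivot_wider(). It does not use InchLakeTags.xlsx: the column names (species, tag, date, tl, wt), the species labels, and all values are hypothetical, and the real file may name or arrange its columns differently.

```r
library(tibble)
library(tidyr)

# Hypothetical stand-in: two fish, each with a tagging row and a recapture row
tags <- tribble(
  ~species,    ~tag, ~date,        ~tl,  ~wt, ~recap,
  "Species A", 101,  "2020-05-01", 10.0, 200, 0,
  "Species A", 101,  "2020-09-01", 10.5, 230, 1,
  "Species B", 102,  "2020-05-02", 6.0,  90,  0,
  "Species B", 102,  "2020-09-03", 6.2,  95,  1
)
tags$date <- as.Date(tags$date)  # store dates as Date so they can be subtracted later

# One row per fish; each measurement gets one column per value of recap
tags_wide <- pivot_wider(tags,
                         names_from = recap,
                         values_from = c(date, tl, wt))
names(tags_wide)
```

With several values_from columns, pivot_wider() builds the new names by joining the measurement name and the recap value with an underscore (for example date_0 and date_1), so each quantity in a later change calculation sits in its own column. A fish with no recapture row would get NA in its recapture columns.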