Filtering I

For data frame B shown below (you will need to enter this data frame in R), demonstrate each type of conditioning expression from Table 7.1. For each expression show your code, show the results, and write a sentence about what you were trying to accomplish. [Make choices that do not result in an empty data frame.]

Data frame B
student  name      year  pretest  posttest
1        Jamila    JR    84       91
2        Frank     SR    76       88
3        Rose      SR    65       77
4        Courtney  SO    85       86
5        Lavelle   JR    81       91

 

Fish Tagging Data

In a previous exercise you created a data frame that would allow three variables to be computed: the amount of time between captures (i.e., between tagging and recapture), the change in total length between captures, and the change in weight between captures. Continue that exercise here by …

  • adding the three variables mentioned above to the data frame,
  • moving each “change” variable to be immediately after the tag variable, and
  • sorting the data from the fish that was “at large” the shortest amount of time to the fish that was “at large” the longest.
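The three steps above might look roughly like the following on a small made-up data frame. Every column name and value here is hypothetical, since the real tagging data are not shown in this section. Base R is used so that the sketch runs without extra packages.

```r
# Toy stand-in for the tagging data; all names and values are hypothetical.
fish <- data.frame(
  tag        = c("A1", "A2", "A3"),
  date_tag   = as.Date(c("2018-05-01", "2018-06-15", "2019-05-20")),
  date_recap = as.Date(c("2019-05-01", "2018-07-15", "2022-06-01")),
  tl_tag     = c(10.2, 6.1, 13.5),
  tl_recap   = c(11.0, 6.2, 15.1),
  wt_tag     = c(0.50, 0.12, 1.10),
  wt_recap   = c(0.62, 0.12, 1.65),
  stringsAsFactors = FALSE
)

# 1. Add the three computed variables.
fish$days_at_large <- as.numeric(
  difftime(fish$date_recap, fish$date_tag, units = "days")
)
fish$tl_change <- fish$tl_recap - fish$tl_tag
fish$wt_change <- fish$wt_recap - fish$wt_tag

# 2. Move the two "change" columns so they sit right after `tag`.
front <- c("tag", "tl_change", "wt_change")
fish <- fish[, c(front, setdiff(names(fish), front))]

# 3. Sort from the shortest to the longest time at large.
fish <- fish[order(fish$days_at_large), ]

fish
```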

Use this new data frame to create the following data frames. For each data frame show your code and use some code to demonstrate that your filtering was likely successful.

  1. Construct a data frame of just Pumpkinseed fish.

  2. Construct a data frame of Pumpkinseed and Bluegill fish.

  3. Construct a data frame of Black Crappie that were initially more than 13 inches long.

  4. Construct a data frame of Largemouth Bass that were longer than 14 inches at recapture and were “at large” for at least three years.
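As a hedged illustration of what "demonstrating that the filtering was likely successful" could mean, the sketch below filters a toy data frame and then summarizes the result. The column names `species` and `tl_tag`, the data values, and the cutoff are all made up for illustration and are not taken from the real data.

```r
# Toy data; `species` and `tl_tag` are hypothetical column names.
fish <- data.frame(
  species = c("Pumpkinseed", "Bluegill", "Bluegill",
              "Black Crappie", "Largemouth Bass"),
  tl_tag  = c(5.5, 6.0, 4.2, 9.8, 15.2),
  stringsAsFactors = FALSE
)

# Keep one species above an illustrative length cutoff.
bg <- fish[fish$species == "Bluegill" & fish$tl_tag > 5, ]

# Quick checks on the result:
unique(bg$species)  # only the requested species should appear
range(bg$tl_tag)    # the minimum should be above the cutoff
nrow(fish)          # rows before filtering ...
nrow(bg)            # ... and after (fewer, but not zero)
```

Checks like `unique()`, `range()`, and `nrow()` do not prove the filter is right, but they catch the most common mistakes, such as a misspelled level or a reversed inequality.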

 

Quarterbacks I

sports-reference.com has a wealth of information on major American sports. Here you will work with statistics for college quarterbacks from the 2020-21 season. Use the following steps to get an Excel file of the data from their webpage.

  1. Go to https://www.sports-reference.com/;
  2. Select “CFB” in the list of sports in the gray box near the top of the page;
  3. Select “Years” in the list of items in the gray box near the top of the page;
  4. Select “2020” in the list of years on the left side of the table;
  5. Hover over “Stats” in the gray box near the middle of the page and select “Passing” from the items that will appear;
  6. Hover over the “Share & Export” item just above the top-right side of the table of statistics and select “Get as Excel Workbook”;
  7. Save that workbook to your computer;
  8. Open the workbook in Excel; and
  9. “File … Save as” an Excel Workbook (this works around the file downloading in a non-standard Excel format).

Once you have the data on your computer load the data into R and perform the following cleaning tasks.

  1. Remove the last four columns that pertain to rushing statistics.
  2. Fix the three remaining variables that had numbers attached to their names.
  3. Change the “Y/A” variable to “Yds_per_Att” and the “AY/A” variable to “AdjYds_per_Att”.
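The cleaning tasks above can be sketched on a toy data frame. Every column name below is hypothetical, including the `...N` form of the attached numbers, which depends on how the file was read. Inspect `names()` on the real import before adapting any of this.

```r
# Toy stand-in for the imported passing table; all names are hypothetical.
qb <- data.frame(
  Player     = c("QB One", "QB Two"),
  "Att...7"  = c(402, 250),
  "Yds...9"  = c(3500, 1800),
  "TD...12"  = c(30, 12),
  "Y/A"      = c(8.7, 7.2),
  "AY/A"     = c(9.5, 6.8),
  "Att...16" = c(60, 45),
  "Yds...17" = c(210, 90),
  "Avg"      = c(3.5, 2.0),
  "TD...19"  = c(4, 1),
  check.names = FALSE,       # keep the awkward names exactly as typed
  stringsAsFactors = FALSE
)

# 1. Drop the last four columns (assumed here to be the rushing statistics).
qb <- qb[, seq_len(ncol(qb) - 4)]

# 2. Strip a trailing "...N" suffix from any name that has one.
names(qb) <- sub("\\.\\.\\.[0-9]+$", "", names(qb))

# 3. Rename the two per-attempt variables.
names(qb)[names(qb) == "Y/A"]  <- "Yds_per_Att"
names(qb)[names(qb) == "AY/A"] <- "AdjYds_per_Att"

names(qb)
```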

With the cleaned data set create data frames that match the following conditions. Show your code and do “something” that demonstrates that your data frame has the correct observations.

  1. Only those quarterbacks that played more than 10 games.

  2. Quarterbacks that played in one of the “Power 5” Conferences (ACC, Big 12, Big Ten, Pac-12, and SEC).

  3. Quarterbacks from “Power 5” conferences that attempted more than 400 passes.

  4. Quarterbacks from NON “Power 5” conferences that had a completion percentage between 45% and 60%.
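Conditions on group membership and on a numeric range can be combined as in the minimal sketch below. The column names `Conf` and `Pct` and all of the data values are hypothetical. Treating "between" as including both endpoints is also an assumption.

```r
# Toy data; `Conf` and `Pct` are hypothetical column names.
qb <- data.frame(
  Player = c("QB One", "QB Two", "QB Three", "QB Four"),
  Conf   = c("SEC", "MAC", "Big Ten", "Sun Belt"),
  Pct    = c(66.1, 58.3, 61.0, 43.9),
  stringsAsFactors = FALSE
)

power5 <- c("ACC", "Big 12", "Big Ten", "Pac-12", "SEC")

# Membership in a set of values.
in_p5 <- qb[qb$Conf %in% power5, ]

# Negated membership combined with a numeric range (endpoints included).
not_p5_mid <- qb[!(qb$Conf %in% power5) & qb$Pct >= 45 & qb$Pct <= 60, ]

# "Something" that shows which observations were kept.
table(in_p5$Conf)
unique(not_p5_mid$Conf)
range(not_p5_mid$Pct)
```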

 

Quarterbacks II

Repeat the steps shown above to create a data frame of passing data for quarterbacks from the 2019 season. Then combine these data with your data frame of passing data for quarterbacks from the 2020 season to make one data frame with data from both years. Make sure to include a new variable that identifies which year each observation is from. Show your code and the structure of your final data frame.
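A minimal sketch of stacking two yearly data frames with an identifying variable follows. The object names, column names, and values are all hypothetical. The approach assumes both data frames have identical column names after cleaning, which `rbind()` requires.

```r
# Toy yearly data frames; names and values are hypothetical.
qb2019 <- data.frame(Player = c("QB A", "QB B"), Yds = c(3100, 2200),
                     stringsAsFactors = FALSE)
qb2020 <- data.frame(Player = c("QB B", "QB C"), Yds = c(2500, 1900),
                     stringsAsFactors = FALSE)

# Tag each observation with its year before combining.
qb2019$year <- 2019
qb2020$year <- 2020

# Stack the rows; columns are matched by name.
qb_all <- rbind(qb2019, qb2020)

str(qb_all)         # structure of the combined data frame
table(qb_all$year)  # number of observations from each year
```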