class: inverse, middle # R for MTH107 - [Getting Data into R](#RData) - [Filtering Data in R](#RFilter) - [Univariate EDA in R](#RUnivEDA) - [Quantitative](#RUnivEDAQ) - [Categorical](#RUnivEDAC) - [Quantitative by Group](#RUnivEDAQC) - [Bivariate EDA in R](#RBivEDA) - [Quantitative](#RBivEDAQ) - [Categorical](#RBivEDAC) - [Linear Regression in R](#RRegression) - [t-Tests in R](#Rttests) - [1-sample t-Test in R](#Rttests1) - [2-sample t-Test in R](#Rttests2) - [Chi-square in R](#RChi) - [Chi-square Test in R](#RChiChi) - [Goodness-of-Fit Test in R](#RChiGOF) --- class: inverse, center, middle name: RData # Load Important Packages --- ## Load Important Packages - Many calculations in this course require functions in the `NCStats` package. -- - Nearly all plots in this course require functions in the `ggplot2` package. -- - Thus, these packages **MUST** be loaded at the beginning of each session. ```r #!# Loading important packages ... must do every time library(NCStats) library(ggplot2) ``` --- class: inverse, center, middle --- class: inverse, center, middle # Loading Data into R --- ## Introducing the Data - Size (bill, flipper, weight) of three Penguin species from three islands in Antarctica. -- - Collected by [Dr. Kristen Gorman](https://www.uaf.edu/cfos/people/faculty/detail/kristen-gorman.php) and the [Palmer Station, Antarctica LTER](https://pal.lternet.edu/). <img src="https://www.uaf.edu/cfos/images/people/faculty_prof/gorman-kristen-4x5.jpg" width="33%" style="display: block; margin: auto;" /> --- ## Introducing the Data - Size (bill, flipper, weight) of three Penguin species from three islands in Antarctica. - Collated by [Allison Horst](https://www.allisonhorst.com/). <img src="https://allisonhorst.github.io/palmerpenguins/reference/figures/lter_penguins.png" width="67%" style="display: block; margin: auto;" /> --- ## Introducing the Data - Size (bill, flipper, weight) of three Penguin species from three islands in Antarctica. 
- Data in [Penguins.csv](https://raw.githubusercontent.com/allisonhorst/penguins/master/data-raw/penguins.csv) on the class' [data webpage](http://derekogle.com/NCMTH107/resources/data_107). <img src="https://pbs.twimg.com/profile_images/1041703659984191489/qWBxildv.jpg" width="40%" style="display: block; margin: auto;" /> --- ## Loading Data into R 1. Download or save data (CSV file) to a folder on your computer. - A separate video shows this both for data you must enter yourself and for a data file that is provided to you. -- 1. Start an R script with code that loads the `NCStats` and `ggplot2` packages. -- 1. Save the script in the **same folder** as the data file. -- 1. Use `Session`, `Set working directory`, `To source file location` to set the working directory (where the data file is located). -- 1. Copy the resulting `setwd()` from the console and paste it into the script. -- 1. Use `read.csv()` with the filename in quotes to load the data into R. - Assign the result to a short object name (what the data is called in R). -- 1. Examine the data object. - Use `str()` to see the structure. - Use `peek()` to see a subset of rows/individuals. --- count: false ## Loading Data into R .left-panel-loaddata-user[ ```r #!# Set to your OWN working directory ... this is Derek's computer. *setwd("C:/aaaWork/Web/GitHub/NCMTH107/modules/HO") ``` ] .right-panel-loaddata-user[ ] --- count: false ## Loading Data into R .left-panel-loaddata-user[ ```r #!# Set to your OWN working directory ... this is Derek's computer. setwd("C:/aaaWork/Web/GitHub/NCMTH107/modules/HO") *peng <- read.csv("penguins.csv") ``` ] .right-panel-loaddata-user[ ] --- count: false ## Loading Data into R .left-panel-loaddata-user[ ```r #!# Set to your OWN working directory ... this is Derek's computer. setwd("C:/aaaWork/Web/GitHub/NCMTH107/modules/HO") peng <- read.csv("penguins.csv") *str(peng) ``` ] .right-panel-loaddata-user[ ``` 'data.frame': 344 obs. of 7 variables: $ species : chr "Adelie" "Adelie" "Adelie" "Adelie" ... 
$ island : chr "Torgersen" "Torgersen" "Torgersen" "Torgersen" ... $ bill_length_mm : num 39.1 39.5 40.3 NA 36.7 39.3 38.9 39.2 34.1 42 ... $ bill_depth_mm : num 18.7 17.4 18 NA 19.3 20.6 17.8 19.6 18.1 20.2 ... $ flipper_length_mm: int 181 186 195 NA 193 190 181 195 193 190 ... $ body_mass_g : int 3750 3800 3250 NA 3450 3650 3625 4675 3475 4250 ... $ sex : chr "male" "female" "female" NA ... ``` ] --- count: false ## Loading Data into R .left-panel-loaddata-user[ ```r #!# Set to your OWN working directory ... this is Derek's computer. setwd("C:/aaaWork/Web/GitHub/NCMTH107/modules/HO") peng <- read.csv("penguins.csv") str(peng) *peek(peng) ``` ] .right-panel-loaddata-user[ ``` 'data.frame': 344 obs. of 7 variables: $ species : chr "Adelie" "Adelie" "Adelie" "Adelie" ... $ island : chr "Torgersen" "Torgersen" "Torgersen" "Torgersen" ... $ bill_length_mm : num 39.1 39.5 40.3 NA 36.7 39.3 38.9 39.2 34.1 42 ... $ bill_depth_mm : num 18.7 17.4 18 NA 19.3 20.6 17.8 19.6 18.1 20.2 ... $ flipper_length_mm: int 181 186 195 NA 193 190 181 195 193 190 ... $ body_mass_g : int 3750 3800 3250 NA 3450 3650 3625 4675 3475 4250 ... $ sex : chr "male" "female" "female" NA ... 
``` ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 18 Adelie Torgersen 42.5 20.7 197 4500 male 36 Adelie Dream 39.2 21.1 196 4150 male 54 Adelie Biscoe 42.0 19.5 200 4050 male 72 Adelie Torgersen 39.7 18.4 190 3900 male 91 Adelie Dream 35.7 18.0 202 3550 female 109 Adelie Biscoe 38.1 17.0 181 3175 female 127 Adelie Torgersen 38.8 17.6 191 3275 female 145 Adelie Dream 37.3 16.8 192 3000 female 163 Gentoo Biscoe 40.9 13.7 214 4650 female 181 Gentoo Biscoe 48.2 14.3 210 4600 female 199 Gentoo Biscoe 45.5 13.9 210 4200 female 217 Gentoo Biscoe 45.8 14.2 219 4700 female 235 Gentoo Biscoe 47.4 14.6 212 4725 female 253 Gentoo Biscoe 48.5 15.0 219 4850 female 272 Gentoo Biscoe NA NA NA NA <NA> 290 Chinstrap Dream 52.0 18.1 201 4050 male 308 Chinstrap Dream 54.2 20.8 201 4300 male 326 Chinstrap Dream 49.8 17.3 198 3675 female 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] <style> .left-panel-loaddata-user { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-loaddata-user { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-loaddata-user { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- class: inverse, center, middle --- class: inverse, center, middle name: RFilter # Filtering Data --- ## Isolate Single Variable - Examine one variable in a data frame. 
-- - use `dataframename$variablename` --- count: false ## Isolate Variable .left-panel-filterBM-user[ ```r *headtail(peng) ## Examine just the body mass variable ``` ] .right-panel-filterBM-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] --- count: false ## Isolate Variable .left-panel-filterBM-user[ ```r headtail(peng) ## Examine just the body mass variable *peng$body_mass_g ``` ] .right-panel-filterBM-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` [1] 3750 3800 3250 NA 3450 3650 3625 4675 3475 4250 3300 3700 3200 3800 4400 3700 3450 4500 3325 4200 3400 3600 3800 [24] 3950 3800 3800 3550 3200 3150 3950 3250 3900 3300 3900 3325 4150 3950 3550 3300 4650 3150 3900 3100 4400 3000 4600 [47] 3425 2975 3450 4150 3500 4300 3450 4050 2900 3700 3550 3800 2850 3750 3150 4400 3600 4050 2850 3950 3350 4100 3050 [70] 4450 3600 3900 3550 4150 3700 4250 3700 3900 3550 4000 3200 4700 3800 4200 3350 3550 3800 3500 3950 3600 3550 4300 [93] 3400 4450 3300 4300 3700 4350 2900 4100 3725 4725 3075 4250 2925 3550 3750 3900 3175 4775 3825 4600 3200 4275 3900 [116] 4075 2900 3775 3350 3325 3150 3500 3450 3875 3050 4000 3275 4300 3050 4000 3325 3500 3500 4475 3425 3900 3175 3975 [139] 3400 4250 3400 3475 3050 3725 3000 3650 4250 3475 3450 3750 3700 4000 4500 5700 4450 5700 5400 4550 4800 5200 4400 [162] 5150 4650 5550 4650 5850 4200 5850 4150 6300 4800 5350 5700 5000 4400 
5050 5000 5100 4100 5650 4600 5550 5250 4700 [185] 5050 6050 5150 5400 4950 5250 4350 5350 3950 5700 4300 4750 5550 4900 4200 5400 5100 5300 4850 5300 4400 5000 4900 [208] 5050 4300 5000 4450 5550 4200 5300 4400 5650 4700 5700 4650 5800 4700 5550 4750 5000 5100 5200 4700 5800 4600 6000 [231] 4750 5950 4625 5450 4725 5350 4750 5600 4600 5300 4875 5550 4950 5400 4750 5650 4850 5200 4925 4875 4625 5250 4850 [254] 5600 4975 5500 4725 5500 4700 5500 4575 5500 5000 5950 4650 5500 4375 5850 4875 6000 4925 NA 4850 5750 5200 5400 [277] 3500 3900 3650 3525 3725 3950 3250 3750 4150 3700 3800 3775 3700 4050 3575 4050 3300 3700 3450 4400 3600 3400 2900 [300] 3800 3300 4150 3400 3800 3700 4550 3200 4300 3350 4100 3600 3900 3850 4800 2700 4500 3950 3650 3550 3500 3675 4450 [323] 3400 4300 3250 3675 3325 3950 3600 4050 3350 3450 3250 4050 3800 3525 3950 3650 3650 4000 3400 3775 4100 3775 ``` ] <style> .left-panel-filterBM-user { color: #777; width: 34.3137254901961%; height: 92%; float: left; font-size: 80% } .right-panel-filterBM-user { width: 63.7254901960784%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-filterBM-user { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- ## Filtering Data - Create a subset of a larger data.frame based on a condition or conditions. -- - Use `filterD()` - First argument is larger data.frame. - Following arguments are conditions. -- - Assign (save) result to a new object (i.e., name for subsetted data.frame). --- ## Filtering Data Table: Comparison operators used in `filterD()` and their results. 
|Comparison Operator |Rows Returned from Original Data Frame |
|:----------------------------|:--------------------------------------------------------|
|`var==value` |Rows where `var` **IS equal** to `value` |
|`var!=value` |Rows where `var` **is NOT equal** to `value` |
|`var %in% c(value1,value2)` |Rows where `var` **IS IN** vector of `value`s |
|`var>value` |Rows where `var` is **greater than** `value` |
|`var>=value` |Rows where `var` is **greater than or equal to** `value` |
|`var<value` |Rows where `var` is **less than** `value` |
|`var<=value` |Rows where `var` is **less than or equal to** `value` |
|condition1,condition2 |Rows where **BOTH** conditions are true |
|condition1 \| condition2 |Rows where **ONE or BOTH** conditions are true |

--- count: false ## Filtering Data .left-panel-filter1-auto[ ```r *headtail(peng) ``` ] .right-panel-filter1-auto[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] --- count: false ## Filtering Data .left-panel-filter1-auto[ ```r headtail(peng) *unique(peng$species) ``` ] .right-panel-filter1-auto[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` [1] "Adelie" "Gentoo" "Chinstrap" ``` ] --- count: false ## Filtering Data .left-panel-filter1-auto[ ```r headtail(peng) unique(peng$species) ## Isolate just Chinstrap Penguins *peng_chin <- filterD(peng,species=="Chinstrap") ``` ] .right-panel-filter1-auto[ ``` species island 
bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` [1] "Adelie" "Gentoo" "Chinstrap" ``` ] --- count: false ## Filtering Data .left-panel-filter1-auto[ ```r headtail(peng) unique(peng$species) ## Isolate just Chinstrap Penguins peng_chin <- filterD(peng,species=="Chinstrap") *peek(peng_chin) ``` ] .right-panel-filter1-auto[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` [1] "Adelie" "Gentoo" "Chinstrap" ``` ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 4 Chinstrap Dream 45.4 18.7 188 3525 female 7 Chinstrap Dream 46.1 18.2 178 3250 female 11 Chinstrap Dream 46.6 17.8 193 3800 female 14 Chinstrap Dream 52.0 18.1 201 4050 male 18 Chinstrap Dream 58.0 17.8 181 3700 female 21 Chinstrap Dream 42.4 17.3 181 3600 female 25 Chinstrap Dream 46.7 17.9 195 3300 female 29 Chinstrap Dream 46.4 17.8 191 3700 female 32 Chinstrap Dream 54.2 20.8 201 4300 male 36 Chinstrap Dream 47.5 16.8 199 3900 female 39 Chinstrap Dream 46.9 16.6 192 2700 female 43 Chinstrap Dream 50.9 19.1 196 3550 male 47 Chinstrap Dream 50.1 17.9 190 3400 female 50 Chinstrap Dream 49.8 17.3 198 3675 female 54 Chinstrap Dream 50.7 19.7 203 4050 male 57 Chinstrap Dream 45.2 16.6 191 3250 female 61 Chinstrap Dream 51.9 19.5 206 3950 male 64 Chinstrap Dream 55.8 19.8 207 4000 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] <style> 
.left-panel-filter1-auto { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-filter1-auto { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-filter1-auto { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- count: false ## Filtering Data .left-panel-filter2-user[ ```r *headtail(peng) *unique(peng$species) ## Isolate just Chinstrap and Gentoo Penguins ``` ] .right-panel-filter2-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` [1] "Adelie" "Gentoo" "Chinstrap" ``` ] --- count: false ## Filtering Data .left-panel-filter2-user[ ```r headtail(peng) unique(peng$species) ## Isolate just Chinstrap and Gentoo Penguins *peng_chingen <- filterD(peng,species %in% c("Chinstrap","Gentoo")) ``` ] .right-panel-filter2-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` [1] "Adelie" "Gentoo" "Chinstrap" ``` ] --- count: false ## Filtering Data .left-panel-filter2-user[ ```r headtail(peng) unique(peng$species) ## Isolate just Chinstrap and Gentoo Penguins peng_chingen <- filterD(peng,species %in% c("Chinstrap","Gentoo")) *peek(peng_chingen) ``` ] .right-panel-filter2-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 
18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` [1] "Adelie" "Gentoo" "Chinstrap" ``` ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Gentoo Biscoe 46.1 13.2 211 4500 female 10 Gentoo Biscoe 46.8 15.4 215 5150 male 20 Gentoo Biscoe 48.7 15.1 222 5350 male 30 Gentoo Biscoe 50.0 15.3 220 5550 male 40 Gentoo Biscoe 48.7 15.7 208 5350 male 51 Gentoo Biscoe 46.6 14.2 210 4850 female 61 Gentoo Biscoe 45.3 13.8 208 4200 female 71 Gentoo Biscoe 47.7 15.0 216 4750 female 81 Gentoo Biscoe 49.1 14.5 212 4625 female 91 Gentoo Biscoe 47.5 15.0 218 4950 female 101 Gentoo Biscoe 48.5 15.0 219 4850 female 111 Gentoo Biscoe 50.5 15.2 216 5000 female 121 Gentoo Biscoe 46.8 14.3 215 4850 female 131 Chinstrap Dream 46.1 18.2 178 3250 female 141 Chinstrap Dream 50.3 20.0 197 3300 male 152 Chinstrap Dream 49.5 19.0 200 3800 male 162 Chinstrap Dream 52.0 20.7 210 4800 male 172 Chinstrap Dream 49.0 19.6 212 4300 male 182 Chinstrap Dream 49.3 19.9 203 4050 male 192 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] <style> .left-panel-filter2-user { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-filter2-user { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-filter2-user { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- count: false ## Filtering Data .left-panel-filter3-user[ ```r *headtail(peng) ## Isolate just male Chinstrap Penguins ``` ] .right-panel-filter3-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] --- count: false 
## Filtering Data .left-panel-filter3-user[ ```r headtail(peng) ## Isolate just male Chinstrap Penguins *peng_chinmale <- filterD(peng,species=="Chinstrap",sex=="male") ``` ] .right-panel-filter3-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] --- count: false ## Filtering Data .left-panel-filter3-user[ ```r headtail(peng) ## Isolate just male Chinstrap Penguins peng_chinmale <- filterD(peng,species=="Chinstrap",sex=="male") *peek(peng_chinmale) ``` ] .right-panel-filter3-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 50.0 19.5 196 3900 male 2 Chinstrap Dream 51.3 19.2 193 3650 male 4 Chinstrap Dream 51.3 18.2 197 3750 male 5 Chinstrap Dream 51.3 19.9 198 3700 male 7 Chinstrap Dream 52.0 18.1 201 4050 male 9 Chinstrap Dream 50.3 20.0 197 3300 male 11 Chinstrap Dream 48.5 17.5 191 3400 male 13 Chinstrap Dream 52.0 19.0 197 4150 male 14 Chinstrap Dream 49.5 19.0 200 3800 male 16 Chinstrap Dream 54.2 20.8 201 4300 male 18 Chinstrap Dream 49.7 18.6 195 3600 male 20 Chinstrap Dream 53.5 19.9 205 4500 male 21 Chinstrap Dream 49.0 19.5 210 3950 male 23 Chinstrap Dream 50.8 18.5 201 4450 male 25 Chinstrap Dream 51.5 18.7 187 3250 male 27 Chinstrap Dream 50.7 19.7 203 4050 male 29 Chinstrap Dream 49.3 19.9 203 4050 male 30 Chinstrap Dream 50.2 18.8 202 3800 male 
32 Chinstrap Dream 55.8 19.8 207 4000 male 34 Chinstrap Dream 50.8 19.0 210 4100 male ``` ] <style> .left-panel-filter3-user { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-filter3-user { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-filter3-user { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- count: false ## Filtering Data .left-panel-filterHvy-user[ ```r *headtail(peng) ## Isolate just Penguins with body mass more than 5900 g ... ~13 lbs ``` ] .right-panel-filterHvy-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] --- count: false ## Filtering Data .left-panel-filterHvy-user[ ```r headtail(peng) ## Isolate just Penguins with body mass more than 5900 g ... ~13 lbs *peng_heavy <- filterD(peng,body_mass_g>5900) ``` ] .right-panel-filterHvy-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] --- count: false ## Filtering Data .left-panel-filterHvy-user[ ```r headtail(peng) ## Isolate just Penguins with body mass more than 5900 g ... 
~13 lbs peng_heavy <- filterD(peng,body_mass_g>5900) *peek(peng_heavy) ``` ] .right-panel-filterHvy-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Gentoo Biscoe 49.2 15.2 221 6300 male 2 Gentoo Biscoe 59.6 17.0 230 6050 male 3 Gentoo Biscoe 51.1 16.3 220 6000 male 4 Gentoo Biscoe 45.2 16.4 223 5950 male 5 Gentoo Biscoe 49.8 15.9 229 5950 male 6 Gentoo Biscoe 48.8 16.2 222 6000 male ``` ] <style> .left-panel-filterHvy-user { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-filterHvy-user { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-filterHvy-user { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- count: false ## Filtering Data .left-panel-filterWt2-user[ ```r *headtail(peng) ## Isolate Penguins with body masses b/w 4500 and 5000 g ``` ] .right-panel-filterWt2-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] --- count: false ## Filtering Data .left-panel-filterWt2-user[ ```r headtail(peng) ## Isolate Penguins with body masses b/w 4500 and 5000 g *peng_wt <- filterD(peng,body_mass_g>4500,body_mass_g<5000) ``` ] .right-panel-filterWt2-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 
male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] --- count: false ## Filtering Data .left-panel-filterWt2-user[ ```r headtail(peng) ## Isolate Penguins with body masses b/w 4500 and 5000 g peng_wt <- filterD(peng,body_mass_g>4500,body_mass_g<5000) *peek(peng_wt) ``` ] .right-panel-filterWt2-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.2 19.6 195 4675 male 3 Adelie Dream 39.6 18.8 190 4600 male 5 Adelie Biscoe 41.0 20.0 203 4725 male 8 Gentoo Biscoe 46.5 13.5 210 4550 female 10 Gentoo Biscoe 40.9 13.7 214 4650 female 13 Gentoo Biscoe 48.2 14.3 210 4600 female 15 Gentoo Biscoe 42.6 13.7 213 4950 female 18 Gentoo Biscoe 46.6 14.2 210 4850 female 20 Gentoo Biscoe 45.8 14.2 219 4700 female 23 Gentoo Biscoe 47.7 15.0 216 4750 female 25 Gentoo Biscoe 47.5 14.2 209 4600 female 28 Gentoo Biscoe 47.4 14.6 212 4725 female 30 Gentoo Biscoe 43.4 14.4 218 4600 female 33 Gentoo Biscoe 45.5 14.5 212 4750 female 35 Gentoo Biscoe 49.4 15.8 216 4925 male 38 Gentoo Biscoe 48.5 15.0 219 4850 female 40 Gentoo Biscoe 47.3 13.8 216 4725 <NA> 43 Gentoo Biscoe 43.5 15.2 213 4650 female 45 Gentoo Biscoe 47.2 13.7 214 4925 female 48 Chinstrap Dream 52.0 20.7 210 4800 male ``` ] <style> .left-panel-filterWt2-user { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-filterWt2-user { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } 
.middle-panel-filterWt2-user { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- class: inverse, center, middle --- class: inverse, center, middle name: RUnivEDA # Univariate EDA in R --- name: RUnivEDAQ ## Univariate EDA - Quantitative ### Summary Statistics - Computed with `Summarize()` - Uses formula of the form `~qvar` as first argument. -- - Must include data.frame name in `data=` argument. -- - Optionally set number of decimals with `digits=`. --- count: false ## Univariate EDA - Quantitative .left-panel-UnivEDAQsum-user[ ```r *headtail(peng_chin) ## Distribution of flipper lengths for Chinstrap Penguins ``` ] .right-panel-UnivEDAQsum-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] --- count: false ## Univariate EDA - Quantitative .left-panel-UnivEDAQsum-user[ ```r headtail(peng_chin) ## Distribution of flipper lengths for Chinstrap Penguins *Summarize(~flipper_length_mm,data=peng_chin,digits=1) ``` ] .right-panel-UnivEDAQsum-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` n mean sd min Q1 median Q3 max 68.0 195.8 7.1 178.0 191.0 196.0 201.0 212.0 ``` ] --- count: false ## Univariate EDA - Quantitative .left-panel-UnivEDAQsum-user[ ```r headtail(peng_chin) ## Distribution of flipper lengths for Chinstrap Penguins Summarize(~flipper_length_mm,data=peng_chin,digits=1) ## Distribution of bill lengths for Chinstrap Penguins ``` ] 
.right-panel-UnivEDAQsum-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` n mean sd min Q1 median Q3 max 68.0 195.8 7.1 178.0 191.0 196.0 201.0 212.0 ``` ] --- count: false ## Univariate EDA - Quantitative .left-panel-UnivEDAQsum-user[ ```r headtail(peng_chin) ## Distribution of flipper lengths for Chinstrap Penguins Summarize(~flipper_length_mm,data=peng_chin,digits=1) ## Distribution of bill lengths for Chinstrap Penguins *Summarize(~bill_length_mm,data=peng_chin,digits=2) ``` ] .right-panel-UnivEDAQsum-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` n mean sd min Q1 median Q3 max 68.0 195.8 7.1 178.0 191.0 196.0 201.0 212.0 ``` ``` n mean sd min Q1 median Q3 max 68.00 48.83 3.34 40.90 46.35 49.55 51.08 58.00 ``` ] <style> .left-panel-UnivEDAQsum-user { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-UnivEDAQsum-user { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-UnivEDAQsum-user { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- ## Univariate EDA - Quantitative ### Histogram - Use `ggplot()` to declare data. - Must include data.frame name in `data=` argument. -- - Set `x=` to the variable name inside `mapping=aes()`. -- - "Add on" `geom_histogram()` with - `binwidth=` to set bin width. -- - `boundary=0` to align bin edges with multiples of the bin width. 
-- - `color=` and `fill=` to set bar outline and fill colors. -- - Clean up with ... - `labs()` to label axes, -- - `scale_y_continuous(expand=expansion(mult=c(0,0.05)))` to "sit" bars on x-axis, and -- - `theme_NCStats()` to make the plot look nice. --- count: false ## Univariate EDA - Quantitative .left-panel-UnivEDAQhist-non_seq[ ```r headtail(peng_chin) ## Distribution of flipper lengths for Chinstrap Penguins ``` ] .right-panel-UnivEDAQhist-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] --- count: false ## Univariate EDA - Quantitative .left-panel-UnivEDAQhist-non_seq[ ```r headtail(peng_chin) ## Distribution of flipper lengths for Chinstrap Penguins *ggplot(data=peng_chin,mapping=aes(x=flipper_length_mm)) ``` ] .right-panel-UnivEDAQhist-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDAQhist_non_seq_2_output-1.png" width="100%" /> ] --- count: false ## Univariate EDA - Quantitative .left-panel-UnivEDAQhist-non_seq[ ```r headtail(peng_chin) ## Distribution of flipper lengths for Chinstrap Penguins ggplot(data=peng_chin,mapping=aes(x=flipper_length_mm)) + * geom_histogram( * ) ``` ] .right-panel-UnivEDAQhist-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 
Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDAQhist_non_seq_3_output-1.png" width="100%" /> ] --- count: false ## Univariate EDA - Quantitative .left-panel-UnivEDAQhist-non_seq[ ```r headtail(peng_chin) ## Distribution of flipper lengths for Chinstrap Penguins ggplot(data=peng_chin,mapping=aes(x=flipper_length_mm)) + geom_histogram( * binwidth=5,boundary=0, ) ``` ] .right-panel-UnivEDAQhist-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDAQhist_non_seq_4_output-1.png" width="100%" /> ] --- count: false ## Univariate EDA - Quantitative .left-panel-UnivEDAQhist-non_seq[ ```r headtail(peng_chin) ## Distribution of flipper lengths for Chinstrap Penguins ggplot(data=peng_chin,mapping=aes(x=flipper_length_mm)) + geom_histogram( binwidth=5,boundary=0, * color="black",fill="lightgray" ) ``` ] .right-panel-UnivEDAQhist-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDAQhist_non_seq_5_output-1.png" width="100%" /> ] --- count: false ## Univariate EDA - Quantitative .left-panel-UnivEDAQhist-non_seq[ ```r headtail(peng_chin) ## Distribution of flipper lengths for Chinstrap Penguins ggplot(data=peng_chin,mapping=aes(x=flipper_length_mm)) + geom_histogram( 
binwidth=5,boundary=0, color="black",fill="lightgray" ) + * labs(x="Flipper Length (mm)",y="Frequency of Penguins") ``` ] .right-panel-UnivEDAQhist-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDAQhist_non_seq_6_output-1.png" width="100%" /> ] --- count: false ## Univariate EDA - Quantitative .left-panel-UnivEDAQhist-non_seq[ ```r headtail(peng_chin) ## Distribution of flipper lengths for Chinstrap Penguins ggplot(data=peng_chin,mapping=aes(x=flipper_length_mm)) + geom_histogram( binwidth=5,boundary=0, color="black",fill="lightgray" ) + labs(x="Flipper Length (mm)",y="Frequency of Penguins") + * scale_y_continuous(expand=expansion(mult=c(0,0.05))) ``` ] .right-panel-UnivEDAQhist-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDAQhist_non_seq_7_output-1.png" width="100%" /> ] --- count: false ## Univariate EDA - Quantitative .left-panel-UnivEDAQhist-non_seq[ ```r headtail(peng_chin) ## Distribution of flipper lengths for Chinstrap Penguins ggplot(data=peng_chin,mapping=aes(x=flipper_length_mm)) + geom_histogram( binwidth=5,boundary=0, color="black",fill="lightgray" ) + labs(x="Flipper Length (mm)",y="Frequency of Penguins") + scale_y_continuous(expand=expansion(mult=c(0,0.05))) + * theme_NCStats() ``` ] .right-panel-UnivEDAQhist-non_seq[ ``` species island bill_length_mm 
bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDAQhist_non_seq_8_output-1.png" width="100%" /> ] <style> .left-panel-UnivEDAQhist-non_seq { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-UnivEDAQhist-non_seq { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-UnivEDAQhist-non_seq { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- count: false ## Univariate EDA - Quantitative .left-panel-UnivEDAQhist2A-rotate[ ```r headtail(peng_chin) ## Distribution of bill lengths for Chinstrap Penguins ggplot(data=peng_chin,mapping=aes(x=flipper_length_mm)) + geom_histogram(binwidth=5,boundary=0,color="black",fill="lightgray") + labs(x="Flipper Length (mm)",y="Frequency of Penguins") + scale_y_continuous(expand=expansion(mult=c(0,0.05))) + theme_NCStats() ``` ] .right-panel-UnivEDAQhist2A-rotate[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDAQhist2A_rotate_1_output-1.png" width="100%" /> ] --- count: false ## Univariate EDA - Quantitative .left-panel-UnivEDAQhist2A-rotate[ ```r headtail(peng_chin) ## Distribution of bill lengths for Chinstrap Penguins *ggplot(data=peng_chin,mapping=aes(x=bill_length_mm)) + geom_histogram(binwidth=5,boundary=0,color="black",fill="lightgray") + labs(x="Flipper Length (mm)",y="Frequency of Penguins") + 
scale_y_continuous(expand=expansion(mult=c(0,0.05))) + theme_NCStats() ``` ] .right-panel-UnivEDAQhist2A-rotate[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDAQhist2A_rotate_2_output-1.png" width="100%" /> ] <style> .left-panel-UnivEDAQhist2A-rotate { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-UnivEDAQhist2A-rotate { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-UnivEDAQhist2A-rotate { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- count: false ## Univariate EDA - Quantitative .left-panel-UnivEDAQhist2B-rotate[ ```r headtail(peng_chin) ## Distribution of bill lengths for Chinstrap Penguins ggplot(data=peng_chin,mapping=aes(x=bill_length_mm)) + geom_histogram(binwidth=5,boundary=0,color="black",fill="lightgray") + labs(x="Flipper Length (mm)",y="Frequency of Penguins") + scale_y_continuous(expand=expansion(mult=c(0,0.05))) + theme_NCStats() ``` ] .right-panel-UnivEDAQhist2B-rotate[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDAQhist2B_rotate_1_output-1.png" width="100%" /> ] --- count: false ## Univariate EDA - Quantitative .left-panel-UnivEDAQhist2B-rotate[ ```r headtail(peng_chin) ## Distribution of bill lengths for Chinstrap Penguins 
ggplot(data=peng_chin,mapping=aes(x=bill_length_mm)) + geom_histogram(binwidth=5,boundary=0,color="black",fill="lightgray") + * labs(x="Bill Length (mm)",y="Frequency of Penguins") + scale_y_continuous(expand=expansion(mult=c(0,0.05))) + theme_NCStats() ``` ] .right-panel-UnivEDAQhist2B-rotate[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDAQhist2B_rotate_2_output-1.png" width="100%" /> ] <style> .left-panel-UnivEDAQhist2B-rotate { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-UnivEDAQhist2B-rotate { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-UnivEDAQhist2B-rotate { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- count: false ## Univariate EDA - Quantitative .left-panel-UnivEDAQhist2C-rotate[ ```r headtail(peng_chin) ## Distribution of bill lengths for Chinstrap Penguins ggplot(data=peng_chin,mapping=aes(x=bill_length_mm)) + geom_histogram(binwidth=5,boundary=0,color="black",fill="lightgray") + labs(x="Bill Length (mm)",y="Frequency of Penguins") + scale_y_continuous(expand=expansion(mult=c(0,0.05))) + theme_NCStats() ``` ] .right-panel-UnivEDAQhist2C-rotate[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDAQhist2C_rotate_1_output-1.png" width="100%" /> ] --- count: false ## 
Univariate EDA - Quantitative .left-panel-UnivEDAQhist2C-rotate[ ```r headtail(peng_chin) ## Distribution of bill lengths for Chinstrap Penguins ggplot(data=peng_chin,mapping=aes(x=bill_length_mm)) + * geom_histogram(binwidth=2,boundary=0,color="black",fill="lightgray") + labs(x="Bill Length (mm)",y="Frequency of Penguins") + scale_y_continuous(expand=expansion(mult=c(0,0.05))) + theme_NCStats() ``` ] .right-panel-UnivEDAQhist2C-rotate[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDAQhist2C_rotate_2_output-1.png" width="100%" /> ] <style> .left-panel-UnivEDAQhist2C-rotate { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-UnivEDAQhist2C-rotate { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-UnivEDAQhist2C-rotate { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- name: RUnivEDAC ## Univariate EDA - Categorical ### Frequency Tables - Computed with `xtabs()` - Uses formula of the form `~cvar` as first argument. -- - Must include data.frame name in `data=` argument. -- - Assign (save) result to object. 
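
As a minimal sketch of these steps, the block below rebuilds only the `species` column from the species counts reported in these slides, so that it runs without `penguins.csv`. The names `peng_demo` and `freqSpec_demo` are made up for this illustration. Wrapping the assignment in parentheses saves the table and prints it in one step.

```r
## Hypothetical stand-in for peng: only the species column, rebuilt from
## the species counts reported in these slides (152, 68, 124)
peng_demo <- data.frame(
  species=rep(c("Adelie","Chinstrap","Gentoo"),times=c(152,68,124))
)

## Formula ~cvar first, data.frame in data=, result saved to an object;
## the outer parentheses also print the saved table
( freqSpec_demo <- xtabs(~species,data=peng_demo) )

## The saved object can be reused later, e.g., to get the total count
sum(freqSpec_demo)
```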
--- count: false ## Univariate EDA - Categorical .left-panel-UnivEDACsum-user[ ```r *headtail(peng) ## Species composition ``` ] .right-panel-UnivEDACsum-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] --- count: false ## Univariate EDA - Categorical .left-panel-UnivEDACsum-user[ ```r headtail(peng) ## Species composition *( freqSpec <- xtabs(~species,data=peng) ) ## Samplings by island ``` ] .right-panel-UnivEDACsum-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` species Adelie Chinstrap Gentoo 152 68 124 ``` ] --- count: false ## Univariate EDA - Categorical .left-panel-UnivEDACsum-user[ ```r headtail(peng) ## Species composition ( freqSpec <- xtabs(~species,data=peng) ) ## Samplings by island *( freqIsland <- xtabs(~island,data=peng) ) ## Sex distribution ``` ] .right-panel-UnivEDACsum-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` species Adelie Chinstrap Gentoo 152 68 124 ``` ``` island Biscoe Dream Torgersen 168 124 52 ``` ] --- count: false ## Univariate EDA - Categorical .left-panel-UnivEDACsum-user[ ```r headtail(peng) ## Species 
composition ( freqSpec <- xtabs(~species,data=peng) ) ## Samplings by island ( freqIsland <- xtabs(~island,data=peng) ) ## Sex distribution *( freqSex <- xtabs(~sex,data=peng) ) ``` ] .right-panel-UnivEDACsum-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` species Adelie Chinstrap Gentoo 152 68 124 ``` ``` island Biscoe Dream Torgersen 168 124 52 ``` ``` sex female male 165 168 ``` ] <style> .left-panel-UnivEDACsum-user { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-UnivEDACsum-user { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-UnivEDACsum-user { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- ## Univariate EDA - Categorical ### Percentage Tables - Computed with `percTable()` - Uses saved `xtabs()` object as first argument. 
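
`percTable()` comes from `NCStats`. For readers who want to see what it computes, the sketch below reproduces the species percentages with base R only. It assumes that `percTable()` reports each count as a percentage of the total, which agrees with the output shown in these slides. The counts are typed in by hand from the frequency table, and `freqSpec_demo` is a made-up name.

```r
## Species counts copied from the frequency table in these slides
freqSpec_demo <- as.table(c(Adelie=152,Chinstrap=68,Gentoo=124))

## Base-R analogue of a percentage table: proportions times 100
percSpec_demo <- round(100*prop.table(freqSpec_demo),digits=1)
percSpec_demo
```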
--- count: false ## Univariate EDA - Categorical .left-panel-UnivEDACsump-auto[ ```r *freqSpec ``` ] .right-panel-UnivEDACsump-auto[ ``` species Adelie Chinstrap Gentoo 152 68 124 ``` ] --- count: false ## Univariate EDA - Categorical .left-panel-UnivEDACsump-auto[ ```r freqSpec *percTable(freqSpec) ``` ] .right-panel-UnivEDACsump-auto[ ``` species Adelie Chinstrap Gentoo 152 68 124 ``` ``` species Adelie Chinstrap Gentoo 44.2 19.8 36.0 ``` ] --- count: false ## Univariate EDA - Categorical .left-panel-UnivEDACsump-auto[ ```r freqSpec percTable(freqSpec) *freqIsland ``` ] .right-panel-UnivEDACsump-auto[ ``` species Adelie Chinstrap Gentoo 152 68 124 ``` ``` species Adelie Chinstrap Gentoo 44.2 19.8 36.0 ``` ``` island Biscoe Dream Torgersen 168 124 52 ``` ] --- count: false ## Univariate EDA - Categorical .left-panel-UnivEDACsump-auto[ ```r freqSpec percTable(freqSpec) freqIsland *percTable(freqIsland) ``` ] .right-panel-UnivEDACsump-auto[ ``` species Adelie Chinstrap Gentoo 152 68 124 ``` ``` species Adelie Chinstrap Gentoo 44.2 19.8 36.0 ``` ``` island Biscoe Dream Torgersen 168 124 52 ``` ``` island Biscoe Dream Torgersen 48.8 36.0 15.1 ``` ] <style> .left-panel-UnivEDACsump-auto { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-UnivEDACsump-auto { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-UnivEDACsump-auto { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- ## Univariate EDA - Categorical ### Bar Chart - Use `ggplot()` to declare data. - Must include data.frame name in `data=` argument. - Set `x=` to the categorical variable name in `mapping=aes()`. -- - Add on `geom_bar()` with - `color=` and `fill=` to set bar outline and fill colors. -- - Clean up with ... - `labs()` to label axes, - `scale_y_continuous(expand=expansion(mult=c(0,0.05)))` to "sit" bars on x-axis, and - `theme_NCStats()` to make look nice.
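
One caveat when the bar chart recipe above is pointed at a variable with missing values: `xtabs()` drops `NA`s by default, but `geom_bar()` will typically draw an extra `NA` bar. The sketch below shows one way to avoid that, under stated assumptions. `peng_demo` is a made-up stand-in whose `sex` column is rebuilt from the counts in the frequency table (165 female, 168 male) plus the 11 of 344 penguins not counted there, which are assumed to be recorded as `NA`. Base R's `subset()` is used to drop those rows, and `theme_NCStats()` is left off so that only `ggplot2` is needed.

```r
library(ggplot2)

## Hypothetical stand-in for peng: sex only, with 11 assumed missing values
peng_demo <- data.frame(
  sex=rep(c("female","male",NA),times=c(165,168,11))
)

## Keep only penguins with a recorded sex
peng_sexed <- subset(peng_demo,!is.na(sex))

## Same bar chart recipe as above, now for sex
p_sex <- ggplot(data=peng_sexed,mapping=aes(x=sex)) +
  geom_bar(color="black",fill="lightgray") +
  labs(x="Sex",y="Frequency of Penguins") +
  scale_y_continuous(expand=expansion(mult=c(0,0.05)))
p_sex
```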
--- count: false ## Univariate EDA - Categorical .left-panel-UnivEDACbar-non_seq[ ```r headtail(peng) ## Species composition ``` ] .right-panel-UnivEDACbar-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] --- count: false ## Univariate EDA - Categorical .left-panel-UnivEDACbar-non_seq[ ```r headtail(peng) ## Species composition *ggplot(data=peng,mapping=aes(x=species)) ``` ] .right-panel-UnivEDACbar-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDACbar_non_seq_2_output-1.png" width="100%" /> ] --- count: false ## Univariate EDA - Categorical .left-panel-UnivEDACbar-non_seq[ ```r headtail(peng) ## Species composition ggplot(data=peng,mapping=aes(x=species)) + * geom_bar( * ) ``` ] .right-panel-UnivEDACbar-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDACbar_non_seq_3_output-1.png" width="100%" /> ] --- count: false ## Univariate EDA - Categorical .left-panel-UnivEDACbar-non_seq[ ```r headtail(peng) ## Species composition 
ggplot(data=peng,mapping=aes(x=species)) + geom_bar( * color="black",fill="lightgray" ) ``` ] .right-panel-UnivEDACbar-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDACbar_non_seq_4_output-1.png" width="100%" /> ] --- count: false ## Univariate EDA - Categorical .left-panel-UnivEDACbar-non_seq[ ```r headtail(peng) ## Species composition ggplot(data=peng,mapping=aes(x=species)) + geom_bar( color="black",fill="lightgray" ) + * labs(x="Species",y="Frequency of Penguins") ``` ] .right-panel-UnivEDACbar-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDACbar_non_seq_5_output-1.png" width="100%" /> ] --- count: false ## Univariate EDA - Categorical .left-panel-UnivEDACbar-non_seq[ ```r headtail(peng) ## Species composition ggplot(data=peng,mapping=aes(x=species)) + geom_bar( color="black",fill="lightgray" ) + labs(x="Species",y="Frequency of Penguins") + * scale_y_continuous(expand=expansion(mult=c(0,0.05))) ``` ] .right-panel-UnivEDACbar-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 
18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDACbar_non_seq_6_output-1.png" width="100%" /> ] --- count: false ## Univariate EDA - Categorical .left-panel-UnivEDACbar-non_seq[ ```r headtail(peng) ## Species composition ggplot(data=peng,mapping=aes(x=species)) + geom_bar( color="black",fill="lightgray" ) + labs(x="Species",y="Frequency of Penguins") + scale_y_continuous(expand=expansion(mult=c(0,0.05))) + * theme_NCStats() ``` ] .right-panel-UnivEDACbar-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDACbar_non_seq_7_output-1.png" width="100%" /> ] <style> .left-panel-UnivEDACbar-non_seq { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-UnivEDACbar-non_seq { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-UnivEDACbar-non_seq { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- count: false ## Univariate EDA - Categorical .left-panel-UnivEDACbar2A-rotate[ ```r headtail(peng) ## Sampling by Island ggplot(data=peng,mapping=aes(x=species)) + geom_bar(color="black",fill="lightgray") + labs(x="Species",y="Frequency of Penguins") + scale_y_continuous(expand=expansion(mult=c(0,0.05))) + theme_NCStats() ``` ] .right-panel-UnivEDACbar2A-rotate[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img 
src="Penguins_files/figure-html/UnivEDACbar2A_rotate_1_output-1.png" width="100%" /> ] --- count: false ## Univariate EDA - Categorical .left-panel-UnivEDACbar2A-rotate[ ```r headtail(peng) ## Sampling by Island *ggplot(data=peng,mapping=aes(x=island)) + geom_bar(color="black",fill="lightgray") + labs(x="Species",y="Frequency of Penguins") + scale_y_continuous(expand=expansion(mult=c(0,0.05))) + theme_NCStats() ``` ] .right-panel-UnivEDACbar2A-rotate[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDACbar2A_rotate_2_output-1.png" width="100%" /> ] <style> .left-panel-UnivEDACbar2A-rotate { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-UnivEDACbar2A-rotate { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-UnivEDACbar2A-rotate { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- count: false ## Univariate EDA - Categorical .left-panel-UnivEDACbar2B-rotate[ ```r headtail(peng) ## Sampling by Island ggplot(data=peng,mapping=aes(x=island)) + geom_bar(color="black",fill="lightgray") + labs(x="Species",y="Frequency of Penguins") + scale_y_continuous(expand=expansion(mult=c(0,0.05))) + theme_NCStats() ``` ] .right-panel-UnivEDACbar2B-rotate[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img 
src="Penguins_files/figure-html/UnivEDACbar2B_rotate_1_output-1.png" width="100%" /> ] --- count: false ## Univariate EDA - Categorical .left-panel-UnivEDACbar2B-rotate[ ```r headtail(peng) ## Sampling by Island ggplot(data=peng,mapping=aes(x=island)) + geom_bar(color="black",fill="lightgray") + * labs(x="Island",y="Frequency of Penguins") + scale_y_continuous(expand=expansion(mult=c(0,0.05))) + theme_NCStats() ``` ] .right-panel-UnivEDACbar2B-rotate[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDACbar2B_rotate_2_output-1.png" width="100%" /> ] <style> .left-panel-UnivEDACbar2B-rotate { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-UnivEDACbar2B-rotate { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-UnivEDACbar2B-rotate { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- name: RUnivEDAQC ## Univariate EDA - Quant by Groups ### Summary Statistics - Computed with `Summarize()` - Uses formula of the form `qvar~cvar` as first argument. -- - Must include data.frame name in `data=` argument. - Optionally set number of decimals with `digits=`. 
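
`Summarize()` is loaded with `NCStats`. As a rough base-R cross-check of the `qvar~cvar` formula idea, the sketch below uses `aggregate()`, which accepts the same kind of formula. It is not the course's method, and it uses only the six rows printed by `headtail(peng)` in these slides (stored under the made-up name `peng6`), so its means will not match a full-data summary.

```r
## Six rows copied from the headtail(peng) output (two columns only)
peng6 <- data.frame(
  species=c("Adelie","Adelie","Adelie","Chinstrap","Chinstrap","Chinstrap"),
  flipper_length_mm=c(181,186,195,193,210,198)
)

## Quantitative variable on the left of ~, grouping variable on the right
means6 <- aggregate(flipper_length_mm~species,data=peng6,FUN=mean)
means6
```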
--- count: false .left-panel-UnivEDAQsum2-user[ ```r *headtail(peng) ## Distribution of flipper lengths by species ``` ] .right-panel-UnivEDAQsum2-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] --- count: false .left-panel-UnivEDAQsum2-user[ ```r headtail(peng) ## Distribution of flipper lengths by species *Summarize(flipper_length_mm~species,data=peng,digits=1) ``` ] .right-panel-UnivEDAQsum2-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` species n nvalid mean sd min Q1 median Q3 max 1 Adelie 152 151 190.0 6.5 172 186 190 195 210 2 Chinstrap 68 68 195.8 7.1 178 191 196 201 212 3 Gentoo 124 123 217.2 6.5 203 212 216 221 231 ``` ] <style> .left-panel-UnivEDAQsum2-user { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-UnivEDAQsum2-user { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-UnivEDAQsum2-user { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- ## Univariate EDA - Quant by Groups ### Histogram - Make a histogram exactly as before. -- - Separate by groups by adding on `facet_wrap()` - Must include grouping variable name in `vars()` as first argument. 
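
The sketch below adds `facet_wrap()` to a bare-bones histogram. It uses only the six rows printed by `headtail(peng)` in these slides (stored under the made-up name `peng6`) and leaves off `theme_NCStats()` so that only `ggplot2` is needed. The `ncol=1` argument is an optional extra that these slides do not use; it stacks the panels vertically, which can make groups easier to compare along a shared x-axis.

```r
library(ggplot2)

## Six rows copied from the headtail(peng) output (two columns only)
peng6 <- data.frame(
  species=c("Adelie","Adelie","Adelie","Chinstrap","Chinstrap","Chinstrap"),
  flipper_length_mm=c(181,186,195,193,210,198)
)

## Histogram as before, then one panel per species stacked in one column
p_facet <- ggplot(data=peng6,mapping=aes(x=flipper_length_mm)) +
  geom_histogram(binwidth=5,boundary=0,color="black",fill="lightgray") +
  labs(x="Flipper Length (mm)",y="Frequency of Penguins") +
  facet_wrap(vars(species),ncol=1)
p_facet
```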
--- count: false .left-panel-UnivEDAQhist2-user[ ```r *headtail(peng) ## Distribution of flipper lengths by species ``` ] .right-panel-UnivEDAQhist2-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] --- count: false .left-panel-UnivEDAQhist2-user[ ```r headtail(peng) ## Distribution of flipper lengths by species *ggplot(data=peng,mapping=aes(x=flipper_length_mm)) + * geom_histogram(binwidth=5,boundary=0, * color="black",fill="lightgray") + * labs(x="Flipper Length (mm)", * y="Frequency of Penguins") + * scale_y_continuous(expand=expansion(mult=c(0,0.05))) + * theme_NCStats() ``` ] .right-panel-UnivEDAQhist2-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDAQhist2_user_2_output-1.png" width="100%" /> ] --- count: false .left-panel-UnivEDAQhist2-user[ ```r headtail(peng) ## Distribution of flipper lengths by species ggplot(data=peng,mapping=aes(x=flipper_length_mm)) + geom_histogram(binwidth=5,boundary=0, color="black",fill="lightgray") + labs(x="Flipper Length (mm)", y="Frequency of Penguins") + scale_y_continuous(expand=expansion(mult=c(0,0.05))) + theme_NCStats() + * facet_wrap(vars(species)) ``` ] .right-panel-UnivEDAQhist2-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie 
Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDAQhist2_user_3_output-1.png" width="100%" /> ] <style> .left-panel-UnivEDAQhist2-user { color: #777; width: 39.2156862745098%; height: 92%; float: left; font-size: 80% } .right-panel-UnivEDAQhist2-user { width: 58.8235294117647%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-UnivEDAQhist2-user { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- ## Univariate EDA - Quant by Groups ### Boxplots - Use `ggplot()` to declare data. - Must include data.frame name in `data=` argument. - Set `x=` (group) and `y=` (quantitative) to variable names in `mapping=aes()`. -- - Add on `geom_boxplot()` with - `color=` and `fill=` to set box outline and fill colors. -- - Clean up with ... - `labs()` to label axes, and - `theme_NCStats()` to make look nice. --- count: false .left-panel-UnivEDAQBoxplot-non_seq[ ```r headtail(peng) ## Distribution of flipper lengths by species ``` ] .right-panel-UnivEDAQBoxplot-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] --- count: false .left-panel-UnivEDAQBoxplot-non_seq[ ```r headtail(peng) ## Distribution of flipper lengths by species *ggplot(data=peng,mapping=aes(x=species,y=flipper_length_mm)) ``` ] .right-panel-UnivEDAQBoxplot-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775
male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDAQBoxplot_non_seq_2_output-1.png" width="100%" /> ] --- count: false .left-panel-UnivEDAQBoxplot-non_seq[ ```r headtail(peng) ## Distribution of flipper lengths by species ggplot(data=peng,mapping=aes(x=species,y=flipper_length_mm)) + * geom_boxplot(color="black",fill="lightgray") ``` ] .right-panel-UnivEDAQBoxplot-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDAQBoxplot_non_seq_3_output-1.png" width="100%" /> ] --- count: false .left-panel-UnivEDAQBoxplot-non_seq[ ```r headtail(peng) ## Distribution of flipper lengths by species ggplot(data=peng,mapping=aes(x=species,y=flipper_length_mm)) + geom_boxplot(color="black",fill="lightgray") + * labs(x="Species",y="Flipper Length (mm)") ``` ] .right-panel-UnivEDAQBoxplot-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDAQBoxplot_non_seq_4_output-1.png" width="100%" /> ] --- count: false .left-panel-UnivEDAQBoxplot-non_seq[ ```r headtail(peng) ## Distribution of flipper lengths by species ggplot(data=peng,mapping=aes(x=species,y=flipper_length_mm)) + geom_boxplot(color="black",fill="lightgray") + labs(x="Species",y="Flipper Length (mm)") + * theme_NCStats() ``` ] 
.right-panel-UnivEDAQBoxplot-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDAQBoxplot_non_seq_5_output-1.png" width="100%" /> ] <style> .left-panel-UnivEDAQBoxplot-non_seq { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-UnivEDAQBoxplot-non_seq { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-UnivEDAQBoxplot-non_seq { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- count: false .left-panel-UnivEDAQBoxplot2A-rotate[ ```r headtail(peng) ## Distribution of bill lengths by species ggplot(data=peng,mapping=aes(x=species,y=flipper_length_mm)) + geom_boxplot(color="black",fill="lightgray") + labs(x="Species",y="Flipper Length (mm)") + theme_NCStats() ``` ] .right-panel-UnivEDAQBoxplot2A-rotate[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDAQBoxplot2A_rotate_1_output-1.png" width="100%" /> ] --- count: false .left-panel-UnivEDAQBoxplot2A-rotate[ ```r headtail(peng) ## Distribution of bill lengths by species *ggplot(data=peng,mapping=aes(x=species,y=bill_length_mm)) + geom_boxplot(color="black",fill="lightgray") + labs(x="Species",y="Flipper Length (mm)") + theme_NCStats() ``` ] .right-panel-UnivEDAQBoxplot2A-rotate[ ``` species island bill_length_mm bill_depth_mm 
flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDAQBoxplot2A_rotate_2_output-1.png" width="100%" /> ] <style> .left-panel-UnivEDAQBoxplot2A-rotate { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-UnivEDAQBoxplot2A-rotate { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-UnivEDAQBoxplot2A-rotate { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- count: false .left-panel-UnivEDAQBoxplot2B-rotate[ ```r headtail(peng) ## Distribution of bill lengths by species ggplot(data=peng,mapping=aes(x=species,y=bill_length_mm)) + geom_boxplot(color="black",fill="lightgray") + labs(x="Species",y="Flipper Length (mm)") + theme_NCStats() ``` ] .right-panel-UnivEDAQBoxplot2B-rotate[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDAQBoxplot2B_rotate_1_output-1.png" width="100%" /> ] --- count: false .left-panel-UnivEDAQBoxplot2B-rotate[ ```r headtail(peng) ## Distribution of bill lengths by species ggplot(data=peng,mapping=aes(x=species,y=bill_length_mm)) + geom_boxplot(color="black",fill="lightgray") + * labs(x="Species",y="Bill Length (mm)") + theme_NCStats() ``` ] .right-panel-UnivEDAQBoxplot2B-rotate[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie 
Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/UnivEDAQBoxplot2B_rotate_2_output-1.png" width="100%" /> ] <style> .left-panel-UnivEDAQBoxplot2B-rotate { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-UnivEDAQBoxplot2B-rotate { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-UnivEDAQBoxplot2B-rotate { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- class: inverse, center, middle --- class: inverse, center, middle name: RBivEDA # Bivariate EDA in R --- name: RBivEDAQ ## Bivariate EDA - Quantitative ### Correlation Coefficient - Computed with `corr()` - Uses formula of the form `~qvar1+qvar2` as first argument. -- - Must include data.frame name in `data=` argument. -- - Include `use="pairwise.complete.obs"` to remove any missing values. 
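---

## Bivariate EDA - Quantitative

### Correlation Coefficient - Base R Sketch

The bullets above can be tried without `NCStats` installed. The sketch below is only an illustration: it swaps in base R's `cor()` for `corr()`, and the `toy` data frame (with its `flipper` and `mass` columns) holds made-up values, not the penguin data. It is meant to show what `use="pairwise.complete.obs"` does when a value is missing.

```r
## Toy data (made-up values, NOT the penguin data); one flipper value is missing
toy <- data.frame(
  flipper = c(190, 195, 200, 205, 210, NA),
  mass    = c(3500, 3600, 3800, 3900, 4100, 4000)
)

## Without use=, a single missing value makes the result NA
r_default <- cor(toy$flipper, toy$mass)
r_default

## With use="pairwise.complete.obs", pairs with a missing value are skipped
r_pairwise <- cor(toy$flipper, toy$mass, use = "pairwise.complete.obs")
round(r_pairwise, 3)

## Handing cor() several quantitative columns at once returns a correlation matrix
r_matrix <- cor(toy, use = "pairwise.complete.obs")
round(r_matrix, 3)
```

The second call uses only the five complete pairs, which is why the `use=` argument matters whenever the data contain `NA` values.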
--- count: false ## Bivariate EDA - Quantitative .left-panel-BivEDAQsum-user[ ```r *headtail(peng_chin) ## Relationship between body mass & flipper length for Chinstrap Penguins ``` ] .right-panel-BivEDAQsum-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] --- count: false ## Bivariate EDA - Quantitative .left-panel-BivEDAQsum-user[ ```r headtail(peng_chin) ## Relationship between body mass & flipper length for Chinstrap Penguins *corr(~body_mass_g+flipper_length_mm,data=peng_chin, * use="pairwise.complete.obs") ``` ] .right-panel-BivEDAQsum-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` [1] 0.6415594 ``` ] --- count: false ## Bivariate EDA - Quantitative .left-panel-BivEDAQsum-user[ ```r headtail(peng_chin) ## Relationship between body mass & flipper length for Chinstrap Penguins corr(~body_mass_g+flipper_length_mm,data=peng_chin, use="pairwise.complete.obs") ## Relationship between body mass, flipper length, and bill sizes ``` ] .right-panel-BivEDAQsum-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` [1] 0.6415594 ``` ] --- count: false ## Bivariate EDA - 
Quantitative .left-panel-BivEDAQsum-user[ ```r headtail(peng_chin) ## Relationship between body mass & flipper length for Chinstrap Penguins corr(~body_mass_g+flipper_length_mm,data=peng_chin, use="pairwise.complete.obs") ## Relationship between body mass, flipper length, and bill sizes *corr(~body_mass_g+flipper_length_mm+bill_length_mm+bill_depth_mm, * data=peng_chin,use="pairwise.complete.obs",digits=3) ``` ] .right-panel-BivEDAQsum-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` [1] 0.6415594 ``` ``` body_mass_g flipper_length_mm bill_length_mm bill_depth_mm body_mass_g 1.000 0.642 0.514 0.604 flipper_length_mm 0.642 1.000 0.472 0.580 bill_length_mm 0.514 0.472 1.000 0.654 bill_depth_mm 0.604 0.580 0.654 1.000 ``` ] <style> .left-panel-BivEDAQsum-user { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-BivEDAQsum-user { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-BivEDAQsum-user { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- ## Bivariate EDA - Quantitative ### Scatterplot - Use `ggplot()` to declare data. - Must include data.frame name in `data=` argument. - Set `x=` and `y=` to variable names in `mapping=aes()`. -- - Add on `geom_point()` with - `pch=21` to draw circles that have separate outline and fill colors. -- - `color=` and `fill=` to set point outline and fill colors. -- - Clean up with ... - `labs()` to label axes. - `theme_NCStats()` to make the plot look nice.
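---

## Bivariate EDA - Quantitative

### Scatterplot - Minimal Sketch

The recipe above can be put together in one short script. This is a sketch under two assumptions: the `toy` data frame holds made-up values (not the penguin data), and `theme_bw()` from `ggplot2` stands in for `theme_NCStats()` so that only `ggplot2` needs to be installed.

```r
library(ggplot2)

## Toy data (made-up values, NOT the penguin data)
toy <- data.frame(
  flipper = c(190, 195, 200, 205, 210),
  mass    = c(3500, 3600, 3800, 3900, 4100)
)

## Declare data and axes, add points, label the axes, then apply a theme
p <- ggplot(data = toy, mapping = aes(x = flipper, y = mass)) +
  geom_point(pch = 21, color = "black", fill = "lightgray") +
  labs(x = "Flipper Length (mm)", y = "Body Mass (g)") +
  theme_bw()   # stand-in for theme_NCStats(), which requires NCStats

## Printing the saved object draws the plot
p
```

Saving the plot to an object (here `p`) is optional; the slides simply run the `ggplot()` chain directly.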
--- count: false ## Bivariate EDA - Quantitative .left-panel-BivEDAQPlot-non_seq[ ```r headtail(peng_chin) ## Relationship between body mass and flipper length for Chinstrap Penguins ``` ] .right-panel-BivEDAQPlot-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] --- count: false ## Bivariate EDA - Quantitative .left-panel-BivEDAQPlot-non_seq[ ```r headtail(peng_chin) ## Relationship between body mass and flipper length for Chinstrap Penguins *ggplot(data=peng_chin,mapping=aes(x=flipper_length_mm,y=body_mass_g)) ``` ] .right-panel-BivEDAQPlot-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/BivEDAQPlot_non_seq_2_output-1.png" width="100%" /> ] --- count: false ## Bivariate EDA - Quantitative .left-panel-BivEDAQPlot-non_seq[ ```r headtail(peng_chin) ## Relationship between body mass and flipper length for Chinstrap Penguins ggplot(data=peng_chin,mapping=aes(x=flipper_length_mm,y=body_mass_g)) + * geom_point( * ) ``` ] .right-panel-BivEDAQPlot-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img 
src="Penguins_files/figure-html/BivEDAQPlot_non_seq_3_output-1.png" width="100%" /> ] --- count: false ## Bivariate EDA - Quantitative .left-panel-BivEDAQPlot-non_seq[ ```r headtail(peng_chin) ## Relationship between body mass and flipper length for Chinstrap Penguins ggplot(data=peng_chin,mapping=aes(x=flipper_length_mm,y=body_mass_g)) + geom_point( * pch=21, ) ``` ] .right-panel-BivEDAQPlot-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/BivEDAQPlot_non_seq_4_output-1.png" width="100%" /> ] --- count: false ## Bivariate EDA - Quantitative .left-panel-BivEDAQPlot-non_seq[ ```r headtail(peng_chin) ## Relationship between body mass and flipper length for Chinstrap Penguins ggplot(data=peng_chin,mapping=aes(x=flipper_length_mm,y=body_mass_g)) + geom_point( pch=21, * color="black",fill="lightgray" ) ``` ] .right-panel-BivEDAQPlot-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/BivEDAQPlot_non_seq_5_output-1.png" width="100%" /> ] --- count: false ## Bivariate EDA - Quantitative .left-panel-BivEDAQPlot-non_seq[ ```r headtail(peng_chin) ## Relationship between body mass and flipper length for Chinstrap Penguins ggplot(data=peng_chin,mapping=aes(x=flipper_length_mm,y=body_mass_g)) + geom_point( pch=21, color="black",fill="lightgray" ) + * labs(x="Flipper Length (mm)",y="Body Mass (g)") ``` ] 
.right-panel-BivEDAQPlot-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/BivEDAQPlot_non_seq_6_output-1.png" width="100%" /> ] --- count: false ## Bivariate EDA - Quantitative .left-panel-BivEDAQPlot-non_seq[ ```r headtail(peng_chin) ## Relationship between body mass and flipper length for Chinstrap Penguins ggplot(data=peng_chin,mapping=aes(x=flipper_length_mm,y=body_mass_g)) + geom_point( pch=21, color="black",fill="lightgray" ) + labs(x="Flipper Length (mm)",y="Body Mass (g)") + * theme_NCStats() ``` ] .right-panel-BivEDAQPlot-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/BivEDAQPlot_non_seq_7_output-1.png" width="100%" /> ] <style> .left-panel-BivEDAQPlot-non_seq { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-BivEDAQPlot-non_seq { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-BivEDAQPlot-non_seq { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- count: false ## Bivariate EDA - Quantitative .left-panel-BivEDAQPlot2A-rotate[ ```r headtail(peng) ## Relationship between bill depth and flipper length ggplot(data=peng_chin,mapping=aes(x=flipper_length_mm,y=body_mass_g)) + geom_point(pch=21,color="black",fill="lightgray") + labs(x="Flipper Length (mm)",y="Body Mass (g)") + 
theme_NCStats() ``` ] .right-panel-BivEDAQPlot2A-rotate[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/BivEDAQPlot2A_rotate_1_output-1.png" width="100%" /> ] --- count: false ## Bivariate EDA - Quantitative .left-panel-BivEDAQPlot2A-rotate[ ```r headtail(peng) ## Relationship between bill depth and flipper length *ggplot(data=peng,mapping=aes(x=flipper_length_mm,y=bill_depth_mm)) + geom_point(pch=21,color="black",fill="lightgray") + labs(x="Flipper Length (mm)",y="Body Mass (g)") + theme_NCStats() ``` ] .right-panel-BivEDAQPlot2A-rotate[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/BivEDAQPlot2A_rotate_2_output-1.png" width="100%" /> ] <style> .left-panel-BivEDAQPlot2A-rotate { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-BivEDAQPlot2A-rotate { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-BivEDAQPlot2A-rotate { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- count: false ## Bivariate EDA - Quantitative .left-panel-BivEDAQPlot2B-rotate[ ```r headtail(peng) ## Relationship between bill depth and flipper length ggplot(data=peng,mapping=aes(x=flipper_length_mm,y=bill_depth_mm)) + geom_point(pch=21,color="black",fill="lightgray") + labs(x="Flipper Length (mm)",y="Body Mass (g)") 
+ theme_NCStats() ``` ] .right-panel-BivEDAQPlot2B-rotate[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/BivEDAQPlot2B_rotate_1_output-1.png" width="100%" /> ] --- count: false ## Bivariate EDA - Quantitative .left-panel-BivEDAQPlot2B-rotate[ ```r headtail(peng) ## Relationship between bill depth and flipper length ggplot(data=peng,mapping=aes(x=flipper_length_mm,y=bill_depth_mm)) + geom_point(pch=21,color="black",fill="lightgray") + * labs(x="Flipper Length (mm)",y="Bill Depth (mm)") + theme_NCStats() ``` ] .right-panel-BivEDAQPlot2B-rotate[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/BivEDAQPlot2B_rotate_2_output-1.png" width="100%" /> ] <style> .left-panel-BivEDAQPlot2B-rotate { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-BivEDAQPlot2B-rotate { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-BivEDAQPlot2B-rotate { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- name: RBivEDAC ## Bivariate EDA - Categorical ### Frequency Tables - Computed with `xtabs()` - Uses formula of the form `~cvarRow+cvarCol` as first argument. -- - Must include data.frame name in `data=` argument. - Assign (save) result to object. -- - Put saved `xtabs()` object in `addmargins()` to show totals. 
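---

## Bivariate EDA - Categorical

### Frequency Tables - Minimal Sketch

Both `xtabs()` and `addmargins()` come with base R, so the steps above can be practiced on any small data frame. The `toy` data frame below is made up for illustration (six individuals, not the penguin data); the row variable is listed first in the formula and the column variable second.

```r
## Toy data (made-up values, NOT the penguin data)
toy <- data.frame(
  island  = c("Biscoe", "Biscoe", "Dream", "Dream", "Dream", "Torgersen"),
  species = c("Adelie", "Gentoo", "Adelie", "Chinstrap", "Chinstrap", "Adelie")
)

## Two-way frequency table: islands in rows, species in columns
freq <- xtabs(~island + species, data = toy)
freq

## Same table with row, column, and grand totals appended
freq_tot <- addmargins(freq)
freq_tot
```

Wrapping the assignment in parentheses, as the slides do with `( freqIS <- xtabs(...) )`, saves and prints the table in one step.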
--- count: false .left-panel-BivEDACsum-user[ ```r *headtail(peng) ## Species composition by island ``` ] .right-panel-BivEDACsum-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] --- count: false .left-panel-BivEDACsum-user[ ```r headtail(peng) ## Species composition by island *( freqIS <- xtabs(~island+species,data=peng) ) ``` ] .right-panel-BivEDACsum-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` species island Adelie Chinstrap Gentoo Biscoe 44 0 124 Dream 56 68 0 Torgersen 52 0 0 ``` ] --- count: false .left-panel-BivEDACsum-user[ ```r headtail(peng) ## Species composition by island ( freqIS <- xtabs(~island+species,data=peng) ) *addmargins(freqIS) ``` ] .right-panel-BivEDACsum-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` species island Adelie Chinstrap Gentoo Biscoe 44 0 124 Dream 56 68 0 Torgersen 52 0 0 ``` ``` species island Adelie Chinstrap Gentoo Sum Biscoe 44 0 124 168 Dream 56 68 0 124 Torgersen 52 0 0 52 Sum 152 68 124 344 ``` ] --- count: false .left-panel-BivEDACsum-user[ ```r headtail(peng) ## Species composition by 
island ( freqIS <- xtabs(~island+species,data=peng) ) addmargins(freqIS) ## Sex distribution by species *freqSS <- xtabs(~species+sex,data=peng) *addmargins(freqSS) ``` ] .right-panel-BivEDACsum-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` species island Adelie Chinstrap Gentoo Biscoe 44 0 124 Dream 56 68 0 Torgersen 52 0 0 ``` ``` species island Adelie Chinstrap Gentoo Sum Biscoe 44 0 124 168 Dream 56 68 0 124 Torgersen 52 0 0 52 Sum 152 68 124 344 ``` ``` sex species female male Sum Adelie 73 73 146 Chinstrap 34 34 68 Gentoo 58 61 119 Sum 165 168 333 ``` ] <style> .left-panel-BivEDACsum-user { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-BivEDACsum-user { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-BivEDACsum-user { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- ## Bivariate EDA - Categorical ### Percentage Tables - Computed with `percTable()` - Uses saved `xtabs()` object as first argument. -- - Default is to make total percentages. -- - Use `margin=1` to make row percentages. - Use `margin=2` to make column percentages. 
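---

## Bivariate EDA - Categorical

### Percentage Tables - Base R Sketch

To see what the `margin=` choice changes, the sketch below computes the three kinds of percentages with base R's `prop.table()` instead of `percTable()`. That substitution is an assumption made so the code runs without `NCStats`; unlike the `percTable()` output in the slides, `prop.table()` does not append `Sum` margins. The `toy` data are made up.

```r
## Toy data (made-up values, NOT the penguin data)
toy <- data.frame(
  island  = c("Biscoe", "Biscoe", "Dream", "Dream", "Dream", "Torgersen"),
  species = c("Adelie", "Gentoo", "Adelie", "Chinstrap", "Chinstrap", "Adelie")
)
freq <- xtabs(~island + species, data = toy)

## Total percentages: each cell divided by the grand total
tot_pct <- 100 * prop.table(freq)
round(tot_pct, 1)

## Row percentages (margin=1): each cell divided by its row total
row_pct <- 100 * prop.table(freq, margin = 1)
round(row_pct, 1)

## Column percentages (margin=2): each cell divided by its column total
col_pct <- 100 * prop.table(freq, margin = 2)
round(col_pct, 1)
```

A quick check on which table was made: row percentages sum to 100 across each row, column percentages sum to 100 down each column, and total percentages sum to 100 over the whole table.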
--- count: false .left-panel-BivEDACsump-user[ ```r ## Frequency table *addmargins(freqIS) ``` ] .right-panel-BivEDACsump-user[ ``` species island Adelie Chinstrap Gentoo Sum Biscoe 44 0 124 168 Dream 56 68 0 124 Torgersen 52 0 0 52 Sum 152 68 124 344 ``` ] --- count: false .left-panel-BivEDACsump-user[ ```r ## Frequency table addmargins(freqIS) ## Total percentages table *percTable(freqIS) ``` ] .right-panel-BivEDACsump-user[ ``` species island Adelie Chinstrap Gentoo Sum Biscoe 44 0 124 168 Dream 56 68 0 124 Torgersen 52 0 0 52 Sum 152 68 124 344 ``` ``` species island Adelie Chinstrap Gentoo Sum Biscoe 12.8 0.0 36.0 48.8 Dream 16.3 19.8 0.0 36.1 Torgersen 15.1 0.0 0.0 15.1 Sum 44.2 19.8 36.0 100.0 ``` ] --- count: false .left-panel-BivEDACsump-user[ ```r ## Frequency table addmargins(freqIS) ## Total percentages table percTable(freqIS) ## Row percentages table *percTable(freqIS,margin=1) ``` ] .right-panel-BivEDACsump-user[ ``` species island Adelie Chinstrap Gentoo Sum Biscoe 44 0 124 168 Dream 56 68 0 124 Torgersen 52 0 0 52 Sum 152 68 124 344 ``` ``` species island Adelie Chinstrap Gentoo Sum Biscoe 12.8 0.0 36.0 48.8 Dream 16.3 19.8 0.0 36.1 Torgersen 15.1 0.0 0.0 15.1 Sum 44.2 19.8 36.0 100.0 ``` ``` species island Adelie Chinstrap Gentoo Sum Biscoe 26.2 0.0 73.8 100.0 Dream 45.2 54.8 0.0 100.0 Torgersen 100.0 0.0 0.0 100.0 ``` ] --- count: false .left-panel-BivEDACsump-user[ ```r ## Frequency table addmargins(freqIS) ## Total percentages table percTable(freqIS) ## Row percentages table percTable(freqIS,margin=1) ## Column percentages table *percTable(freqIS,margin=2) ``` ] .right-panel-BivEDACsump-user[ ``` species island Adelie Chinstrap Gentoo Sum Biscoe 44 0 124 168 Dream 56 68 0 124 Torgersen 52 0 0 52 Sum 152 68 124 344 ``` ``` species island Adelie Chinstrap Gentoo Sum Biscoe 12.8 0.0 36.0 48.8 Dream 16.3 19.8 0.0 36.1 Torgersen 15.1 0.0 0.0 15.1 Sum 44.2 19.8 36.0 100.0 ``` ``` species island Adelie Chinstrap Gentoo Sum Biscoe 26.2 0.0 73.8 100.0 
Dream 45.2 54.8 0.0 100.0 Torgersen 100.0 0.0 0.0 100.0 ``` ``` species island Adelie Chinstrap Gentoo Biscoe 28.9 0.0 100.0 Dream 36.8 100.0 0.0 Torgersen 34.2 0.0 0.0 Sum 99.9 100.0 100.0 ``` ] <style> .left-panel-BivEDACsump-user { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-BivEDACsump-user { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-BivEDACsump-user { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- class: inverse, center, middle --- class: inverse, center, middle name: RRegression # Linear Regression in R --- ## Fitting Linear Regression - Computed with `lm()` - Uses formula of the form `qvarResp~qvarExplan` as first argument. -- - Must include data.frame name in `data=` argument. - Assign (save) result to object. -- - Use `rSquared()` to compute r<sup>2</sup> value. --- count: false ## Fitting Linear Regression .left-panel-RegressionSum-user[ ```r *headtail(peng_chin) ## Can body mass be predicted from flipper length for Chinstraps ``` ] .right-panel-RegressionSum-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] --- count: false ## Fitting Linear Regression .left-panel-RegressionSum-user[ ```r headtail(peng_chin) ## Can body mass be predicted from flipper length for Chinstraps *( slr <- lm(body_mass_g~flipper_length_mm,data=peng_chin) ) ``` ] .right-panel-RegressionSum-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 
50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` Coefficients: (Intercept) flipper_length_mm -3037.20 34.57 ``` ] --- count: false ## Fitting Linear Regression .left-panel-RegressionSum-user[ ```r headtail(peng_chin) ## Can body mass be predicted from flipper length for Chinstraps ( slr <- lm(body_mass_g~flipper_length_mm,data=peng_chin) ) *rSquared(slr) ``` ] .right-panel-RegressionSum-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` Coefficients: (Intercept) flipper_length_mm -3037.20 34.57 ``` ``` [1] 0.4115985 ``` ] <style> .left-panel-RegressionSum-user { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-RegressionSum-user { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-RegressionSum-user { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- ## Showing Best-Fit Line - Make a scatterplot as usual -- - Add line with `geom_smooth(method="lm",se=FALSE)` --- count: false ## Showing Best-Fit Line .left-panel-RegressionPlot-non_seq[ ```r headtail(peng_chin) ## Can body mass be predicted from flipper length for Chinstraps ggplot(data=peng_chin,mapping=aes(x=flipper_length_mm,y=body_mass_g)) + geom_point(pch=21,color="black",fill="lightgray") + labs(x="Flipper Length (mm)",y="Body Mass (g)") + theme_NCStats() ``` ] .right-panel-RegressionPlot-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 
4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/RegressionPlot_non_seq_1_output-1.png" width="100%" /> ] --- count: false ## Showing Best-Fit Line .left-panel-RegressionPlot-non_seq[ ```r headtail(peng_chin) ## Can body mass be predicted from flipper length for Chinstraps ggplot(data=peng_chin,mapping=aes(x=flipper_length_mm,y=body_mass_g)) + geom_point(pch=21,color="black",fill="lightgray") + labs(x="Flipper Length (mm)",y="Body Mass (g)") + theme_NCStats() + * geom_smooth( * ) ``` ] .right-panel-RegressionPlot-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/RegressionPlot_non_seq_2_output-1.png" width="100%" /> ] --- count: false ## Showing Best-Fit Line .left-panel-RegressionPlot-non_seq[ ```r headtail(peng_chin) ## Can body mass be predicted from flipper length for Chinstraps ggplot(data=peng_chin,mapping=aes(x=flipper_length_mm,y=body_mass_g)) + geom_point(pch=21,color="black",fill="lightgray") + labs(x="Flipper Length (mm)",y="Body Mass (g)") + theme_NCStats() + geom_smooth( * method="lm", ) ``` ] .right-panel-RegressionPlot-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/RegressionPlot_non_seq_3_output-1.png" width="100%" /> ] --- count: false ## Showing Best-Fit Line .left-panel-RegressionPlot-non_seq[ ```r headtail(peng_chin) ## Can 
body mass be predicted from flipper length for Chinstraps ggplot(data=peng_chin,mapping=aes(x=flipper_length_mm,y=body_mass_g)) + geom_point(pch=21,color="black",fill="lightgray") + labs(x="Flipper Length (mm)",y="Body Mass (g)") + theme_NCStats() + geom_smooth( method="lm", * se=FALSE ) ``` ] .right-panel-RegressionPlot-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` <img src="Penguins_files/figure-html/RegressionPlot_non_seq_4_output-1.png" width="100%" /> ] <style> .left-panel-RegressionPlot-non_seq { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-RegressionPlot-non_seq { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-RegressionPlot-non_seq { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- ## Make a Prediction - Use `predict()` - Saved `lm()` object is the first argument -- - Second argument is a data frame created with `data.frame()`, including the **EXACT** name of the explanatory variable set equal to the value for which to make the prediction --- count: false ## Make a Prediction .left-panel-RegressionPredict-user[ ```r *slr ## Predict body weight if flipper length is 195 mm ``` ] .right-panel-RegressionPredict-user[ ``` Coefficients: (Intercept) flipper_length_mm -3037.20 34.57 ``` ] --- count: false ## Make a Prediction .left-panel-RegressionPredict-user[ ```r slr ## Predict body weight if flipper length is 195 mm *predict(slr,data.frame(flipper_length_mm=195)) ``` ] .right-panel-RegressionPredict-user[ ``` Coefficients: (Intercept) flipper_length_mm -3037.20 34.57 ``` ``` 1 3704.616 ``` ] <style> .left-panel-RegressionPredict-user { color: 
#777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-RegressionPredict-user { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-RegressionPredict-user { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- class: inverse, center, middle --- class: inverse, center, middle name: Rttests # t-Tests in R --- name: Rttests1 ## 1-Sample t-Test - Computed with `t.test()` -- - Use formula of form `~qvar` as first argument. -- - Must include data.frame name in `data=` argument. -- - Assign value in H<sub>0</sub> to `mu=`. -- - Choose H<sub>A</sub> direction in `alt=` with `"two.sided"` for "not equals", `"less"` for "less than", or `"greater"` for "greater than". -- - Set confidence level (1-α) in `conf.level=` --- count: false ## 1-Sample t-Test .left-panel-ttest1-non_seq[ ```r headtail(peng_chin) ## Is mean flipper length of Chinstrap Penguins ## greater than 190 mm. Use alpha=0.01 ``` ] .right-panel-ttest1-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] --- count: false ## 1-Sample t-Test .left-panel-ttest1-non_seq[ ```r headtail(peng_chin) ## Is mean flipper length of Chinstrap Penguins ## greater than 190 mm. 
Use alpha=0.01 *t.test(~flipper_length_mm,data=peng_chin * ) ``` ] .right-panel-ttest1-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` One Sample t-test with flipper_length_mm t = 226.4198, df = 67, p-value < 2.2e-16 alternative hypothesis: true mean is not equal to 0 95 percent confidence interval: 194.0972 197.5498 sample estimates: mean of x 195.8235 ``` ] --- count: false ## 1-Sample t-Test .left-panel-ttest1-non_seq[ ```r headtail(peng_chin) ## Is mean flipper length of Chinstrap Penguins ## greater than 190 mm. Use alpha=0.01 t.test(~flipper_length_mm,data=peng_chin * ,mu=190 ) ``` ] .right-panel-ttest1-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` One Sample t-test with flipper_length_mm t = 6.7334, df = 67, p-value = 4.531e-09 alternative hypothesis: true mean is not equal to 190 95 percent confidence interval: 194.0972 197.5498 sample estimates: mean of x 195.8235 ``` ] --- count: false ## 1-Sample t-Test .left-panel-ttest1-non_seq[ ```r headtail(peng_chin) ## Is mean flipper length of Chinstrap Penguins ## greater than 190 mm. 
Use alpha=0.01 t.test(~flipper_length_mm,data=peng_chin ,mu=190 * ,alt="greater" ) ``` ] .right-panel-ttest1-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` One Sample t-test with flipper_length_mm t = 6.7334, df = 67, p-value = 2.265e-09 alternative hypothesis: true mean is greater than 190 95 percent confidence interval: 194.381 Inf sample estimates: mean of x 195.8235 ``` ] --- count: false ## 1-Sample t-Test .left-panel-ttest1-non_seq[ ```r headtail(peng_chin) ## Is mean flipper length of Chinstrap Penguins ## greater than 190 mm. Use alpha=0.01 t.test(~flipper_length_mm,data=peng_chin ,mu=190 ,alt="greater" * ,conf.level=0.99 ) ``` ] .right-panel-ttest1-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Chinstrap Dream 46.5 17.9 192 3500 female 2 Chinstrap Dream 50.0 19.5 196 3900 male 3 Chinstrap Dream 51.3 19.2 193 3650 male 66 Chinstrap Dream 49.6 18.2 193 3775 male 67 Chinstrap Dream 50.8 19.0 210 4100 male 68 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` One Sample t-test with flipper_length_mm t = 6.7334, df = 67, p-value = 2.265e-09 alternative hypothesis: true mean is greater than 190 99 percent confidence interval: 193.7623 Inf sample estimates: mean of x 195.8235 ``` ] <style> .left-panel-ttest1-non_seq { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-ttest1-non_seq { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-ttest1-non_seq { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- name: Rttests2 ## Levene's Test - Computed with `levenesTest()` -- - Use formula of form `qvar~cvar` as first
argument. -- - Must include data.frame name in `data=` argument. --- count: false ## Levene's Test .left-panel-LevenesTest-user[ ```r *headtail(peng_chingen) ## Is mean flipper length smaller for Chinstrap ## than Gentoo Penguins. Use alpha=0.10. ``` ] .right-panel-LevenesTest-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Gentoo Biscoe 46.1 13.2 211 4500 female 2 Gentoo Biscoe 50.0 16.3 230 5700 male 3 Gentoo Biscoe 48.7 14.1 210 4450 female 190 Chinstrap Dream 49.6 18.2 193 3775 male 191 Chinstrap Dream 50.8 19.0 210 4100 male 192 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] --- count: false ## Levene's Test .left-panel-LevenesTest-user[ ```r headtail(peng_chingen) ## Is mean flipper length smaller for Chinstrap ## than Gentoo Penguins. Use alpha=0.10. *levenesTest(flipper_length_mm~species,data=peng_chingen) ``` ] .right-panel-LevenesTest-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Gentoo Biscoe 46.1 13.2 211 4500 female 2 Gentoo Biscoe 50.0 16.3 230 5700 male 3 Gentoo Biscoe 48.7 14.1 210 4450 female 190 Chinstrap Dream 49.6 18.2 193 3775 male 191 Chinstrap Dream 50.8 19.0 210 4100 male 192 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` Levene's Test for Homogeneity of Variance (center = median) Df F value Pr(>F) group 1 0.2187 0.6406 189 ``` ] <style> .left-panel-LevenesTest-user { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-LevenesTest-user { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-LevenesTest-user { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- ## 2-Sample t-Test - Computed with `t.test()` -- - Use formula of form `qvar~cvar` as first argument. - Must include data.frame name in `data=` argument. -- - Set `var.equal=TRUE` if Levene's Test indicates that variances are equal.
-- - Assign value in H<sub>0</sub> to `mu=` (usually 0, which is the **default**). -- - Choose H<sub>A</sub> direction in `alt=` with `"two.sided"` for "not equals", `"less"` for "less than", or `"greater"` for "greater than". -- - Set confidence level (1-α) in `conf.level=` --- count: false ## 2-Sample t-Test .left-panel-ttest2-non_seq[ ```r headtail(peng_chingen) ## Is mean flipper length smaller for Chinstrap ## than Gentoo Penguins. Use alpha=0.10. ``` ] .right-panel-ttest2-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Gentoo Biscoe 46.1 13.2 211 4500 female 2 Gentoo Biscoe 50.0 16.3 230 5700 male 3 Gentoo Biscoe 48.7 14.1 210 4450 female 190 Chinstrap Dream 49.6 18.2 193 3775 male 191 Chinstrap Dream 50.8 19.0 210 4100 male 192 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] --- count: false ## 2-Sample t-Test .left-panel-ttest2-non_seq[ ```r headtail(peng_chingen) ## Is mean flipper length smaller for Chinstrap ## than Gentoo Penguins. Use alpha=0.10. *t.test(flipper_length_mm~species,data=peng_chingen * ) ``` ] .right-panel-ttest2-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Gentoo Biscoe 46.1 13.2 211 4500 female 2 Gentoo Biscoe 50.0 16.3 230 5700 male 3 Gentoo Biscoe 48.7 14.1 210 4450 female 190 Chinstrap Dream 49.6 18.2 193 3775 male 191 Chinstrap Dream 50.8 19.0 210 4100 male 192 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` Welch Two Sample t-test with flipper_length_mm by species t = -20.4633, df = 127.608, p-value < 2.2e-16 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -23.42923 -19.29770 sample estimates: mean in group Chinstrap mean in group Gentoo 195.8235 217.1870 ``` ] --- count: false ## 2-Sample t-Test .left-panel-ttest2-non_seq[ ```r headtail(peng_chingen) ## Is mean flipper length smaller for Chinstrap ## than Gentoo Penguins. Use alpha=0.10. 
t.test(flipper_length_mm~species,data=peng_chingen * ,var.equal=TRUE ) ``` ] .right-panel-ttest2-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Gentoo Biscoe 46.1 13.2 211 4500 female 2 Gentoo Biscoe 50.0 16.3 230 5700 male 3 Gentoo Biscoe 48.7 14.1 210 4450 female 190 Chinstrap Dream 49.6 18.2 193 3775 male 191 Chinstrap Dream 50.8 19.0 210 4100 male 192 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` Two Sample t-test with flipper_length_mm by species t = -21.0329, df = 189, p-value < 2.2e-16 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -23.36706 -19.35987 sample estimates: mean in group Chinstrap mean in group Gentoo 195.8235 217.1870 ``` ] --- count: false ## 2-Sample t-Test .left-panel-ttest2-non_seq[ ```r headtail(peng_chingen) ## Is mean flipper length smaller for Chinstrap ## than Gentoo Penguins. Use alpha=0.10. t.test(flipper_length_mm~species,data=peng_chingen ,var.equal=TRUE * ,alt="less" ) ``` ] .right-panel-ttest2-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Gentoo Biscoe 46.1 13.2 211 4500 female 2 Gentoo Biscoe 50.0 16.3 230 5700 male 3 Gentoo Biscoe 48.7 14.1 210 4450 female 190 Chinstrap Dream 49.6 18.2 193 3775 male 191 Chinstrap Dream 50.8 19.0 210 4100 male 192 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` Two Sample t-test with flipper_length_mm by species t = -21.0329, df = 189, p-value < 2.2e-16 alternative hypothesis: true difference in means is less than 0 95 percent confidence interval: -Inf -19.68453 sample estimates: mean in group Chinstrap mean in group Gentoo 195.8235 217.1870 ``` ] --- count: false ## 2-Sample t-Test .left-panel-ttest2-non_seq[ ```r headtail(peng_chingen) ## Is mean flipper length smaller for Chinstrap ## than Gentoo Penguins. Use alpha=0.10. 
t.test(flipper_length_mm~species,data=peng_chingen ,var.equal=TRUE ,alt="less" * ,conf.level=0.90 ) ``` ] .right-panel-ttest2-non_seq[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Gentoo Biscoe 46.1 13.2 211 4500 female 2 Gentoo Biscoe 50.0 16.3 230 5700 male 3 Gentoo Biscoe 48.7 14.1 210 4450 female 190 Chinstrap Dream 49.6 18.2 193 3775 male 191 Chinstrap Dream 50.8 19.0 210 4100 male 192 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` Two Sample t-test with flipper_length_mm by species t = -21.0329, df = 189, p-value < 2.2e-16 alternative hypothesis: true difference in means is less than 0 90 percent confidence interval: -Inf -20.05721 sample estimates: mean in group Chinstrap mean in group Gentoo 195.8235 217.1870 ``` ] <style> .left-panel-ttest2-non_seq { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-ttest2-non_seq { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-ttest2-non_seq { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- class: inverse, center, middle --- class: inverse, center, middle name: RChi # Chi-square Tests in R --- name: RChiChi ## Chi-square Test - Must have a two-way observed frequency table - Grouping variable should be in rows. - Response variable should be in columns. -- - **Raw data** ... use `xtabs()` (*described in reading and here*). -- - **Summarized data** ... use `matrix()` (*described in reading*). -- - Compute chi-square results with `chisq.test()`. - Object with observed frequency table as first argument. -- - Use `correct=FALSE` to "turn-off" continuity correction. -- - Assign (save) result to an object. -- - Obtain expected table by appending `$expected` to object name. -- - Construct a row-percentages table to examine differences. --- count: false ## Chi-square Test .left-panel-chi-user[ ```r *headtail(peng) ## Does sex ratio differ among species of Penguins. 
## Use alpha=0.01 ``` ] .right-panel-chi-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] --- count: false ## Chi-square Test .left-panel-chi-user[ ```r headtail(peng) ## Does sex ratio differ among species of Penguins. ## Use alpha=0.01 *( peng_sex <- xtabs(~species+sex,data=peng) ) ``` ] .right-panel-chi-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` sex species female male Adelie 73 73 Chinstrap 34 34 Gentoo 58 61 ``` ] --- count: false ## Chi-square Test .left-panel-chi-user[ ```r headtail(peng) ## Does sex ratio differ among species of Penguins. ## Use alpha=0.01 ( peng_sex <- xtabs(~species+sex,data=peng) ) *( chi_ps <- chisq.test(peng_sex,correct=FALSE) ) ``` ] .right-panel-chi-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` sex species female male Adelie 73 73 Chinstrap 34 34 Gentoo 58 61 ``` ``` Pearson's Chi-squared test with peng_sex X-squared = 0.0486, df = 2, p-value = 0.976 ``` ] --- count: false ## Chi-square Test .left-panel-chi-user[ ```r headtail(peng) ## Does sex ratio differ among species of Penguins. 
## Use alpha=0.01 ( peng_sex <- xtabs(~species+sex,data=peng) ) ( chi_ps <- chisq.test(peng_sex,correct=FALSE) ) *chi_ps$expected ``` ] .right-panel-chi-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` sex species female male Adelie 73 73 Chinstrap 34 34 Gentoo 58 61 ``` ``` Pearson's Chi-squared test with peng_sex X-squared = 0.0486, df = 2, p-value = 0.976 ``` ``` sex species female male Adelie 72.34234 73.65766 Chinstrap 33.69369 34.30631 Gentoo 58.96396 60.03604 ``` ] --- count: false ## Chi-square Test .left-panel-chi-user[ ```r headtail(peng) ## Does sex ratio differ among species of Penguins. ## Use alpha=0.01 ( peng_sex <- xtabs(~species+sex,data=peng) ) ( chi_ps <- chisq.test(peng_sex,correct=FALSE) ) chi_ps$expected *percTable(peng_sex,margin=1) ``` ] .right-panel-chi-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` sex species female male Adelie 73 73 Chinstrap 34 34 Gentoo 58 61 ``` ``` Pearson's Chi-squared test with peng_sex X-squared = 0.0486, df = 2, p-value = 0.976 ``` ``` sex species female male Adelie 72.34234 73.65766 Chinstrap 33.69369 34.30631 Gentoo 58.96396 60.03604 ``` ``` sex species female male Sum Adelie 50.0 50.0 100.0 Chinstrap 50.0 50.0 100.0 Gentoo 48.7 51.3 100.0 ``` ] <style> .left-panel-chi-user { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-chi-user { width: 
49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-chi-user { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- name: RChiGOF ## Goodness-of-Fit Test - Make an expected table with `c()` with names. - Can use any values that are proportional to the expected proportions. -- - Create an observed frequency table. - **Raw Data** ... use `xtabs()` (*demonstrated here and in reading*). - **Summarized Data** ... use `c()` with names (*demonstrated in reading*). -- - Compute GOF results with `chisq.test()` - Observed table is first argument. -- - Expected table object in `p=` argument. -- - Use `rescale.p=TRUE` to ensure proper expected values. -- - Use `correct=FALSE` to "turn-off" continuity correction. -- - Assign (save) result to an object. -- - Obtain expected table by appending `$expected` to object name. -- - Use `gofCI()` to see how expected and observed proportions compare. --- count: false ## Goodness-of-Fit Test .left-panel-gof-user[ ```r *headtail(peng) ## Do samples from islands follow a 3:2:1 ratio ## for Biscoe, Dream, and Torgersen. Use alpha=0.01. ``` ] .right-panel-gof-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ] --- count: false ## Goodness-of-Fit Test .left-panel-gof-user[ ```r headtail(peng) ## Do samples from islands follow a 3:2:1 ratio ## for Biscoe, Dream, and Torgersen. Use alpha=0.01. 
*( exp <- c("Biscoe"=3,"Dream"=2,"Torgersen"=1) ) ``` ] .right-panel-gof-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` Biscoe Dream Torgersen 3 2 1 ``` ] --- count: false ## Goodness-of-Fit Test .left-panel-gof-user[ ```r headtail(peng) ## Do samples from islands follow a 3:2:1 ratio ## for Biscoe, Dream, and Torgersen. Use alpha=0.01. ( exp <- c("Biscoe"=3,"Dream"=2,"Torgersen"=1) ) *( peng_isle <- xtabs(~island,data=peng) ) ``` ] .right-panel-gof-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` Biscoe Dream Torgersen 3 2 1 ``` ``` island Biscoe Dream Torgersen 168 124 52 ``` ] --- count: false ## Goodness-of-Fit Test .left-panel-gof-user[ ```r headtail(peng) ## Do samples from islands follow a 3:2:1 ratio ## for Biscoe, Dream, and Torgersen. Use alpha=0.01. 
( exp <- c("Biscoe"=3,"Dream"=2,"Torgersen"=1) ) ( peng_isle <- xtabs(~island,data=peng) ) *( gof_pi <- chisq.test(peng_isle,p=exp,rescale.p=TRUE,correct=FALSE) ) ``` ] .right-panel-gof-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` Biscoe Dream Torgersen 3 2 1 ``` ``` island Biscoe Dream Torgersen 168 124 52 ``` ``` Chi-squared test for given probabilities with peng_isle X-squared = 1.3488, df = 2, p-value = 0.5095 ``` ] --- count: false ## Goodness-of-Fit Test .left-panel-gof-user[ ```r headtail(peng) ## Do samples from islands follow a 3:2:1 ratio ## for Biscoe, Dream, and Torgersen. Use alpha=0.01. ( exp <- c("Biscoe"=3,"Dream"=2,"Torgersen"=1) ) ( peng_isle <- xtabs(~island,data=peng) ) ( gof_pi <- chisq.test(peng_isle,p=exp,rescale.p=TRUE,correct=FALSE) ) *gof_pi$expected ``` ] .right-panel-gof-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` Biscoe Dream Torgersen 3 2 1 ``` ``` island Biscoe Dream Torgersen 168 124 52 ``` ``` Chi-squared test for given probabilities with peng_isle X-squared = 1.3488, df = 2, p-value = 0.5095 ``` ``` Biscoe Dream Torgersen 172.00000 114.66667 57.33333 ``` ] --- count: false ## Goodness-of-Fit Test .left-panel-gof-user[ ```r headtail(peng) ## Do samples from islands follow a 3:2:1 ratio ## for Biscoe, Dream, and Torgersen. Use alpha=0.01. 
( exp <- c("Biscoe"=3,"Dream"=2,"Torgersen"=1) ) ( peng_isle <- xtabs(~island,data=peng) ) ( gof_pi <- chisq.test(peng_isle,p=exp,rescale.p=TRUE,correct=FALSE) ) gof_pi$expected *gofCI(gof_pi,digits=3) ``` ] .right-panel-gof-user[ ``` species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g sex 1 Adelie Torgersen 39.1 18.7 181 3750 male 2 Adelie Torgersen 39.5 17.4 186 3800 female 3 Adelie Torgersen 40.3 18.0 195 3250 female 342 Chinstrap Dream 49.6 18.2 193 3775 male 343 Chinstrap Dream 50.8 19.0 210 4100 male 344 Chinstrap Dream 50.2 18.7 198 3775 female ``` ``` Biscoe Dream Torgersen 3 2 1 ``` ``` island Biscoe Dream Torgersen 168 124 52 ``` ``` Chi-squared test for given probabilities with peng_isle X-squared = 1.3488, df = 2, p-value = 0.5095 ``` ``` Biscoe Dream Torgersen 172.00000 114.66667 57.33333 ``` ``` p.obs p.LCI p.UCI p.exp Biscoe 0.488 0.436 0.541 0.500 Dream 0.360 0.312 0.412 0.333 Torgersen 0.151 0.117 0.193 0.167 ``` ] <style> .left-panel-gof-user { color: #777; width: 49.0196078431373%; height: 92%; float: left; font-size: 80% } .right-panel-gof-user { width: 49.0196078431373%; float: right; padding-left: 1%; font-size: 80% } .middle-panel-gof-user { width: 0%; float: left; padding-left: 1%; font-size: 80% } </style> --- class: inverse, center, middle
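---

## Goodness-of-Fit Test by Hand

A minimal sketch, using base R only, of what `chisq.test()` appears to compute for the goodness-of-fit example above. The observed counts and the 3:2:1 ratio are copied from the slides. The object names (`obs`, `ratio`, `p_exp`, `expected`, `chi_sq`, `p_value`, `gof`) are illustrative assumptions and are not used elsewhere in the course.

```r
# Observed counts by island, copied from the xtabs() output on the slides.
obs <- c(Biscoe = 168, Dream = 124, Torgersen = 52)

# Hypothesized 3:2:1 ratio; only the relative sizes matter.
ratio <- c(Biscoe = 3, Dream = 2, Torgersen = 1)

# What rescale.p=TRUE does: turn the ratio into proportions that sum to 1.
p_exp <- ratio / sum(ratio)

# Expected counts under the null hypothesis.
expected <- sum(obs) * p_exp

# Chi-square statistic, degrees of freedom, and upper-tail p-value.
chi_sq  <- sum((obs - expected)^2 / expected)
df      <- length(obs) - 1
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

# Same test with base R's chisq.test() for comparison.
gof <- chisq.test(obs, p = ratio, rescale.p = TRUE)

round(c(by_hand = chi_sq, chisq_test = unname(gof$statistic)), 4)
round(c(by_hand = p_value, chisq_test = gof$p.value), 4)
```

Both routes should agree with the slide output (X-squared of about 1.3488 with 2 df and a p-value of about 0.5095). Without `NCStats` loaded, the printed layout of `gof` differs from the slides, but the values are the same.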