Background
Kaggle hosts a dataset of avocado prices and sales from many areas around the United States from 2015 to 2018. The data are here (with metadata under “columns” at this site).1
Make sure (as always) to load tidyverse.
library(tidyverse)
Their data are loaded below2 and, for this exercise, reduced to only results for the “GreatLakes” region (i.e., filter()), and type and year are converted to groupings (i.e., factor() within mutate()).3
#!# Set to your own working directory and have just your filename below.
avoc <- read.csv("https://raw.githubusercontent.com/droglenc/NCData/master/Avocados.csv",
                 stringsAsFactors=FALSE,quote="") %>%
  filter(region=="GreatLakes") %>%
  mutate(type=factor(type),
         year=factor(year))
str(avoc)
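To see what the filter() and factor() steps do before running them on the full file, here is a minimal sketch on a small made-up data frame. The object names toy and toy_gl and all of the values are hypothetical and are not taken from the avocado data; only the column names region, type, and year mirror the code above.

```r
library(dplyr)  # loaded as part of tidyverse; provides filter(), mutate(), and %>%

# Hypothetical toy data (NOT the real avocado data)
toy <- data.frame(
  region = c("GreatLakes", "GreatLakes", "West", "GreatLakes"),
  type   = c("conventional", "organic", "organic", "conventional"),
  year   = c(2015, 2015, 2016, 2016),
  stringsAsFactors = FALSE
)

toy_gl <- toy %>%
  filter(region == "GreatLakes") %>%  # keep only the GreatLakes rows
  mutate(type = factor(type),         # character -> factor (a grouping)
         year = factor(year))         # numeric -> factor (a grouping)

str(toy_gl)          # type and year are now listed as Factor
levels(toy_gl$year)  # the distinct years, stored as labels
```

Converting year to a factor matters for plotting: ggplot2 treats a numeric year as a continuous scale but treats a factor as separate groups.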
Example
Construct ggplot2 code to match the graph below (as closely as you can).
This CSV file is freely available from Kaggle at this page. However, you must sign in to download the data file. I did this and made it available from my website at the link above.↩︎
These data were read directly from the webpage. However, the data can be downloaded to your computer and loaded from there into R, which would be similar to how you would load your own data into R. How to load a CSV file into RStudio is described in this video, for which the password is “NCStats” (without the quotes).↩︎
This code can be copied as is, but make sure to set your working directory with setwd() and to put just the filename inside read.csv().↩︎
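The local-file workflow described in the notes above can be sketched as follows. This is only an illustration under stated assumptions: a tiny hypothetical file named Avocados_toy.csv is written to a temporary folder to stand in for a file you downloaded, and data_dir stands in for your own working directory.

```r
# Stand-in for a downloaded file: write a tiny hypothetical CSV to a temp folder
data_dir <- tempdir()
write.csv(data.frame(region = c("GreatLakes", "West"), year = c(2015, 2016)),
          file.path(data_dir, "Avocados_toy.csv"), row.names = FALSE)

old_wd <- setwd(data_dir)            # setwd() returns the previous directory
toy <- read.csv("Avocados_toy.csv")  # just the filename, no path
setwd(old_wd)                        # restore the original working directory

str(toy)
```

With your own data you would skip the write.csv() step, point setwd() at the folder that holds your file, and put your own filename inside read.csv().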