Background
Households were asked if they would accept an offer to put solar panels on the roof of their house if they would receive a 50% subsidy from the state. Demographic variables for each household such as income, size, monthly mortgage payment, and age of the head of household were also recorded. The analysis below will examine the willingness of the respondents to accept the offer relative to their annual income (which was recorded in $1000s; i.e., 80 means $80000). The data were recorded in SolarOffer.csv and are loaded below.
> spo <- read.csv("https://raw.githubusercontent.com/droglenc/NCData/master/SolarOffer.csv")
> str(spo)
'data.frame': 30 obs. of 5 variables:
$ income : int 80 60 35 45 29 43 34 104 102 59 ...
$ age : int 30 34 25 27 23 28 24 43 46 36 ...
$ takeoffer: Factor w/ 2 levels "decline","take": 2 2 1 1 1 1 1 2 2 1 ...
$ mortgage : int 2000 2100 1500 1800 1900 1600 1500 2400 2700 2600 ...
$ famsize : int 4 3 2 4 2 3 1 5 3 2 ...
> xlbl <- "Family Income (1000s)"
> ylbl <- "Response to Offer"
The logistic regression model is fit and visualized below. The model was used to answer specific questions further below.
> glm2 <- glm(takeoffer~income,data=spo,family=binomial)
> fitPlot(glm2,xlab=xlbl,ylab=ylbl,breaks=seq(25,135,5))
Figure 1: Fitted line plot for the logistic regression of willingness to accept the solar panel offer on family income.
- Comment on the fit of the logistic regression model.
- The logistic regression fits fairly well as there is a fairly distinct break in whether respondents will accept the offer near a family income of approximately $60000 (Figure 1). However, the model does seem to over predict their willingness to accept the offer just below this break and under predict just above it.
- Describe the relationship between the probability of accepting the offer and family income.
- The probability of accepting the offer is near 0 to about a family income of $45000, where it rises sharply until about $80000, after which the probability of accepting the offer is nearly 1.
- Is there a significant relationship between willingness to accept the offer and family income?
- There is a significant relationship between the log(odds) of accepting the offer and family income (p=0.0414)
> summary(glm2)$coefficients
Estimate Std. Error z value Pr(>|z|)
(Intercept) -12.8450260 6.16398385 -2.083884 0.03717074
income 0.1993402 0.09774425 2.039406 0.04140952
- I am 95% confident that as family income increases by $1000 that the log(odds) of accepting the offer increase by 0.199, with a 95% confidence interval from 0.077 to 0.479.
> (cfs <- cbind(Ests=coef(glm2),confint(glm2)) )
Ests 2.5 % 97.5 %
(Intercept) -12.8450260 -30.54585338 -5.057650
income 0.1993402 0.07744564 0.479132
- I am 95% confident that as family income increases by $1000 that the odds of accepting the offer are 1.221, with a 95% confidence interval from 1.081 to 1.615, TIMES greater.
> exp(cfs)
Ests 2.5 % 97.5 %
(Intercept) 2.639223e-06 5.421312e-14 0.006360486
income 1.220597e+00 1.080523e+00 1.614672303
- The probability that a resident with a family income of $80000 will accept the offer is 0.957, with a 95% confidence interval from 0.814 to 1.000.
> predict(glm2,data.frame(income=80),type="response")
1
0.9569831
> predProb <- function(x,alpha,beta) exp(alpha+beta*x)/(1+exp(alpha+beta*x))
> p80 <- predProb(80,cfs[[1]],cfs[[2]])
> bc2 <- bootCase(glm2) # bootstrapping, be patient!
> p80bc <- predProb(80,bc2[,1],bc2[,2])
> quantile(p80bc,c(0.025,0.975),na.rm=TRUE)
2.5% 97.5%
0.813897 1.000000
- The odds that a family with an income of $80000 will accept the offer is 22.2. Thus, residents with a family income of $80000 are 22.2 times more likely to accept the offer than to not accept the offer.
> p80/(1-p80)
[1] 22.24665
- The offer will be accepted by 25% of the residents when their family income is $58926, with a 95% confidence interval from $53694 to $68477.
> predX <- function(p,alpha,beta) (log(p/(1-p))-alpha)/beta
> x25 <- predX(0.25,bc2[,1],bc2[,2])
> quantile(x25,c(0.025,0.975))
2.5% 97.5%
53.69441 68.47666