Popularity in Middle School
- α=0.05.
- HA: “The distribution of students into the goals categories differs among grades” and H0: “The distribution of students into the goals categories DOES NOT differ among grades”.
- Chi-square test because (i) the response variable (goals category) is categorical and (ii) two or more groups or populations (grades) were sampled.
- This is an observational study because the students were not allocated to grades. There is no indication that the students were randomly selected (they were likely part of a voluntary response survey).
- All cells in the expected frequency table (see chi1$expected results below) were greater than 5; the assumptions for a chi-square test have been met.
- The observed frequency table is shown in the obs results below.
- χ²=1.3121 with 4 df.
- p-value=0.8593.
- Do not reject H0 because the p-value>α.
- It appears that the distribution of students into the goals categories does not differ by grade. Analysis of the row percentage table (results of percTable() below) suggests that grades are the primary goal for students in all grades.
- Not necessary with a chi-square test.
R Code and Results
> d <- read.csv("https://github.com/droglenc/NCData/raw/master/PopularKids.csv")
> ( obs <- xtabs(~grade+goals,data=d) )
     goals
grade Grades Popular Sports
    4     63      31     25
    5     88      55     33
    6     96      55     32
> ( chi1 <- chisq.test(obs,correct=FALSE) )
Pearson's Chi-squared test with obs
X-squared = 1.3121, df = 4, p-value = 0.8593
> chi1$expected
     goals
grade   Grades  Popular   Sports
    4 61.49163 35.10251 22.40586
    5 90.94561 51.91632 33.13808
    6 94.56276 53.98117 34.45607
> percTable(obs,margin=1,digits=1)
     goals
grade Grades Popular Sports   Sum
    4   52.9    26.1   21.0 100.0
    5   50.0    31.2   18.8 100.0
    6   52.5    30.1   17.5 100.1
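The expected counts, test statistic, and p-value reported above can be checked by hand in base R. What follows is a minimal sketch, not the code used to produce the results above: it re-enters the observed counts from the obs table as a matrix, and the names obs_m, exp_m, chi_stat, dof, p_val, and row_pct are introduced here only for illustration. It also uses prop.table() as an assumed base-R stand-in for percTable().

```r
# Observed counts copied from the obs table above (rows = grade, columns = goals)
obs_m <- matrix(c(63, 31, 25,
                  88, 55, 33,
                  96, 55, 32),
                nrow = 3, byrow = TRUE,
                dimnames = list(grade = c("4", "5", "6"),
                                goals = c("Grades", "Popular", "Sports")))

# Expected count for each cell = (row total * column total) / grand total
exp_m <- outer(rowSums(obs_m), colSums(obs_m)) / sum(obs_m)

# Pearson chi-square statistic, degrees of freedom, and upper-tail p-value
chi_stat <- sum((obs_m - exp_m)^2 / exp_m)
dof      <- (nrow(obs_m) - 1) * (ncol(obs_m) - 1)
p_val    <- pchisq(chi_stat, df = dof, lower.tail = FALSE)

# Row percentages (each grade sums to 100)
row_pct <- 100 * prop.table(obs_m, margin = 1)

round(exp_m, 5)
round(c(statistic = chi_stat, df = dof, p.value = p_val), 4)
round(row_pct, 1)
```

If the counts were entered correctly, these values should agree with the chi1 and chi1$expected results above, apart from rounding.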
Bear Habitat Use
- α=0.10.
- H0: “The distribution of bear observations into the habitats is in the same proportions as available habitat” versus HA: “The distribution of bear observations into the habitats is NOT in the same proportions as available habitat”.
- A goodness-of-fit test is required because a single categorical variable (habitat use) from a single group or population (these bears) was measured and the proportions are being compared to a theoretical distribution in the null hypothesis.
- This is an observational study (the bears were fit with collars but just to observe where they went) with randomly selected times for observation.
- The expected number of observations in each habitat is in proportion to the GIS analysis of habitat availability; these expected values are shown in the gof$expected results below. Note that the percentages did not need to be adjusted to the number of observations because 100 observations were made. The test statistic computed below should reasonably follow a chi-square distribution because all of the expected values are greater than five.
- The table of observed frequencies is shown in the obs results below.
- The χ² test statistic is 7.7478 with 4 df.
- p-value=0.1013.
- The H0 is not rejected because the p-value>α.
- The bears appear to use the habitats in proportion to the availability of each habitat.
- Not completed with a goodness-of-fit test.
R Code and Results
> obs <- c("LowConif"=47,"Aspen"=12,"Open"=10,"UpHard"=21,"MixUp"=10)
> pexp <- c("LowConif"=34,"Aspen"=17,"Open"=12,"UpHard"=25,"MixUp"=12)
> ( gof <- chisq.test(obs,p=pexp,rescale.p=TRUE,correct=FALSE) )
Chi-squared test for given probabilities with obs
X-squared = 7.7478, df = 4, p-value = 0.1013
> gof$expected
LowConif    Aspen     Open   UpHard    MixUp
      34       17       12       25       12
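The goodness-of-fit calculation can be checked by hand in the same way. This is a sketch under the same assumption as the analysis above, namely that the habitat availability percentages serve as the theoretical proportions. The names ending in _chk, along with gof_stat, gof_df, and gof_p, are illustrative and not part of the original code.

```r
# Observed bear locations and percent of available habitat (from the code above)
obs_chk  <- c(LowConif = 47, Aspen = 12, Open = 10, UpHard = 21, MixUp = 10)
pexp_chk <- c(LowConif = 34, Aspen = 17, Open = 12, UpHard = 25, MixUp = 12)

# Rescale the percentages to proportions, then convert to expected counts
prop_chk <- pexp_chk / sum(pexp_chk)
exp_chk  <- sum(obs_chk) * prop_chk

# Each habitat's contribution to the chi-square statistic
contrib_chk <- (obs_chk - exp_chk)^2 / exp_chk

# Test statistic, df (number of categories - 1), and upper-tail p-value
gof_stat <- sum(contrib_chk)
gof_df   <- length(obs_chk) - 1
gof_p    <- pchisq(gof_stat, df = gof_df, lower.tail = FALSE)

round(contrib_chk, 4)
round(c(statistic = gof_stat, df = gof_df, p.value = gof_p), 4)
```

The statistic, df, and p-value should match the gof results above. The per-habitat contributions are an optional extra that shows which habitats account for most of the statistic.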