# Popularity in Middle School

- α=0.05.
- H_{A}: “The distribution of students into the goals categories differs among grades” and H_{0}: “The distribution of students into the goals categories DOES NOT differ among grades”.
- Chi-square test because (i) the response variable (goals category) is categorical and (ii) two or more groups or populations (grades) were sampled.
- This is an observational study because the students were not allocated to grades. The students were not obviously randomly selected (and were likely part of a voluntary response survey).
- All cells in the expected frequency table (see `chi1$expected` results below) were greater than 5; the assumptions for a chi-square test have been met.
- The observed frequency table is shown in the `obs` results below.
- χ^{2}=1.3121 with 4 df.
- p-value=0.8593.
- Do not reject H_{0} because the p-value>α.
- It appears that the distribution of students into the goals categories does not differ by grade. Analysis of the row percentage table (results of `percTable()` below) suggests that grades are the primary goal for students in all grades.
- Not necessary with a chi-square test.
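As a worked check of where the expected counts and the 4 df come from, here is a sketch using the usual row-total times column-total rule. The symbols $R_i$ (grade total), $C_j$ (goals-category total), and $n$ (grand total) are notation introduced here, not taken from the original key; the totals 119, 247, and 478 are sums of the observed frequencies for grade 4, the Grades column, and the whole table.

$$
E_{ij} = \frac{R_i\,C_j}{n}, \qquad
E_{4,\text{Grades}} = \frac{119 \times 247}{478} \approx 61.49, \qquad
df = (3-1)(3-1) = 4
$$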

#### R Code and Results

```
> d <- read.csv("https://github.com/droglenc/NCData/raw/master/PopularKids.csv")
> ( obs <- xtabs(~grade+goals,data=d) )
```

```
     goals
grade Grades Popular Sports
    4     63      31     25
    5     88      55     33
    6     96      55     32
```

```
> ( chi1 <- chisq.test(obs,correct=FALSE) )
```

```
Pearson's Chi-squared test with obs
X-squared = 1.3121, df = 4, p-value = 0.8593
```

```
> chi1$expected
```

```
     goals
grade   Grades  Popular   Sports
    4 61.49163 35.10251 22.40586
    5 90.94561 51.91632 33.13808
    6 94.56276 53.98117 34.45607
```
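The expected counts, test statistic, and p-value reported above can be reproduced by hand in base R. The following is a minimal sketch, not the author's code: the counts are typed in from the observed table rather than read from the CSV, and the names `obs_m`, `expd`, `chi_sq`, and `p_val` are illustrative.

```r
# Observed counts copied from the observed table (rows = grade, columns = goals)
obs_m <- matrix(c(63, 31, 25,
                  88, 55, 33,
                  96, 55, 32),
                nrow = 3, byrow = TRUE,
                dimnames = list(grade = c("4", "5", "6"),
                                goals = c("Grades", "Popular", "Sports")))

# Expected count for each cell = (row total * column total) / grand total
expd <- outer(rowSums(obs_m), colSums(obs_m)) / sum(obs_m)

# Pearson chi-square statistic, degrees of freedom, and upper-tail p-value
chi_sq <- sum((obs_m - expd)^2 / expd)
df <- (nrow(obs_m) - 1) * (ncol(obs_m) - 1)
p_val <- pchisq(chi_sq, df = df, lower.tail = FALSE)

round(c(chi_sq = chi_sq, df = df, p_val = p_val), 4)
```

The rounded values should agree with the `chisq.test()` results, and checking `all(expd > 5)` is one way to confirm the expected-count assumption.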

```
> percTable(obs,margin=1,digits=1)
```

```
     goals
grade Grades Popular Sports   Sum
    4   52.9    26.1   21.0 100.0
    5   50.0    31.2   18.8 100.0
    6   52.5    30.1   17.5 100.1
```
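`percTable()` is not a base R function (it presumably comes from a course package), so readers without it may want a base R equivalent. The sketch below assumes the same observed counts typed in by hand and uses `prop.table()`; the names `obs_m` and `row_pct` are illustrative.

```r
# Observed counts copied from the observed table (rows = grade, columns = goals)
obs_m <- matrix(c(63, 31, 25,
                  88, 55, 33,
                  96, 55, 32),
                nrow = 3, byrow = TRUE,
                dimnames = list(grade = c("4", "5", "6"),
                                goals = c("Grades", "Popular", "Sports")))

# Row percentages: each cell divided by its grade (row) total, times 100
row_pct <- prop.table(obs_m, margin = 1) * 100

# Append the row sums and round for display
round(cbind(row_pct, Sum = rowSums(row_pct)), 1)
```

Because the sum here is taken before rounding, every row totals 100.0, whereas the 100.1 shown for grade 6 above reflects rounding of the individual percentages.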

# Bear Habitat Use

- α=0.10.
- H_{0}: “The distribution of bear observations into the habitats is in the same proportions as available habitat” versus H_{A}: “The distribution of bear observations into the habitats is NOT in the same proportions as available habitat”.
- A goodness-of-fit test is required because a single categorical variable (habitat use) from a single group or population (these bears) was measured and the proportions are being compared to a theoretical distribution in the null hypothesis.
- This is an observational study (the bears were fit with collars but just to observe where they went) with randomly selected times for observation.
- The expected number of observations in each habitat is in proportion to the GIS analysis of habitat availability. Thus, the expected number in each habitat is shown in the `gof$expected` results below. Note that these percentages did not need to be adjusted to the number of observations because 100 observations were made. The test statistic computed below should reasonably follow a chi-square distribution because all of the expected values are greater than five.
- The table of observed frequencies is shown in the `obs` results below.
- The χ^{2} test statistic is 7.7478 with 4 df.
- p-value=0.1013.
- H_{0} is not rejected because the p-value>α.
- The bears appear to use the habitats in proportion to the availability of each habitat.
- Not completed with a goodness-of-fit test.
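The test statistic above can be checked by hand from the observed and expected counts. This is a worked sketch of the standard goodness-of-fit sum, with each term rounded to four decimals; the df follow from five habitat categories.

$$
\chi^{2} = \frac{(47-34)^2}{34} + \frac{(12-17)^2}{17} + \frac{(10-12)^2}{12} + \frac{(21-25)^2}{25} + \frac{(10-12)^2}{12}
\approx 4.9706 + 1.4706 + 0.3333 + 0.6400 + 0.3333 = 7.7478,
\qquad df = 5-1 = 4
$$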

#### R Code and Results

```
> obs <- c("LowConif"=47,"Aspen"=12,"Open"=10,"UpHard"=21,"MixUp"=10)
> pexp <- c("LowConif"=34,"Aspen"=17,"Open"=12,"UpHard"=25,"MixUp"=12)
> ( gof <- chisq.test(obs,p=pexp,rescale.p=TRUE,correct=FALSE) )
```

```
Chi-squared test for given probabilities with obs
X-squared = 7.7478, df = 4, p-value = 0.1013
```

```
> gof$expected
```

```
LowConif    Aspen     Open   UpHard    MixUp
      34       17       12       25       12
```
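To make the rescaling and the decision step explicit, the goodness-of-fit test can also be sketched by hand in base R. This is an illustration under the same inputs, not the author's code; the names `obs_v`, `prop_exp`, `expd`, `chi_sq`, `p_val`, and `reject_h0` are introduced here.

```r
# Observed bear locations and habitat availability (percent), copied from the code above
obs_v <- c(LowConif = 47, Aspen = 12, Open = 10, UpHard = 21, MixUp = 10)
pexp  <- c(LowConif = 34, Aspen = 17, Open = 12, UpHard = 25, MixUp = 12)

# Rescale the availability percentages to proportions, then to expected counts
prop_exp <- pexp / sum(pexp)
expd <- sum(obs_v) * prop_exp

# Goodness-of-fit statistic with (number of categories - 1) df
chi_sq <- sum((obs_v - expd)^2 / expd)
df <- length(obs_v) - 1
p_val <- pchisq(chi_sq, df = df, lower.tail = FALSE)

# Decision at the stated significance level
alpha <- 0.10
reject_h0 <- p_val < alpha

round(c(chi_sq = chi_sq, df = df, p_val = p_val), 4)
```

Because exactly 100 observations were made, `expd` equals the availability percentages themselves; with a different sample size the same code would scale them accordingly.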