Note:
  • Tables (and figures) should be labeled as described in the homework format description. Table labels go ABOVE the table and figure labels go BELOW the figure. Tables (and figures) should be referred to in your answers. See the key below for a model of this.
  • Use complete sentences to answer questions.
  • Use an R appendix to show the code you used to produce results. Do not include R code in any of your other answers.
  • Keep “many” decimals in intermediate calculations … i.e., don’t round until the final answer.
  • Note the explanations in the key below.
  • In question 2 below make sure to use the word “mean” as the hypotheses are testing whether the “mean exploration”, not “the amount of exploration”, differs between the two groups of crickets. Hypotheses are about summaries, not individual values. Also make sure to say what is different … say “mean exploration differs between subordinate and dominant crickets” not “the group means differ.” Don’t just say “the null hypothesis is rejected” … explain what that means about mean exploration.
  • Always demonstrate your answers. As one example, don’t just say that \(F=t^{2}\) … show it with the values from the tables.
  • There are three possible MS values – MSTotal, MSWithin, and MSAmong. Don’t use “residuals MS” (e.g., last question) even though that is how R labels it.
  • Don’t say that something is equal if it clearly is not. If your algebra in Question 7 does not equal MSWithin, then do not say that it does. If you know that it is supposed to equal MSWithin and you can’t get your algebra straight then SEEK HELP FROM ME.

Cricket Behavior

  1. The p-values for the two-sample t-test (\(p=0.0000054\); Table 1), from the ANOVA table (\(p=0.0000054\); Table 2), and for the slope coefficient (\(p=0.0000054\); Table 3) are all the same. These p-values are all equivalent because the 2-sample t-test null hypothesis of equal means (or difference in means equals zero) is the same as the null hypothesis for the slope (see below about the slope representing the difference in means) which is the same as the null hypothesis for the ANOVA table (i.e., simple model of one mean representing both groups). The alternative hypotheses are likewise the same across the 2-sample t-test, the slope, and the ANOVA table (i.e., full model of a separate mean for each group).
  2. With these p-values, very strong evidence to reject the null hypothesis exists. Thus, mean relative exploration appears to differ between subordinate and dominant crickets.
  3. The mean of the first group (Dominant) in the 2-sample t-test (2.394; Table 1) is the same as the intercept coefficient from the linear model (2.394; Table 3). This occurs because an intercept is defined as the “value of \(Y\) when \(X\)=0, on average”. In this case, \(Y\) is the exploration index and \(X\)=0 represents “dominant” crickets because dominant is coded with a zero in lm() (because the levels are coded alphabetically). Thus, the intercept is the mean exploration (\(Y\)) for dominant crickets (\(X=0\)).
  4. The difference in the means (i.e., 1.304-2.394 = -1.090; Table 1) is the same as the slope coefficient in the linear model (i.e., -1.090; Table 3). This is because the slope coefficient shows the change in \(Y\) for a one unit change in \(X\). As noted above, dominant crickets is coded with a 0 in lm() as it is alphabetically first in the group names. Subordinate crickets is thus coded as a 1 in lm() as it is alphabetically second. Thus a one unit change in \(X\) is simply a move from dominant to subordinate crickets (i.e., from 0 to 1). Thus, the slope is the difference in mean exploration between the dominant and subordinate crickets (i.e., change in \(Y\)).
  5. The df from the two-sample t-test (78; Table 1) and the within-group df from the ANOVA table (78; Table 2) are identical. The within-group df are equal to the total number of individuals (\(n=n_{1}+n_{2}\)) minus the number of groups (\(I=2\)), which is the same as for the 2-sample t-test (i.e., \(n_{1}+n_{2}-2\)).
  6. The F test statistic (23.860; Table 2) is equal to the square of the t test statistic (4.8847\(^2\)=23.860; Table 1). This relationship occurs when the numerator df for the ANOVA is equal to one (i.e., there are only two groups).
  7. The SE for the difference in means is equal to \(\frac{\bar{x}_{1}-\bar{x}_{2}}{t}\) = \(\frac{2.3944-1.3042}{4.8847}\) = 0.2232. The pooled variance (\(s_{p}^{2}\)) is then equal to this value squared and divided by the sum of the reciprocals of the sample sizes – i.e., \(\frac{0.2232^{2}}{\frac{1}{41}+\frac{1}{39}}\) = 0.9957.
  8. The \(s_{p}^{2}\) computed in the previous question is the same (within rounding) as \(MS_{within}\) (Table 2).
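
The algebra behind answers 6 through 8 can be sketched as follows, assuming the equal-variance (pooled) form of the 2-sample t-test, which is what var.equal=TRUE requests in the R appendix. Here \(n_{1}=41\) and \(n_{2}=39\) are the sample sizes used in answer 7, and \(MS_{among}=SS_{among}\) because the among-group df is 1 when there are only two groups.

\[
t=\frac{\bar{x}_{1}-\bar{x}_{2}}{SE},
\qquad
SE=\sqrt{s_{p}^{2}\left(\frac{1}{n_{1}}+\frac{1}{n_{2}}\right)}
\quad\Longrightarrow\quad
s_{p}^{2}=\frac{SE^{2}}{\frac{1}{n_{1}}+\frac{1}{n_{2}}}
\]

\[
F=\frac{MS_{among}}{MS_{within}}
=\frac{\left(\bar{x}_{1}-\bar{x}_{2}\right)^{2}\Big/\left(\frac{1}{n_{1}}+\frac{1}{n_{2}}\right)}{s_{p}^{2}}
=\left(\frac{\bar{x}_{1}-\bar{x}_{2}}{SE}\right)^{2}
=t^{2}
\]

As a check on the numerator, \(\frac{1.0902^{2}}{\frac{1}{41}+\frac{1}{39}}\) = 23.756, which matches the among-group Sum Sq (and Mean Sq) in Table 2.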

Table 1: Results from 2-sample t-test of exploration index by dominant or subordinate cricket.

t = 4.8847, df = 78, p-value = 5.407e-06
95 percent confidence interval:
 0.6458698 1.5345385 
sample estimates:
   mean in group Dominant mean in group Subordinate 
                 2.394417                  1.304213 

Table 2: Analysis of variance table for exploration index by dominant or subordinate cricket

          Df Sum Sq Mean Sq F value    Pr(>F)
group      1 23.756 23.7560   23.86 5.407e-06
Residuals 78 77.660  0.9956                  

Table 3: Coefficient results from the one-way ANOVA for exploration index by dominant or subordinate cricket.

                 Estimate Std. Error t value Pr(>|t|)
(Intercept)        2.3944     0.1558  15.365  < 2e-16
groupSubordinate  -1.0902     0.2232  -4.885 5.41e-06
---
Residual standard error: 0.9978 on 78 degrees of freedom
Multiple R-squared: 0.2342, Adjusted R-squared: 0.2244 
F-statistic: 23.86 on 1 and 78 DF,  p-value: 5.407e-06 

R Appendix.

d <- read.csv("Crickets.csv")
t.test(explore~group,data=d,var.equal=TRUE)
lm1 <- lm(explore~group,data=d)
anova(lm1)
summary(lm1)
coef(lm1)
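
The relationships described in answers 4 through 8 can also be checked by hand without the data file. The lines below are only a sketch: every value is typed in from Tables 1 and 2 (and the sample sizes from answer 7), and the object names (xbar_dom, se_diff, sp2, and so on) are made up for illustration rather than taken from the original analysis.

```r
# Values copied by hand from Tables 1 and 2 (not read from Crickets.csv)
xbar_dom  <- 2.394417   # mean in group Dominant (Table 1)
xbar_sub  <- 1.304213   # mean in group Subordinate (Table 1)
t_stat    <- 4.8847     # t test statistic (Table 1)
ms_within <- 0.9956     # Residuals Mean Sq (Table 2)
f_stat    <- 23.86      # F value (Table 2)
n1 <- 41                # sample sizes as used in answer 7
n2 <- 39

# Answer 4: the difference in means should match the slope in Table 3
diff_means <- xbar_sub - xbar_dom
round(diff_means, 4)

# Answer 6: the squared t statistic should match the F statistic
round(t_stat^2, 2)

# Answer 7: SE of the difference, then the pooled variance
se_diff <- abs(diff_means) / t_stat
sp2 <- se_diff^2 / (1/n1 + 1/n2)
round(se_diff, 4)
round(sp2, 4)

# Answer 1: two-sided p-value from the t distribution with n1+n2-2 df
p_val <- 2 * pt(-abs(t_stat), df = n1 + n2 - 2)
p_val
```

Because no intermediate value is rounded here, the pooled variance agrees with \(MS_{within}\) a little more closely than the hand calculation in answer 7, which rounded the SE to 0.2232 first.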