Statistical Definitions

The following are definitions for general statistical words. See further below for R-specific definitions.

Word Definition
Accuracy
The tendency of a statistic to come close to the parameter it was intended to estimate.
Alternative Hypothesis
A statistical hypothesis that states that there is a difference between a parameter and a specific value or between two parameters.
Bimodal
The shape of a distribution with two peaks or "humps."
Bivariate
Examining two variables.
Coefficient of Determination
The proportion of the total variability in the response variable that is explained by knowing the explanatory variable and the best-fit model.
Continuous
A quantitative variable that can assume an uncountable number of values.
Convenience
A sample of individuals who are easiest to reach for the researcher.
Dependent
See response variable.
Discrete
A quantitative variable that can assume a countable number of values.
Factor(s)
In an experiment, the variable(s) deliberately manipulated to determine the effect on the response variable.
Independent
See explanatory variable.
Individual
One of the items examined by the researcher.
Inference
The process of forming conclusions about the unknown parameters of a population by computing statistics from the individuals in a sample.
Inter-Quartile Range (IQR)
The difference between the third (Q3) and first (Q1) quartiles.
Intercept
The value of the response variable when the explanatory variable is equal to zero.
Left-Skewed
The left-tail of a distribution is longer or more drawn out than the right-tail.
Levels
In an experiment, the categories or groupings of a factor.
Mean
The center of gravity or balance point of the data, i.e., the sum of the data divided by the number of individuals.
Median
The midpoint of the data, i.e., the value of the individual in the position that splits the ordered list of individuals into two equal-sized halves.
Mode
The value or class of values that occurs most often in a data set.
Multivariate
Examining more than two variables.
Natural Variability
The fact that no two individuals are exactly alike.
Nominal
A categorical variable for which a natural order DOES NOT exist among the categories.
Null Hypothesis
A statistical hypothesis that states that there is no difference between a parameter and a specific value or between two parameters.
Ordinal
A categorical variable for which a natural order exists among the categories.
Outlier
An individual whose value is widely separated from the main cluster of values in the sample.
p-value
The probability of observing the statistic, or a more extreme value of the statistic, assuming that the null hypothesis is true.
Parameter
A summary of all individuals in a population.
Population
ALL individuals of interest.
Precision
The tendency to have values clustered closely together. Precision is inversely related to the standard error – the smaller the standard error, the greater the precision.
Quartiles
The values that divide the ordered data into quarters.
Range
The difference between the maximum and minimum values in a data set.
Replicates
In an experiment, the number of individuals in each treatment group.
Research Hypothesis
A general statement about the question or phenomenon being tested.
Residual
The difference between the observed and predicted values of the response variable for an individual. In regression, the vertical difference between the observed and predicted values of the response variable for an individual.
Response
The variable to be predicted or explained.
Right-Skewed
The right-tail of a distribution is longer or more drawn out than the left-tail.
Sample
A subset of the population examined by a researcher.
Sampling Distribution
The distribution of the values of a particular statistic computed from all possible samples of the same size from the same population.
Sampling Variability
The fact that the results (i.e., statistics) from different samples (of the same population) are different.
Simple Random
A probability-based sample where each individual of the population has the same chance of being selected for the sample. Usually abbreviated as SRS.
Slope
The change in value of the response variable for a unit change in value of the explanatory variable.
Standard Deviation
"Essentially" the average deviation or difference of individuals from the mean.
Standard Error
The numerical measure of dispersion used for sampling distributions – i.e., measures the dispersion among statistics from all possible samples.
Statistic
A summary of all individuals in a sample.
Statistics
As a field of study ... The science of collecting, organizing, and interpreting numerical information or data.
Symmetric
The left and right tails of a distribution are nearly the same in length and height.
Treatments
In an experiment, the number of combinations of all factors in the experiment.
Unbiased
For statistics, a statistic in which the center of its sampling distribution equals the parameter it is intended to estimate. For samples, a sample that does not systematically over- or under-represent portions of the population.
Univariate
Examining one variable.
Variable
The characteristic of interest recorded about each individual.
Voluntary Response
A sample of individuals who choose themselves for the sample by responding to a general appeal.
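
Several of the numerical definitions above (mean, median, range, quartiles, IQR, standard deviation, standard error, slope, intercept, residual, and coefficient of determination) can be illustrated with a short R sketch. The data values and object names (x, expl, resp, fit) are made up for illustration only, and the standard error shown is the usual estimate for a sample mean, which is only one of many statistics that have a standard error.

```r
# Hypothetical sample of 8 measurements (made-up values)
x <- c(2, 4, 4, 5, 7, 9, 10, 15)

n   <- length(x)                    # number of individuals in the sample
avg <- mean(x)                      # sum of the data divided by n
med <- median(x)                    # midpoint of the ordered data
rng <- max(x) - min(x)              # range as one number; range(x) returns min and max
qs  <- quantile(x, c(0.25, 0.75))   # Q1 and Q3 using R's default quantile method
iqr <- IQR(x)                       # Q3 minus Q1
s   <- sd(x)                        # sample standard deviation (divides by n - 1)
se  <- s / sqrt(n)                  # estimated standard error of the sample mean

# Hypothetical bivariate data: explanatory (expl) and response (resp) variables
expl <- c(1, 2, 3, 4, 5)
resp <- c(2.1, 3.9, 6.2, 7.8, 10.1)

fit       <- lm(resp ~ expl)            # best-fit line by least squares
intercept <- coef(fit)[["(Intercept)"]] # predicted response when expl equals zero
slope     <- coef(fit)[["expl"]]        # change in response per unit change in expl
res       <- resid(fit)                 # observed minus predicted response
r2        <- summary(fit)$r.squared     # coefficient of determination
```

Note that quantile() supports several computation methods through its type= argument, so quartiles computed by hand or by other software may differ slightly from R's default.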

R Definitions

The following definitions are related to R.

Word Definition
Argument
A "directive" provided within the parentheses of a function.
data.frame
A two-dimensional organization of variables (as columns, possibly of different data types) recorded on multiple individuals (as rows).
Factor(s)
A special type of variable that identifies the group to which an individual belongs.
Function
In R, a program that performs a particular task.
Vector
A one-dimensional collection of items of the same data type.
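
The R terms above can be seen together in a minimal sketch. The object names (len, grp, d, m) and the values are hypothetical and chosen only for illustration.

```r
# Vector: one-dimensional, all items of the same data type (numeric here)
len <- c(10.2, 11.5, 9.8, 12.1)

# Factor: identifies the group to which each individual belongs
grp <- factor(c("A", "B", "A", "B"))

# data.frame: variables as columns (of different types), individuals as rows
d <- data.frame(len = len, grp = grp)

# Function and arguments: mean() is the function; x= and na.rm= are
# arguments given within its parentheses
m <- mean(x = d$len, na.rm = TRUE)

# The groups (levels) recorded in the factor
lv <- levels(d$grp)
```

Calling str(d) on such a data.frame lists each column with its data type, which is a quick way to confirm which variables R is treating as factors.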