# Hypothesis tests = tests for a specific value(s) of the parameter.

The ANOVA tests described above are called one-factor ANOVAs. There is one treatment or grouping factor with k > 2 levels, and we wish to compare the means across the different categories of this factor. The factor might represent different diets, different classifications of risk for disease (e.g., osteoporosis), different medical treatments, different age groups, or different racial/ethnic groups. In some situations it is of interest to compare means of a continuous outcome across two or more factors. For example, suppose a clinical trial is designed to compare five different treatments for joint pain in patients with osteoarthritis. Investigators might also hypothesize that the outcome differs by sex. This is an example of a two-factor ANOVA in which the factors are treatment (with 5 levels) and sex (with 2 levels). In the two-factor ANOVA, investigators can assess whether there are differences in means due to treatment, differences due to sex, or differences due to the combination or interaction of treatment and sex. Higher-order ANOVAs are conducted in the same way as the one-factor ANOVAs presented here, and the computations are again organized in ANOVA tables, with more rows to distinguish the different sources of variation (e.g., between treatments, between men and women). The following example illustrates the approach.
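As a rough illustration of where the extra rows of a two-factor ANOVA table come from, the sketch below partitions the variation in a small balanced design into a part due to the first factor, a part due to the second factor, their interaction, and error. Everything in it is an assumption made for illustration: the function name `two_factor_anova` is hypothetical, the data are invented (a 2 x 2 layout rather than the 5 x 2 trial described above), and the formulas used are the standard ones for a balanced design with the same number of observations in every cell. It reports F statistics only; p-values would require the F distribution.

```python
from statistics import mean


def two_factor_anova(cells):
    """Sums of squares, df, mean squares and F for a balanced two-factor design.

    `cells` maps (level_of_A, level_of_B) -> list of observations.
    Every combination of levels must be present with the same number
    of observations (at least 2), otherwise these formulas do not apply.
    """
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))  # observations per cell
    complete = len(cells) == len(a_levels) * len(b_levels)
    if not complete or n < 2 or any(len(v) != n for v in cells.values()):
        raise ValueError("need a balanced design with >= 2 observations per cell")

    grand = mean([x for v in cells.values() for x in v])
    a_mean = {a: mean([x for b in b_levels for x in cells[(a, b)]]) for a in a_levels}
    b_mean = {b: mean([x for a in a_levels for x in cells[(a, b)]]) for b in b_levels}
    cell_mean = {key: mean(v) for key, v in cells.items()}

    # Partition of the total variation into the four sources.
    ss = {
        "A": n * len(b_levels) * sum((m - grand) ** 2 for m in a_mean.values()),
        "B": n * len(a_levels) * sum((m - grand) ** 2 for m in b_mean.values()),
        "A x B": n * sum(
            (cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
            for a in a_levels
            for b in b_levels
        ),
        "error": sum((x - cell_mean[key]) ** 2 for key, v in cells.items() for x in v),
    }
    df = {
        "A": len(a_levels) - 1,
        "B": len(b_levels) - 1,
        "A x B": (len(a_levels) - 1) * (len(b_levels) - 1),
        "error": len(a_levels) * len(b_levels) * (n - 1),
    }

    ms_error = ss["error"] / df["error"]
    table = {}
    for source in ("A", "B", "A x B", "error"):
        ms = ss[source] / df[source]
        row = {"SS": ss[source], "df": df[source], "MS": ms}
        if source != "error":
            # Each effect is tested against the error mean square.
            row["F"] = ms / ms_error if ms_error > 0 else float("inf")
        table[source] = row
    return table


# Invented scores: factor A = treatment ("T1", "T2"), factor B = sex ("F", "M").
example_cells = {
    ("T1", "F"): [4.0, 6.0],
    ("T1", "M"): [5.0, 7.0],
    ("T2", "F"): [8.0, 10.0],
    ("T2", "M"): [9.0, 11.0],
}
table = two_factor_anova(example_cells)
for source, row in table.items():
    print(source, row)
```

In this toy data set the treatment rows differ by a constant amount for both sexes, so the interaction sum of squares is zero and all of the between-cell variation is attributed to the two main effects.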

## Example of a complex multiple independent variable hypothesis:

### Example of a complex multiple dependent variable hypothesis:

In a classic study of facial attractiveness, Perrett et al. (1994) tested the Averageness Hypothesis in an effort to establish whether averageness really is the critical determinant of the attractiveness of faces. Perrett et al. first collected full-face photographs of 60 young women (all taken under the same lighting conditions). One group of participants was then shown these images and asked to rate the attractiveness of each face on a scale from 1 (very unattractive) to 7 (very attractive). Next, Perrett et al. used computer graphic methods to manufacture a composite face with the average shape of the whole sample (i.e. the average of all 60 female faces) and a second composite face that was the average of the 15 faces judged most attractive by the first group of participants. Perrett et al. then manufactured what they called a ‘hyper-attractive’ composite face: a caricature of the attractive composite, produced by using computer graphic methods to exaggerate the physical differences in shape between the composite of all 60 faces and the composite of the 15 most attractive faces.
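The construction of the three composites can be pictured with a toy sketch. It assumes, purely for illustration, that each face shape is reduced to a short list of (x, y) landmark points; the function names, the landmark coordinates, the choice of which toy faces count as "most attractive", and the exaggeration amount of 0.5 are all hypothetical and are not taken from the study's actual procedure.

```python
def mean_shape(shapes):
    """Average corresponding (x, y) landmarks across several face shapes."""
    count = len(shapes)
    return [
        (
            sum(shape[i][0] for shape in shapes) / count,
            sum(shape[i][1] for shape in shapes) / count,
        )
        for i in range(len(shapes[0]))
    ]


def caricature(target, reference, amount):
    """Move `target` further away from `reference` by `amount` times their difference."""
    return [
        (tx + amount * (tx - rx), ty + amount * (ty - ry))
        for (tx, ty), (rx, ry) in zip(target, reference)
    ]


def shape_distance(shape_a, shape_b):
    """Euclidean distance between two shapes over all landmark coordinates."""
    return sum(
        (ax - bx) ** 2 + (ay - by) ** 2
        for (ax, ay), (bx, by) in zip(shape_a, shape_b)
    ) ** 0.5


# Four invented "faces", each with two landmarks (stand-ins for the 60 photographs).
faces = [
    [(0.0, 0.0), (4.0, 0.0)],
    [(0.0, 2.0), (4.0, 2.0)],
    [(2.0, 0.0), (6.0, 0.0)],
    [(2.0, 2.0), (6.0, 2.0)],
]
top_rated = faces[2:]  # hypothetical stand-in for the 15 highest-rated faces

average_all = mean_shape(faces)
average_attractive = mean_shape(top_rated)
hyper_attractive = caricature(average_attractive, average_all, amount=0.5)

print("distance of attractive composite from average:",
      shape_distance(average_attractive, average_all))
print("distance of hyper-attractive composite from average:",
      shape_distance(hyper_attractive, average_all))
```

By construction, the caricatured shape lies further from the overall average than the attractive composite does, which is why the hyper-attractive face is the least average of the three composites.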

### The hypotheses of interest in an ANOVA are as follows:

Intriguingly, Perrett et al. found that when a new group of participants was asked to rate the attractiveness of each of the 3 composite faces (the average of all 60 faces, the average of the 15 most attractive faces, and the ‘hyper-attractive’ face with exaggerated attractive qualities), the hyper-attractive face was considered the most attractive of the 3. This is noteworthy because the hyper-attractive face was mathematically the least average of the 3 composites. A face that is both the least average and the most attractive is very strong evidence that averageness is not necessarily the critical determinant of facial attractiveness. In other words, Perrett et al.'s findings are evidence against the Averageness Hypothesis of facial attractiveness (which proposes that ‘attractive faces are only average’), because they show that highly attractive faces deviate systematically from an average shape.