# Significance Tests / Hypothesis Testing

The exact form of the research hypothesis depends on the investigator's belief about the parameter of interest and whether it has increased, decreased, or simply differs from the null value. The research hypothesis is set up by the investigator before any data are collected.

## The null hypothesis and the alternative:

### When comparing p-values and significance levels, the rule is:

This test for homogeneity of variance provides an *F*-statistic and a significance value (*p*-value). We are primarily concerned with the significance value – if it is greater than 0.05 (i.e., *p* > .05), our group variances can be treated as equal. However, if *p* ≤ .05, the group variances differ significantly and cannot be treated as equal, so a procedure that does not assume equal variances should be used instead.
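The check described above can be sketched in Python using Levene's test, one common test for homogeneity of variance (`scipy.stats.levene`); the two sample groups below are invented purely for illustration.

```python
import numpy as np
from scipy import stats

# Made-up example data: two groups drawn with the same spread.
rng = np.random.default_rng(42)
group_a = rng.normal(loc=50, scale=10, size=30)
group_b = rng.normal(loc=55, scale=10, size=30)

stat, p_value = stats.levene(group_a, group_b)
print(f"statistic = {stat:.3f}, p = {p_value:.3f}")

# Decision rule from the text: if p > .05, treat the variances as equal.
if p_value > 0.05:
    print("Group variances can be treated as equal.")
else:
    print("Variances differ; use a procedure that does not assume equal variances.")
```

Note that this is only a sketch of the decision rule; in practice the specific homogeneity test (Levene's, Bartlett's, etc.) depends on the software and assumptions involved.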

### A small *p*-value is an indication that the null hypothesis is false.

This is not very clear, but apparently we are to check if the average *exceeds* $160, which would mean the ATMs are not "stocked with enough cash." This is what we will try to show, so it should go in the alternative hypothesis.
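Since the claim that the average exceeds $160 goes in the alternative, the setup is H₀: μ ≤ 160 versus Hₐ: μ > 160, which calls for an upper-tailed one-sample test. A minimal sketch using `scipy.stats.ttest_1samp`, with withdrawal amounts invented for illustration:

```python
import numpy as np
from scipy import stats

# Hypothetical sample of ATM withdrawal amounts (dollars).
withdrawals = np.array([172, 158, 181, 190, 165, 149, 175, 188, 160, 177])

# Upper-tailed test: Ha claims the true mean exceeds $160.
t_stat, p_value = stats.ttest_1samp(withdrawals, popmean=160,
                                    alternative="greater")
print(f"t = {t_stat:.3f}, one-sided p = {p_value:.4f}")

if p_value < 0.05:
    print("Reject H0: evidence that the average withdrawal exceeds $160.")
else:
    print("Fail to reject H0.")
```

The `alternative="greater"` argument makes the test one-sided, matching the direction of the research hypothesis.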

## Consequently, null hypothesis 5 is rejected.

In all tests of hypothesis, there are two types of errors that can be committed. The first is called a Type I error and refers to the situation where we incorrectly reject H₀ when in fact it is true. This is also called a false positive result (as we incorrectly conclude that the research hypothesis is true when in fact it is not). When we run a test of hypothesis and decide to reject H₀ (e.g., because the test statistic exceeds the critical value in an upper-tailed test), then either we make a correct decision because the research hypothesis is true or we commit a Type I error. The different conclusions are summarized in the table below. Note that we will never know whether the null hypothesis is really true or false (i.e., we will never know which row of the following table reflects reality).

|              | Do not reject H₀ | Reject H₀        |
|--------------|------------------|------------------|
| H₀ is true   | Correct decision | Type I error     |
| H₀ is false  | Type II error    | Correct decision |
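A Type I error rate can be checked directly by simulation: when H₀ is true, a test run at α = 0.05 should falsely reject about 5% of the time. The sample sizes and normal populations below are arbitrary choices for the demonstration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, n_sims, rejections = 0.05, 5000, 0

for _ in range(n_sims):
    # H0 is true by construction: both samples come from the same population.
    a = rng.normal(loc=100, scale=15, size=25)
    b = rng.normal(loc=100, scale=15, size=25)
    _, p = stats.ttest_ind(a, b)
    if p < alpha:
        rejections += 1  # a Type I error (false positive)

print(f"Observed Type I error rate: {rejections / n_sims:.3f}")  # close to 0.05
```

The observed rejection rate hovers near α, illustrating that the Type I error rate is exactly the significance level we chose in advance.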

## Therefore, null hypothesis 8 is rejected.

Having said that, there's one key concept from Bayesian statistics that is important for all users of statistics to understand. To illustrate it, imagine that you are testing extracts from 1000 different tropical plants, trying to find something that will kill beetle larvae. The reality (which you don't know) is that 500 of the extracts kill beetle larvae, and 500 don't. You do the 1000 experiments and do the 1000 frequentist statistical tests, and you use the traditional significance level of *P* < 0.05. Assuming the tests have perfect power, the 500 extracts that really work all give you a *P* value less than 0.05. Of the 500 extracts that don't work, about 5% will give a *P* value less than 0.05 just by chance (that's the meaning of a *P* value, after all), so you have 25 false positives. So you end up with 525 plant extracts that gave you a *P* value less than 0.05. You'll have to do further experiments to figure out which are the 25 false positives and which are the 500 true positives, but that's not so bad, since you know that most of them will turn out to be true positives.
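The thought experiment above can be reproduced by simulation. The effect size and sample size below are invented so that the tests have essentially perfect power, matching the scenario in the text; everything else follows the setup directly (500 real effects, 500 nulls, α = 0.05).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha, n = 0.05, 50
true_pos = false_pos = 0

# 500 extracts that really work, 500 that don't.
for works in [True] * 500 + [False] * 500:
    control = rng.normal(loc=0.0, scale=1.0, size=n)
    # A large, easy-to-detect effect when the extract really works.
    treated = rng.normal(loc=2.0 if works else 0.0, scale=1.0, size=n)
    _, p = stats.ttest_ind(treated, control)
    if p < alpha:
        true_pos += works
        false_pos += not works

print(f"True positives: {true_pos} (out of 500)")
print(f"False positives: {false_pos} (expected about 25)")
```

The false-positive count varies around 25 from run to run, but the point stands: with a 50/50 prior, most significant results are true positives.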