11 Hypothesis testing

Learning objectives

Understand the statistical formalisation of posing and answering questions, which we know as null hypothesis testing, and that this is the key concept in inferential statistics.
Be able to suggest a relevant metric for testing the scientific question asked, a so-called test statistic, and explain how this is related to a null distribution.
Understand the meaning of the p-value.
Understand the concept of an alternative hypothesis.

Additionally, the concepts of population, (random) sample, parameter and estimator function are revisited, as they are central to this theme.

11.1 Reading material

Video on the ideas behind hypothesis testing:
- The Essential Guide To Hypothesis Testing
Chapter 3 of Introduction to Statistics by Brockhoff (https://02402.compute.dtu.dk/enotes/book-IntroStatistics)
- Especially sections 3.1 to 3.1.3 and 3.1.6.

11.2 Videos

11.2.1 The Essential Guide To Hypothesis Testing - Very Normal

Video: https://www.youtube.com/watch?v=FZ2_hzMyJpY

11.2.2 Exercises

Exercise 11.1 - Hypothesis testing

This exercise is conceptual. The idea is that you think about a problem and try to figure out what the null hypothesis is, and what could be a relevant metric for measuring the distance to this null hypothesis.

You are playing with a six-sided die and observe an abnormally high number of fives. You set up an experiment to test whether the die is biased, in which you throw the die a number of times and register the outcomes. What is the null hypothesis in this experiment?
Two response variables in a sample of size $n$ seem to track each other, i.e. they are positively correlated. What is the null hypothesis for testing this relation, and what measures the distance to this hypothesis?
In an experiment you have a treatment with three levels (placebo, treatment A and treatment B) and some relevant response variable. You are interested in whether there is a difference between the treatments, and in particular whether A is different from placebo and whether B is different from placebo. What is the null hypothesis for the overall question, and what would be a relevant metric for measuring the distance to that hypothesis? What are the null hypotheses for the pairwise comparisons, and what would be relevant metrics for measuring the distance to these?
According to theory there is a proportional linear relation between $y$ and $x$. You fit a line between the two ($y = a + bx$). What is the null hypothesis concerning proportionality, and what measures the distance to this hypothesis?
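
To make the idea of a null distribution concrete, the die question can be explored by simulation. The sketch below is one possible approach, not the intended solution: it assumes the count of fives as the test statistic, and the experiment size (60 throws) and the observed count (17 fives) are hypothetical numbers chosen for illustration only.

```python
import random

def simulate_null_counts(n_throws, n_sims, face=5, seed=1):
    """Simulate the count of `face` in n_throws throws of a fair die, n_sims times."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_sims):
        throws = [rng.randint(1, 6) for _ in range(n_throws)]
        counts.append(throws.count(face))
    return counts

def one_sided_p_value(observed, null_counts):
    """Share of simulated counts that are at least as large as the observed count."""
    extreme = sum(1 for c in null_counts if c >= observed)
    return extreme / len(null_counts)

# Hypothetical experiment (not from the text): 60 throws, 17 fives observed.
null_counts = simulate_null_counts(n_throws=60, n_sims=10_000)
p_value = one_sided_p_value(observed=17, null_counts=null_counts)
print(f"Simulated one-sided p-value: {p_value:.3f}")
```

The simulated counts approximate the null distribution of the test statistic under a fair die, and the p-value is the share of that distribution at least as extreme as what was observed. Other test statistics, for instance one that uses all six faces, are equally possible choices.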

Exercise 11.2 - Association or causality?

This exercise is intended to show that you need to be careful when drawing conclusions based solely on statistical numbers (confidence intervals, p-values, ...), and that you need to be critical and think about the study design, biology, life, etc.

A study investigates a certain biomarker for use in the detection of cancer. From a population of cancer patients a sample of $n = 123$ patients is taken, and their blood is analysed for a specific biomarker (BMa). The mean and standard deviation of this sample are estimated to be $\bar{x} = 3.4$ mg/L and $s_x = 1.5$ mg/L, respectively.

Tasks

Calculate the standard error for the mean of the distribution.
Make a confidence interval for the mean of the distribution.
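
As a sketch of how these two tasks could be computed from the summary statistics, the code below uses the standard error $s_x / \sqrt{n}$ and a normal approximation for the interval. The 95% level and the use of a normal quantile instead of a $t$ quantile are assumptions made here; the text does not specify either. The function name `mean_confidence_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(mean, sd, n, level=0.95):
    """Return the standard error and a normal-approximation CI for the mean."""
    se = sd / sqrt(n)                           # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)   # e.g. about 1.96 for level=0.95
    return se, (mean - z * se, mean + z * se)

# Summary statistics for the cancer patients, taken from the text.
se, (lower, upper) = mean_confidence_interval(mean=3.4, sd=1.5, n=123)
print(f"SE = {se:.3f} mg/L, 95% CI = [{lower:.2f}, {upper:.2f}] mg/L")
```

With samples of this size, a $t$ quantile with $n - 1$ degrees of freedom would give a slightly wider but very similar interval.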

The average in this population seems rather high from a biological point of view. However, the researchers want to verify that this is indeed the case, and therefore recruit a sample of healthy individuals of size $n = 130$. The descriptive statistics for this group are $\bar{x} = 2.9$ mg/L and $s_x = 1.3$ mg/L.

Tasks

Make a confidence interval for the mean in the healthy population.
Sketch the two population distributions. Is there an overlap?
Sketch the two confidence intervals and reflect on the similarities and differences between these two populations.
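
Before sketching by hand, it may help to put numbers on the two kinds of intervals. The sketch below contrasts the approximate range covering 95% of individuals (mean ± 1.96 standard deviations) with the 95% confidence interval for the mean (mean ± 1.96 standard errors) in each group. It assumes that BMa is roughly normally distributed within each group, which the text does not state, and the helper names are hypothetical.

```python
from math import sqrt

Z95 = 1.96  # approximate 97.5% quantile of the standard normal distribution

def interval(center, half_width):
    """Symmetric interval around `center`."""
    return (center - half_width, center + half_width)

def overlaps(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# (mean, standard deviation, sample size) for each group, taken from the text.
groups = {"cancer": (3.4, 1.5, 123), "healthy": (2.9, 1.3, 130)}

spread = {}  # range covering about 95% of individuals (normality assumed)
ci = {}      # 95% confidence interval for the group mean
for name, (mean, sd, n) in groups.items():
    spread[name] = interval(mean, Z95 * sd)
    ci[name] = interval(mean, Z95 * sd / sqrt(n))
    print(f"{name}: individuals {spread[name]}, mean {ci[name]}")

print("Individual ranges overlap:", overlaps(spread["cancer"], spread["healthy"]))
print("Confidence intervals overlap:", overlaps(ci["cancer"], ci["healthy"]))
```

The two comparisons answer different questions: the first concerns how individual measurements are spread, the second how precisely each mean has been estimated.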

The researchers ask the question of whether the two distributions are similar.

Tasks

Formulate the question as a null hypothesis and an alternative hypothesis.
Test the hypothesis, and comment on the question raised.
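
One way to carry out such a test from the summary statistics alone is sketched below. It rests on several assumptions that are not given in the text: the question is read as a comparison of the two means ($H_0: \mu_{\text{cancer}} = \mu_{\text{healthy}}$ against a two-sided alternative), the two samples are independent, the variances are not pooled, and the samples are large enough for a normal approximation of the test statistic. The function name `two_sample_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z_test(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sided large-sample test of equal means from summary statistics."""
    se_diff = sqrt(sd1**2 / n1 + sd2**2 / n2)   # standard error of the difference
    z = (mean1 - mean2) / se_diff               # distance to H0 in standard errors
    p = 2 * (1 - NormalDist().cdf(abs(z)))      # two-sided p-value under H0
    return z, p

# Summary statistics for the cancer patients and the healthy group, from the text.
z, p = two_sample_z_test(3.4, 1.5, 123, 2.9, 1.3, 130)
print(f"z = {z:.2f}, two-sided p-value = {p:.4f}")
```

The test statistic measures the observed difference in means in units of its standard error. A $t$ distribution with Welch-type degrees of freedom could be used instead of the normal distribution and would give nearly the same p-value at these sample sizes.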

The answer to the question seems to indicate differences between the two populations. Now the researchers take this one step further and claim that this must be due to cancer status.

Tasks

What is problematic in drawing the conclusion that differences in BMa are caused by cancer status?
- Hint: Think about study design and other differences between the two populations, such as age, lifestyle etc.
In order to be certain that cancer leads to increased levels of BMa, which conditions must be fulfilled? Is it possible to conduct such studies on humans? On mice?
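
The hint about age can be illustrated with a small simulation. The sketch below is a constructed toy example and not a model of real BMa biology: every number in it (the mean ages, the age effect of 0.03 mg/L per year, the noise level) is hypothetical. BMa is generated to depend on age only, with no effect of cancer status, yet the two groups still differ on average because the simulated cancer patients are older.

```python
import random
from statistics import mean

rng = random.Random(42)

def simulate_group(n, mean_age):
    """Simulate BMa values that depend on age only (hypothetical model)."""
    ages = [rng.gauss(mean_age, 8) for _ in range(n)]
    # No cancer term in this formula: group membership has no causal effect.
    return [1.0 + 0.03 * age + rng.gauss(0, 1.0) for age in ages]

# Hypothetical age profiles: cancer patients are on average 20 years older.
cancer = simulate_group(123, mean_age=70)
healthy = simulate_group(130, mean_age=50)

diff = mean(cancer) - mean(healthy)
print(f"Difference in mean BMa (cancer - healthy): {diff:.2f} mg/L")
```

A test comparing these two groups would tend to report a difference, even though cancer plays no role in how the data were generated. This is why an observed association in a study of this kind does not by itself establish causality.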
