# Experiment

In the EDA, an experiment is defined as a controlled procedure designed to test a hypothesis about the effect of one or more independent variables on an outcome measure. The scope of the EDA is limited to experiments involving living animals.

**Content:**

- The experiment node
- The animal characteristics node
- Hypothesis
- Effect of interest
- Effect size
- Justification for effect size

## The experiment node

Every experiment diagram should contain an **experiment node**. The properties of this node describe the objective of the experiment; they include fields for the **hypothesis** the experiment is designed to test and the **effect of interest** (see below). The effect of interest in an experiment is the outcome that will be measured to test whether it changes in response to manipulation of the independent variables of interest (i.e. controlled variables). Additionally, a description of the experiment can be included.

## The animal characteristics node

By definition, *in vivo* experiments involve animals and the characteristics of the animals used in any particular experiment should be indicated on the diagram. Distinct **animal characteristics nodes** are used for distinct sets of animal characteristics, which can be entered in the properties of each node. Species, strain, sex, weight and age would be considered the minimum required, as described in the ARRIVE guidelines.

## Hypothesis

For any particular experiment, both the null and the alternative hypotheses should be specified. If you have multiple hypotheses, the main one should be specified.

### Null hypothesis (H_{0})

The null hypothesis, usually denoted by H_{0}, is the postulate that the response being measured is unaffected by the experimental manipulation being tested. It represents the hypothesis of no change or no effect. For example, if the effect of a proposed anti-cholesterol drug is being tested, then the null hypothesis could be that the drug treatment has no effect on the measured blood pressure:

H_{0}: The anti-cholesterol drug has no effect on blood pressure; there is no difference between the treatment and control means.

In a statistical test, the p-value is a measure of how much evidence there is against the null hypothesis; the smaller the p-value, the more evidence we have against it. The null hypothesis cannot be accepted or proven true. For example, if a study yields a p-value higher than the predefined threshold, this does not mean that the null hypothesis is true and that the drug has no effect on blood pressure. It means that there is not enough evidence against the null hypothesis and that, under the conditions of the experiment, it is not possible to determine that there is a relationship between the drug and blood pressure. This may be because the effect is smaller, and/or the individual variability is greater, than anticipated.
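As an illustration of how a p-value quantifies evidence against H_{0}, the sketch below computes an exact two-sided permutation p-value for a difference in group means. The permutation test is only one possible choice of test, used here because it needs nothing beyond the Python standard library; the function name `permutation_p_value` and the blood pressure readings are hypothetical and are not taken from the EDA.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(control, treated):
    """Exact two-sided permutation p-value for a difference in group means."""
    pooled = list(control) + list(treated)
    n_total = len(pooled)
    n_treated = len(treated)
    n_control = n_total - n_treated
    total = sum(pooled)
    observed = abs(mean(treated) - mean(control))

    as_extreme = 0
    relabellings = 0
    # Under H0 the group labels are interchangeable, so every way of
    # assigning n_treated of the pooled values to "treated" is equally likely.
    for indices in combinations(range(n_total), n_treated):
        treated_sum = sum(pooled[i] for i in indices)
        diff = abs(treated_sum / n_treated - (total - treated_sum) / n_control)
        if diff >= observed - 1e-12:  # small tolerance for floating-point ties
            as_extreme += 1
        relabellings += 1
    return as_extreme / relabellings


# Hypothetical systolic blood pressure readings (mmHg), for illustration only.
control = [128, 135, 131, 140, 126, 133]
treated = [125, 130, 122, 138, 127, 129]

p = permutation_p_value(control, treated)
print(f"two-sided permutation p-value: {p:.3f}")
```

A large p-value from such a calculation would indicate a lack of evidence against H_{0}, not evidence that H_{0} is true.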

### Alternative hypothesis (H_{1})

The alternative hypothesis, denoted by H_{1}, refers to the postulate that manipulating the independent variable of interest has an effect on the response measured. For the example given above the alternative hypothesis would be:

H_{1}: The anti-cholesterol drug has an effect on blood pressure.

The alternative hypothesis is usually not directional (i.e. it is a two-sided hypothesis) unless the laws of physics, or some equally strong *a priori* evidence, dictate that only results in one direction will be taken as evidence against the null hypothesis. A non-directional H_{1} is tested with a two-sided test, and an effect of the drug in either direction can be taken as evidence against the null hypothesis. In that respect H_{1} is different from the experimental hypothesis (i.e. what researchers expect will happen), which is usually directional, as it is rare to do an experiment with no idea of which way a significant effect might lie.

A directional (one-sided) alternative hypothesis, such as "H_{1}: The anti-cholesterol drug reduces blood pressure", implies that data will be analysed with a one-sided test, which also has implications for the sample size calculation. Results in the opposite direction (e.g. the treatment increases blood pressure at p < 0.0001) will be considered consistent with the null hypothesis, according to which the treatment has no effect. However, in terms of drug discovery, evidence that a potential treatment actually increases blood pressure is a useful piece of information. Few researchers would be willing to discard this information and conclude that there is no evidence that the treatment has an effect; directional alternative hypotheses should therefore be used carefully.
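The difference between the two kinds of test can be sketched numerically. The example below assumes a test statistic that follows a standard normal distribution under H_{0} (a large-sample approximation chosen for simplicity, not something the EDA prescribes); the function name `p_values_from_z` and the z values are hypothetical. Negative z corresponds to a reduction in blood pressure.

```python
from statistics import NormalDist


def p_values_from_z(z):
    """Return (two-sided p, one-sided p for H1: treatment reduces the response).

    Assumes the test statistic z is standard normal under H0.
    """
    std_normal = NormalDist()
    two_sided = 2 * (1 - std_normal.cdf(abs(z)))
    one_sided_lower = std_normal.cdf(z)  # only negative z counts against H0
    return two_sided, one_sided_lower


for z in (-2.0, 4.0):  # a reduction, then a strong increase
    two_sided, one_sided = p_values_from_z(z)
    print(f"z = {z:+.1f}: two-sided p = {two_sided:.5f}, one-sided p = {one_sided:.5f}")
```

With a strong increase (z = +4), the two-sided p-value is very small while the one-sided p-value is close to 1, which is why a result in the unexpected direction is treated as consistent with H_{0} under a directional hypothesis.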

Examples of experiments where one-sided tests could be appropriate include an experiment looking at the survival of homozygous animals in a heterozygous × heterozygous cross. A rate of 0.25 would be expected if the genotype has no effect on the survival of homozygous animals, but this rate can only go lower if genotype has an effect. Another example could be a genetic toxicology experiment, where only DNA damage is of interest.
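For the cross described above, a one-sided analysis could be sketched as an exact binomial test of whether the proportion of homozygous offspring is below 0.25. This is a minimal illustration using only the standard library; the helper `binomial_lower_tail` and the offspring counts are hypothetical.

```python
from math import comb


def binomial_lower_tail(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Hypothetical counts, for illustration only.
n_offspring = 120   # surviving offspring genotyped
n_homozygous = 20   # of which homozygous (30 expected if H0 is true)

# One-sided p-value: probability of seeing this few homozygous animals,
# or fewer, if the true proportion is 0.25.
p_one_sided = binomial_lower_tail(n_homozygous, n_offspring, 0.25)
print(f"one-sided p-value: {p_one_sided:.4f}")
```

Only the lower tail is computed because, as noted above, a proportion above 0.25 is not a plausible consequence of the genotype affecting survival.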

Both **null** and **alternative hypotheses** should be specified in the properties of the experiment node.

## Effect of interest

The effect of interest is the outcome which will be measured to test whether it changes in response to manipulation of the independent variables of interest. It is based on the primary outcome measure.

For example, in an experiment looking at the effect of a drug on blood pressure, the outcome measure (what you are measuring) could be systolic blood pressure, the independent variable of interest could be the drug (with different categories which you are comparing, e.g. vehicle vs. 10 mg/kg) and the effect of interest could be a change in systolic blood pressure.

## Effect size

Researchers should always have an idea of the minimum effect they want the experiment to be able to detect. The effect size is the minimum difference between two groups under study that would be of biological interest. It is not based on prior knowledge of the magnitude of the treatment effect but on a difference that would be worth taking forward into further work or clinical trials. The effect size is one of the parameters used in the power analysis to estimate the sample size. Careful consideration of the effect size allows the experiment to be powered to detect only meaningful effects and not generate statistically significant results that are not biologically relevant.

In the example above, the effect size could be 20 mmHg. This implies that researchers are interested in any change in systolic blood pressure of 20 mmHg or more, and that an effect smaller than this would not be worth taking forward into further research.
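To show how the effect size feeds into a power analysis, the sketch below uses the common normal-approximation formula for comparing two group means with a two-sided test, n = 2((z_{1-α/2} + z_{1-β})σ/δ)² animals per group. The standard deviation of 15 mmHg, the significance level, the target power and the function name `n_per_group` are assumptions made for illustration; they do not come from the EDA.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate animals per group for a two-sided, two-group comparison.

    delta: effect size (minimum difference of biological interest)
    sd:    assumed standard deviation of the outcome measure
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = std_normal.inv_cdf(power)           # target power
    return ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


# Effect size of 20 mmHg with a hypothetical SD of 15 mmHg.
print(n_per_group(delta=20, sd=15))
```

Halving the effect size roughly quadruples the required sample size, which is why the choice needs to be justified. A calculation based on the t-distribution, as performed by dedicated power analysis tools, would typically give a slightly larger number.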

## Justification for effect size

This is a statement explaining how the effect size was chosen and why it is of biological interest or clinical relevance (e.g. the current gold-standard therapy produces a 20 mmHg change in blood pressure and new therapies must at least match this effect).

## References and further reading

DRUMMOND, G. B. & TOM, B. D. 2011. Presenting data: can you follow a recipe? Clin Exp Pharmacol Physiol, 38, 787-90.

FESTING, M. F. & ALTMAN, D. G. 2002. Guidelines for the design and statistical analysis of experiments using laboratory animals. ILAR J, 43, 244-58.

JOHNSON, P. D. & BESSELSEN, D. G. 2002. Practical aspects of experimental design in animal research. ILAR J, 43, 202-6.

SMITH, C. J. & FOX, Z. V. 2010. The use and abuse of hypothesis tests: how to present P values. Phlebology, 25, 107-12.

http://udel.edu/~mcdonald/stathyptesting.html

http://www.stats.gla.ac.uk/steps/glossary/hypothesis_testing.html#h0