Analysis recommendation
Based on the diagram, this analysis should make a comparison between two or more groups (i.e. the combinations of the categories of the factors of interest) and the analysis includes a single continuous outcome measure, at least one covariate and at least five categorical factors of interest. All groups are independent and if experimental units are repeatedly measured, a summary measure is used in this analysis.
If this description is not accurate, please check your diagram and verify that all nodes are connected properly, all variables and variable categories are indicated and tagged to the relevant interventions or measurements and the information provided in the properties of each node is accurate; then critique it again.
The number of factors of interest is very high. With five factors, a full factorial design would need at least 32 groups (five factors with at least two categories each: 2^5 = 32) and it seems unlikely that the experiment will be large enough (in terms of numbers of animals) to estimate all the interactions reliably. For example, with five factors, there are ten possible two-way interactions. In practice three-way (and above) interactions are unlikely to occur and you should consider reducing the number of factors of interest. Alternatively you could manually amend the statistical model to reduce the number of higher-order interactions included in the model; in this case the best course of action would be to consult a statistician.
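The counts quoted above can be reproduced with a short sketch. The factor names A to E and the choice of exactly two categories per factor are assumptions made for illustration; your own design may have more categories per factor, which would only increase the number of groups.

```python
from itertools import combinations
from math import comb, prod

# Hypothetical design: five factors, each with the minimum of two categories.
levels = {"A": 2, "B": 2, "C": 2, "D": 2, "E": 2}

# Number of groups in a full factorial design: product of the category counts.
n_groups = prod(levels.values())

# Number of possible interactions of each order (2 = two-way, 3 = three-way, ...).
interactions_by_order = {k: comb(len(levels), k) for k in range(2, len(levels) + 1)}

# The two-way interactions listed explicitly, e.g. ("A", "B").
two_way = list(combinations(levels, 2))

print("groups:", n_groups)
print("interactions by order:", interactions_by_order)
print("total interaction terms:", sum(interactions_by_order.values()))
```

Changing the values in `levels` shows how quickly the number of groups grows when a factor has more than two categories.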
Statistical analysis methods compatible with this design include an independent factorial ANCOVA (two-way, three-way or four-way ANCOVA, depending on the number of factors of interest) and an independent factorial ANCOVA on the rank transformed outcome measure.
The ANCOVA approach assumes that the residuals are normally distributed, the variance is homogeneous across groups, the errors are independent and the outcome is measured on a continuous scale (read more about parametric and non-parametric tests). It also assumes that the covariate(s) are appropriate to include in the analysis.
A covariate should be used if:
- The covariate is independent of the treatment.
- There is a strong relationship between the covariate and the outcome measure (i.e. either they both increase together or one increases while the other decreases).
- The relationship between the outcome measure and the covariate is similar for all treatments (i.e. there is no significant treatment by covariate interaction).
The above assumptions can be tested by plotting your data; details of what to look for and example graphs can be found on the independent variables page of the EDA website. If there are multiple covariates you are considering including in your analysis, ensure that the assumptions hold for each of them.
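Alongside plots, simple numerical summaries can give a first impression of the three conditions. The sketch below is only illustrative: the group names and data values are invented, and the summaries (mean covariate, within-group slope and correlation) are informal checks, not a formal test of a treatment by covariate interaction.

```python
from statistics import mean

def slope_and_r(x, y):
    """Least-squares slope of y on x, and the Pearson correlation."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sxx, sxy / (sxx * syy) ** 0.5

# Hypothetical data for two treatment groups.
data = {
    "control": {"covariate": [1.0, 2.0, 3.0, 4.0], "outcome": [2.0, 4.0, 6.0, 8.0]},
    "treated": {"covariate": [1.0, 2.0, 3.0, 4.0], "outcome": [5.0, 7.0, 9.0, 11.0]},
}

summary = {}
for group, d in data.items():
    slope, r = slope_and_r(d["covariate"], d["outcome"])
    summary[group] = {
        "mean_covariate": mean(d["covariate"]),  # similar across groups if independent of treatment
        "slope": slope,                          # similar across groups if no interaction
        "r": r,                                  # strength of covariate-outcome relationship
    }

for group, stats in summary.items():
    print(group, stats)
```

In this invented example both groups have the same mean covariate and the same slope, which is the pattern the conditions above describe.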
In many cases you will not know if including a particular covariate in your analysis is appropriate when planning your experiment. You should measure the covariate during your study, but only include it in your statistical analysis if the assumptions for covariate inclusion are met.
If you have reasons to think the data are not normally distributed, you should consider transforming the data to normalise them (read more about data transformation) and assess whether the transformed data satisfy the normality assumptions. Most data can be normalised using transformations such as log or square root, and using parametric tests is preferable as they have more statistical power than non-parametric tests, as long as the required assumptions are met.
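As a rough illustration of how a log transformation can reduce skew, the sketch below compares the sample skewness of some invented, strictly positive values before and after transformation. The data and the use of skewness as a summary are assumptions for illustration; in practice the normality of the residuals should still be assessed graphically.

```python
from math import log
from statistics import mean

def skewness(values):
    """Sample skewness: third central moment divided by variance^1.5."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed outcome values (must be > 0 for a log transform).
raw = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]
logged = [log(v) for v in raw]

skew_raw = skewness(raw)
skew_logged = skewness(logged)

print("skewness before:", round(skew_raw, 3))
print("skewness after log:", round(skew_logged, 3))
```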
Alternatively, if the data do not satisfy the normality assumptions and transformation does not help, a non-parametric test could be used. As there is no non-parametric equivalent to the factorial ANCOVA, data could be ranked and an ANCOVA on the rank transformed outcome measure could be run. Note that there are assumptions associated with non-parametric tests also. For example, to perform a rank transformation the data must be able to be ranked, with only a few ties (i.e. identical values that will end up with the same rank), the observations must be independent and the covariate must have a linear relationship with the rank. If data cannot be rank transformed (e.g. it is mainly zeros with only a few non-zero measures), the data can be recoded into binary responses and then analysed using logistic regression. Another analysis option is ordinal logistic regression. Your local statistician can help advise on this. These approaches will lead to a loss in power due to the categorisation of continuous data.
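The rank transformation described above can be sketched as follows. This is a minimal illustration with invented values: tied observations are given the average of the ranks they span (one common convention, assumed here), and the helper `tie_fraction` is a hypothetical convenience for judging whether there are too many ties.

```python
from collections import Counter

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend `end` over the run of observations tied with order[start].
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks

def tie_fraction(values):
    """Share of observations whose value occurs more than once."""
    counts = Counter(values)
    return sum(c for c in counts.values() if c > 1) / len(values)

# Hypothetical outcome values containing two pairs of ties.
outcome = [3.1, 0.0, 2.4, 0.0, 5.7, 2.4]
ranks = average_ranks(outcome)

print("ranks:", ranks)
print("fraction tied:", round(tie_fraction(outcome), 2))
```

The resulting ranks would then replace the outcome measure in the ANCOVA; a high tie fraction suggests the rank approach may not be suitable.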
Interpreting a factorial ANCOVA
There is a common mistake when interpreting the results of a factorial ANCOVA. Consider a situation with two factors of interest and at least one covariate, such as an experiment on the effect of exercise on neuronal density, investigated in animals of both sexes, with baseline locomotor activity as a covariate. A claim that the overall effect of exercise is different in males and females can only be supported by the finding that the interaction between the two factors is statistically significant (i.e. the size of the effect is different in males and females). A significant effect of exercise in one sex but not the other is not sufficient to support the claim that there is a difference between sexes.
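The quantity that the interaction term addresses can be illustrated with a small sketch. The covariate-adjusted group means below are invented numbers, and the code only computes the contrast of interest (the difference between the two exercise effects); it does not perform the significance test, which comes from the interaction term of the ANCOVA itself.

```python
# Hypothetical covariate-adjusted group means (neuronal density, arbitrary units).
adjusted_means = {
    ("male", "sedentary"): 100.0,
    ("male", "exercise"): 112.0,
    ("female", "sedentary"): 101.0,
    ("female", "exercise"): 109.0,
}

def exercise_effect(sex):
    """Effect of exercise within one sex: exercise mean minus sedentary mean."""
    return adjusted_means[(sex, "exercise")] - adjusted_means[(sex, "sedentary")]

effect_male = exercise_effect("male")
effect_female = exercise_effect("female")

# The interaction concerns this difference between the two effects,
# not whether each effect is individually significant.
interaction_contrast = effect_male - effect_female

print("effect in males:", effect_male)
print("effect in females:", effect_female)
print("difference between effects:", interaction_contrast)
```

Comparing two separate p-values (one per sex) says nothing about whether `interaction_contrast` differs from zero, which is the point made by Nieuwenhuis et al. (2011).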
Analysis software
Software such as InVivoStat can be used to run either of these statistical tests, and apply data transformations. The tests can be found in the following menu:
- Two-way, three-way or four-way ANCOVA: Statistics>Single Measure Parametric Analysis
References and further reading
Nieuwenhuis, S, Forstmann, BU and Wagenmakers, EJ (2011). Erroneous analyses of interactions in neuroscience: a problem of significance. Nat Neurosci 14(9):1105-7. doi: 10.1038/nn.2886