Statistical experiment design for animal research
Carlos Oscar S. Sorzano
No e9s25, OSF Preprints from Center for Open Science
Abstract:
From a statistical point of view, a scientific experiment has two important periods: before the experiment and after it. After the experiment, we are faced with a collection of numbers stemming from the measurements. Our goal, in a biomedical context, is to show, for example, that a drug is effective against a given disease, that a cell type is involved in some physiological process, that a gene is overexpressed under some condition, or that a new vaccine effectively protects the population. Statistical tools, most prominently statistical inference, are used to separate the underlying signal we are interested in from the noise coming from measurement errors and biological variability. Whether or not we find the sought effect depends on three things: 1) whether there really is a biological effect (e.g., the drug is really effective); 2) how much noise there is in our measurements; and 3) how much evidence we have collected to show that the effect is real, that is, how many times we have observed the difference. From the statistical point of view, we cannot act on Point 1. But we can act on Points 2 and 3 before doing the experiment; we do not need to wait until the experiment is done to perform a "post-mortem" analysis. Point 2 is addressed by statistical experimental design. This technique arranges the experiment so that we can identify the different sources of variation and determine which part of the variability observed in the measurements comes from our treatment (drug, vaccine, gene, or cell type of interest), which is the signal, and which part comes from other sources such as sex, age, health condition, or the experimenter performing the experiment. The part of the variation that we cannot explain is the noise. We declare that there is a biological difference if the signal is well (significantly) above the noise level.
By far the best-known experimental design in the biomedical sciences is the comparison of results from a control group and a treated group. However, it is not the only one, and many other designs can be conceived to minimize the amount of noise. Point 3 is addressed by the sample size calculation, that is, by deciding how many times we repeat the experiment to determine whether there is a biological effect. The more we measure, the surer we are of our decision (that we see some effect or that we do not). Hypothesis testing is a way to make this decision quantitatively. Of course, we can make wrong decisions: deciding that there is an effect when there is none (false positive) or deciding that there is no effect when there is one (false negative). The probability of these two types of mistakes can be controlled at will by choosing an appropriate sample size. Points 2 and 3 assume that the experiment is well conducted and that the observed differences are caused only by the variable of interest (treatment, cell type, gene, ...). If other uncontrolled variables affect our results (e.g., males respond to the treatment, but females do not) and these variables are not explicitly taken into account in the experimental design, then our results will be biased. That is, the observed differences are caused not by the treatment but by something else that we are unaware of, misleading us into believing that the treatment has a true biological effect. Bias ruins the experiment because we are fooled by data whose differences are actually caused by uncontrolled variables.
The three main tools to fight bias are blocking (we identify variables that might affect the results and account for them explicitly in the design), blinding (to prevent possible biases introduced by the researcher), and randomization (we assign the samples at random so that any variable that has not been blocked affects all samples equally and its effects are randomly distributed among the experimental groups). This book addresses the statistical tools needed to tackle Points 2 and 3 and to avoid bias, that is, all the steps before doing the experiment. The book does not address data analysis (after the experiment has been done). However, we will see that we cannot design an experiment without knowing how the data coming out of it will be analyzed. In this regard, the past and the future of the experiment are tightly linked.
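As an illustration of the two pre-experiment steps named in the abstract (sample size calculation and randomization within blocks), the following is a minimal sketch and not the book's own method. It assumes a two-group comparison of means, a two-sided test, and the normal approximation, which slightly underestimates the sample size required by a t-test. The function names, the default significance level of 0.05, the default power of 0.80, and the animal identifiers are all hypothetical choices made for this example.

```python
import math
import random
from statistics import NormalDist


def sample_size_per_group(effect, sd, alpha=0.05, power=0.80):
    """Animals per group for a two-sided comparison of two means.

    Uses the normal approximation n = 2 * ((z_alpha + z_beta) * sd / effect)^2.
    alpha is the accepted false positive probability and 1 - power is the
    accepted false negative probability (both defaults are assumptions).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_beta) * sd / effect) ** 2
    return math.ceil(n)


def randomize_within_blocks(animals_by_block, treatments, seed=None):
    """Randomly assign treatments separately inside each block.

    animals_by_block maps a block label (e.g. sex) to a list of animal IDs.
    Shuffling inside each block keeps the treatments balanced across the
    blocking variable while leaving the individual assignment to chance.
    """
    rng = random.Random(seed)
    assignment = {}
    for block, animals in animals_by_block.items():
        shuffled = list(animals)  # copy so the caller's list is untouched
        rng.shuffle(shuffled)
        for i, animal in enumerate(shuffled):
            assignment[animal] = treatments[i % len(treatments)]
    return assignment


if __name__ == "__main__":
    # Detecting a difference of one standard deviation needs about 16
    # animals per group under these assumptions.
    print(sample_size_per_group(effect=1.0, sd=1.0))

    blocks = {
        "male": ["m1", "m2", "m3", "m4"],
        "female": ["f1", "f2", "f3", "f4"],
    }
    print(randomize_within_blocks(blocks, ["control", "treated"], seed=1))
```

Halving the effect size to detect roughly quadruples the required number of animals, which is why the expected effect and the noise level must be estimated before the experiment is run.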
Date: 2023-06-02
New Economics Papers: this item is included in nep-exp and nep-mfd
Downloads: (external link)
https://osf.io/download/647a3a0f85df48078f775e2a/
Persistent link: https://EconPapers.repec.org/RePEc:osf:osfxxx:e9s25
DOI: 10.31219/osf.io/e9s25