What Can Be Done to Fix the Replication Crisis in Science?
Part of the scientific process involves the reproducibility of experiments. When scientists prepare the materials and methods section of their manuscripts, they should draft it in a manner that helps others reproduce their findings. Over the past two decades, scientists began to realize that the results of many scientific experiments could not be reproduced, either by independent researchers or even by the original researchers themselves. Scientists grew fearful of the amount of false data that might have been published. This realization was so widespread that by the early 2010s it was termed the replication crisis (or replicability crisis).
To highlight the nature of this crisis, the journal Nature conducted a poll of more than 1,500 scientists and found that over 70% of those polled had tried and failed to replicate another scientist’s results, and more than 50% had failed to reproduce their own data. In addition, this report indicated that some of the main reasons that data is not reproducible are selective reporting, the pressure to publish, low statistical power or poor analysis, insufficient replication in the original lab, and insufficient oversight or mentoring, among others. Over 50% of the scientists polled believed that this is a significant crisis. The recognition of this crisis is important for all fields of science.
Traps that Lead to Data Being Irreproducible
Humans have a tendency to fall into “traps” when they are assessing data and results, and these traps occur because humans are good at self-deception. One of these traps involves the way a scientist looks at a hypothesis. When scientists collect only data that supports their hypothesis and ignore other explanations or do not look for them, they are falling into a trap that can lead to irreproducible data. In addition, when scientists capitalize on random patterns in the data and assume that such a pattern is an interesting finding, they are also falling into a reproducibility trap. Furthermore, scientists often rigorously assess unexpected data but fail to do this for expected data, and they also find stories to support their findings and rationalize whatever results they obtain. Together, these traps contribute to data that is irreproducible, and they have to be combated.
Several techniques have been proposed to contend with these traps, including explicitly considering other hypotheses and testing them, inviting academic adversaries to collaborate on studies, and conducting blind data analyses. Ultimately, scientists must be more conscious of how they are collecting and assessing their data before they think of publishing it.
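The trap of capitalizing on random patterns can be illustrated with a small simulation. The sketch below is not taken from any study; it assumes two groups of 20 measurements drawn from the same normal distribution (so there is no real effect), a simple z-test with a known standard deviation of 1, and hypothetical function names chosen for illustration. It estimates how often at least one "significant" result turns up when a scientist looks at many unrelated outcomes.

```python
import math
import random


def null_p_value(rng, n=20):
    """Two-sided z-test p-value for two groups with NO true difference.

    Assumes normally distributed data with a known standard deviation of 1.
    """
    a = [rng.gauss(0.0, 1.0) for _ in range(n)]
    b = [rng.gauss(0.0, 1.0) for _ in range(n)]
    z = (sum(a) / n - sum(b) / n) / math.sqrt(2 / n)
    return math.erfc(abs(z) / math.sqrt(2))


def chance_of_spurious_hit(outcomes=20, datasets=1000, alpha=0.05, seed=7):
    """Fraction of simulated datasets with at least one p < alpha by chance."""
    rng = random.Random(seed)
    hits = sum(
        any(null_p_value(rng) < alpha for _ in range(outcomes))
        for _ in range(datasets)
    )
    return hits / datasets


print("one outcome tested:  ", chance_of_spurious_hit(outcomes=1))
print("20 outcomes tested:  ", chance_of_spurious_hit(outcomes=20))
```

Under these assumptions, testing a single outcome produces a false positive about 5% of the time, while testing 20 independent outcomes produces at least one about 1 - 0.95^20, or roughly 64%, of the time. A pattern found this way is unlikely to reappear when someone else repeats the experiment.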
The P-value Conundrum
The p-value is the probability of obtaining a result at least as extreme as the one actually observed, assuming that the null hypothesis is true. The p-value is used to judge the significance of a test, and a result with p < 0.05 is usually considered significant. P-values have been “the gold standard” for scientists to assess the significance of their data for a number of decades. The p-value was originally meant to determine whether data was worthy of a second look, and it was not meant to be definitive. Thus, in a sense, the p-value, as it is used in science today, is not being used how it was meant to be used. Scientists rely on their p-values to determine what is worth studying and what is worth publishing, but a significant p-value can easily become non-significant when an extra experiment is conducted. That means that without conducting that “extra experiment,” a scientist may publish a “significant” finding, and if another scientist who tries to replicate it obtains data more like the “extra experiment,” the significant result will not be reproduced.
Currently, more and more scientists are relying on statisticians to get involved with their data analysis. The statisticians are not pleased with how data is represented with these p-values. They are rigorously trying to establish better statistical models that represent the findings more accurately and that tell scientists, upfront, how to set up their experimental design. Over time, this collaborative effort should make these more heavily scrutinized data more likely to be reproduced.
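A short simulation can make this concrete. The sketch below is an illustration under stated assumptions, not a model of any real experiment: it assumes a small but real effect (0.3 standard deviations), 20 samples per group, normally distributed data with a known standard deviation of 1, and a two-sided z-test. The function names are hypothetical. It runs many "original" studies, keeps only the ones that reached p < 0.05, and then repeats each of those once with the same design.

```python
import math
import random


def two_sided_p(sample_a, sample_b, sigma=1.0):
    """Two-sided z-test p-value for a difference in means (known sigma)."""
    n_a, n_b = len(sample_a), len(sample_b)
    diff = sum(sample_a) / n_a - sum(sample_b) / n_b
    se = sigma * math.sqrt(1 / n_a + 1 / n_b)
    return math.erfc(abs(diff / se) / math.sqrt(2))


def run_study(rng, effect, n):
    """Simulate one experiment and return its p-value."""
    control = [rng.gauss(0.0, 1.0) for _ in range(n)]
    treated = [rng.gauss(effect, 1.0) for _ in range(n)]
    return two_sided_p(treated, control)


def replication_rate(effect=0.3, n=20, studies=2000, alpha=0.05, seed=1):
    """Return (number of significant originals, fraction that replicate)."""
    rng = random.Random(seed)
    significant = 0
    replicated = 0
    for _ in range(studies):
        if run_study(rng, effect, n) < alpha:      # "publishable" original
            significant += 1
            if run_study(rng, effect, n) < alpha:  # independent repeat
                replicated += 1
    rate = replicated / significant if significant else float("nan")
    return significant, rate


count, rate = replication_rate()
print(f"{count} significant originals, {rate:.0%} replicated")
```

With these assumed settings the design has a statistical power of only about 16%, so even though the effect is real, most of the "significant" original findings fail to reach p < 0.05 when repeated. The exact figures depend on the random seed and on the assumptions above.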
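One example of the upfront design advice a statistician might give is a sample size calculation. The sketch below uses the standard normal-approximation formula for comparing two group means, n = 2 * ((z_alpha + z_power) / d)^2 per group, where d is the expected effect in standard deviation units. The function name and the default values (alpha of 0.05, power of 80%) are assumptions chosen for illustration; a real study may need a different test and a more careful calculation.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate samples needed per group for a two-sided, two-group test.

    effect_size is the expected difference in means divided by the
    standard deviation. Uses the normal approximation, which slightly
    underestimates the requirement for a t-test.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


print(sample_size_per_group(0.5))  # 63
print(sample_size_per_group(0.3))  # 175
```

The smaller the effect a lab hopes to detect, the more samples it needs. Working this out before the experiment, rather than after, is one way to avoid the low statistical power that makes results hard to reproduce.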
How Can Scientists Boost Reproducibility?
Boosting reproducibility starts in the lab. PIs and senior scientists have to be aware of this crisis and begin to mentor new scientists to follow more rigorous standards. Here are a few tips to start this process:
- Learn more about statistics: As mentioned above, if scientists become more familiar with other statistical methods and also collaborate with statisticians, the way data is analyzed and presented will improve tremendously, which will lead to more reliable data being published.
- Conduct within-lab validations: An experiment conducted by one scientist should be validated by another scientist in the same lab. If the data cannot be reproduced, there is probably a problem.
- Have another lab validate the findings: Often one lab can easily reproduce its own data because everything is done with the same reagents and equipment. Before the data is published, have a colleague in another lab try to replicate it to make sure that it can be reproduced in a new environment.