The ‘gold standard’ of preclinical research yields ‘highly unreliable’ results, study finds

27-Feb-2018 - Last updated on 27-Feb-2018 at 15:11 GMT

A rack of mouse cages in an animal facility, where animals are kept under highly standardized conditions. (Image: Hanno Würbel)

Standardization in preclinical animal studies may be a cause of poor reproducibility – making translation to humans in clinical trials difficult, if not unlikely, say researchers.

According to the study, increasing study sample diversity can “significantly” improve the reproducibility of experimental results.

However, standardization is the gold standard in animal research – if not a dogma, said lead author Professor Hanno Würbel, director of the Division of Animal Welfare at the University of Bern, who has long hypothesized that standardization may be a cause of poor reproducibility, rather than the antidote.

“The more you standardize the animals and the environment, the more you risk obtaining results that are specific to these standardized conditions but would not be reproducible with different animals or under different conditions,” he explained.

The aim of the study, which was published recently in PLOS Biology, was to examine the extent to which standardization may be a problem in preclinical animal research, and whether a multi-laboratory approach could be a solution, said Würbel.

Researchers from the Universities of Bern and Edinburgh used computer simulations based on 440 preclinical studies across 13 different treatments in animal models. The reproducibility of results was then compared between single-laboratory and multi-laboratory studies.

“We found that the gold standard of single-laboratory studies conducted under standardized conditions yields highly unreliable results,” said Würbel.

Conversely, the researchers found that using the same number of animals distributed across up to four laboratories yielded “much more accurate and better reproducible” results.
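
The intuition behind that comparison can be illustrated with a toy simulation. The sketch below is not the authors' method or data: it assumes a hypothetical true treatment effect, a hypothetical amount of lab-to-lab variation in that effect, and hypothetical group sizes, and all function and parameter names are invented for illustration. It runs many simulated studies with the same total number of animals, either in one lab or split across four, and measures how much the estimated effect varies from study to study.

```python
import random
import statistics


def simulate_study(n_per_group, n_labs, rng,
                   true_effect=1.0, lab_sd=0.5, noise_sd=1.0):
    """Return the estimated treatment effect for one simulated study.

    All default values are illustrative assumptions, not figures from the paper.
    """
    per_lab = n_per_group // n_labs  # same total animals, split across labs
    control, treated = [], []
    for _ in range(n_labs):
        # Assumption: each lab shifts the treatment effect by a random amount.
        lab_effect = true_effect + rng.gauss(0.0, lab_sd)
        control += [rng.gauss(0.0, noise_sd) for _ in range(per_lab)]
        treated += [rng.gauss(lab_effect, noise_sd) for _ in range(per_lab)]
    return statistics.mean(treated) - statistics.mean(control)


def spread_of_estimates(n_labs, n_studies=2000, n_per_group=12, seed=0):
    """Standard deviation of effect estimates across many simulated studies."""
    rng = random.Random(seed)
    estimates = [simulate_study(n_per_group, n_labs, rng)
                 for _ in range(n_studies)]
    return statistics.stdev(estimates)


single = spread_of_estimates(n_labs=1)  # every animal in one lab
multi = spread_of_estimates(n_labs=4)   # same animals spread over four labs

print(f"single-lab spread: {single:.2f}")
print(f"four-lab spread:   {multi:.2f}")
```

Under these assumptions, the four-lab design averages over the lab-specific quirks, so its estimates scatter less around the true effect than those of the single-lab design, even though no extra animals are used.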

Translating to human trials

As Würbel explained, standardization is only one of many threats to reproducibility. Others include a lack of scientific rigor, small sample sizes, and analytical flexibility (p-hacking).

“All of this reflects a culture of science where the generation of highly spectacular news is valued higher than the production of solid evidence,” he said.

Just as a researcher would not conduct a clinical trial using only 18-year-olds living in the same home, Würbel said, a diverse sample of animals should be used to assess the extent of generalizability in preclinical trials.

“Treatments that don't stand the test of such diversity are unlikely to translate to humans in clinical trials,” he added.

A shift in mindset

Multi-lab studies are one way researchers could create more diverse study samples, but can be logistically demanding, Würbel said.

Other options to increase diversity include using multiple strains or species of animals, animals from various breeders, and housing animals in different conditions.

“However, while we know that multi-lab studies work, we still need to work out how a similar level of diversity can reliably be generated using any of these other methods,” said Würbel.

Other causes of poor reproducibility represent violations of good laboratory practice, said Würbel. Standardization, by contrast, is itself considered good laboratory practice and is taught in textbooks and courses, he noted.

“Thus, a shift in the researchers' mindset is needed to change current practice, which may explain why change is so difficult and slow,” he said.

Authors: Voelkl B, Vogt L, Sena ES, Würbel H (2018)
Title: Reproducibility of preclinical animal research improves with heterogeneity of study samples
Source: PLoS Biology
DOI: 2003693