Alcove

Science, culture, complexity

Tag: John Ioannidis

Priggish NEJM editorial on data-sharing misses the point it almost made

Twitter erupted in outrage, as only Twitter can, on January 22 over a strange editorial that appeared in the prestigious New England Journal of Medicine, calling for medical researchers to not make their research data public. The call comes at a time when the scientific publishing zeitgeist is slowly but surely shifting toward journals asking, and sometimes requiring, the authors of studies to make their data freely available so that their work can be validated by other researchers.

Through the editorial, written by Dan Longo and Jeffrey Drazen, both doctors and the latter also the journal’s editor-in-chief, NEJM also cautions medical researchers to be on the lookout for ‘research parasites’, a coinage that the journal says befits “people who had nothing to do with the design and execution of the study but use another group’s data for their own ends, possibly stealing from the research productivity planned by the data gatherers, or even use the data to try to disprove what the original investigators had posited”. As @omgItsEnRIz tweeted, do the authors even science?

https://twitter.com/martibartfast/status/690503478813261824

The choice of words is more incriminating than the overall tone of the text, which also tries to express the more legitimate concern of replicators not getting along with the original performers. However, by saying that the ‘parasites’ may “use the data to try to disprove what the original investigators had posited”, NEJM has crawled into an unwise hole of infallibility of its own making.

In October 2015, a paper published in the Journal of Experimental Psychology pointed out why replication studies are probably more necessary than ever. The misguided publish-or-perish impetus of scientific research, together with publishing in high impact-factor journals being lazily used as a proxy for ‘good research’ by many institutions, has led researchers to hack their results – i.e. prime them (say, by cherry-picking) so that the study ends up reporting sensational results when, really, duller ones exist.

The JEP paper had a funnel plot to demonstrate this. Quoting from the Neuroskeptic blog, which highlighted the plot when the paper was published, “This is a funnel plot, a two-dimensional scatter plot in which each point represents one previously published study. The graph plots the effect size reported by each study against the standard error of the effect size – essentially, the precision of the results, which is mostly determined by the sample size.” Note: the y-axis is running top-down.

The paper concerned itself with 43 previously published studies discussing how people’s choices were perceived to change when they were gently reminded about sex.

As Neuroskeptic goes on to explain, there are three giveaways in this plot. One is obvious – that the distribution of replication studies is markedly separated from that of the original studies. Second: the least precise results from the original studies worked with the smaller sample sizes. Third: the original studies all seemed to “hug” the outer edge of the grey triangles, which mark the threshold of statistical significance, a measure used to indicate whether a result is reliable. The uniform ‘hugging’ is an indication that all those original studies were likely guilty of cherry-picking from their data to conclude with results that are just about reliable, an act called ‘p-hacking’.
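The ‘hugging’ can be made concrete with a rough sketch. It assumes the edge of the grey triangle is the conventional two-sided 5% boundary, where the effect size equals about 1.96 times its standard error; that threshold, the `margin` parameter and every number below are illustrative assumptions, not values taken from the JEP paper.

```python
from dataclasses import dataclass

# Assumption: the triangle's edge is the two-sided 5% significance boundary
# under a normal approximation, i.e. |effect| = 1.96 * standard error.
Z_CRIT = 1.96


@dataclass
class Study:
    label: str
    effect: float      # reported effect size
    std_error: float   # standard error of that effect size


def z_score(study: Study) -> float:
    """Effect size expressed in units of its own standard error."""
    return study.effect / study.std_error


def hugs_threshold(study: Study, margin: float = 0.5) -> bool:
    """True if the result is significant, but only just.

    `margin` is a hypothetical tolerance: a study 'hugs' the edge when
    Z_CRIT <= |z| < Z_CRIT + margin.
    """
    z = abs(z_score(study))
    return Z_CRIT <= z < Z_CRIT + margin


# Hypothetical studies, made up purely for illustration.
studies = [
    Study("original A", 0.60, 0.29),
    Study("original B", 0.90, 0.44),
    Study("replication A", 0.02, 0.10),
    Study("large clear effect", 0.50, 0.10),
]

flags = {s.label: hugs_threshold(s) for s in studies}
for label, hugging in flags.items():
    print(f"{label}: {'hugs the edge' if hugging else 'does not hug the edge'}")
```

A handful of studies landing just past the threshold proves nothing on its own. The suspicion arises when nearly every original study does so regardless of its sample size, which is the pattern Neuroskeptic describes.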

A line of research can appear to progress rapidly but without replication studies it’s difficult to establish if the progress is meaningful for science – a notion famously highlighted by John Ioannidis, a professor of medicine and statistics at Stanford University, in his two landmark papers in 2005 and 2014. Björn Brembs, a professor of neurogenetics at the Universität Regensburg, Bavaria, also pointed out how the top journals’ insistence on sensational results could result in a congregation of unreliability. Together with a conspicuous dearth of systematically conducted replication studies, this ironically implies that the least reliable results are often taken the most seriously thanks to the journals they appear in.

The most accessible sign of this is a plot between the retraction index and the impact factor of journals. The term ‘retraction index’ was coined in the same paper in which the plot first appeared; it stands for “the number of retractions in the time interval from 2001 to 2010, multiplied by 1,000, and divided by the number of published articles with abstracts”.
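The definition quoted above is simple enough to write out as a short sketch. The function name and the example counts are hypothetical; only the formula itself comes from the quoted definition.

```python
def retraction_index(retractions: int, articles_with_abstracts: int) -> float:
    """Retractions in the interval, multiplied by 1,000, divided by the
    number of published articles with abstracts (per the quoted definition)."""
    if articles_with_abstracts <= 0:
        raise ValueError("articles_with_abstracts must be positive")
    return retractions * 1000 / articles_with_abstracts


# Hypothetical journal: 12 retractions among 8,000 articles with abstracts.
example = retraction_index(12, 8000)
print(example)
```

Scaling by 1,000 only makes the numbers easier to read: the index is the number of retractions per thousand articles.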

Impact factor of journals plotted against the retraction index. The highest IF journals – Nature, Cell and Science – are farther along the trend line than they should be. Source: doi: 10.1128/IAI.05661-11

Look where NEJM is. Enough said.

The journal’s first such supplication appeared in 1997, when it wrote against pre-print copies of medical research papers becoming available and easily accessible – à la the arXiv server for physics. Then, the authors, again two doctors, wrote, “medicine is not physics: the wide circulation of unedited preprints in physics is unlikely to have an immediate effect on the public’s well-being even if the material is biased or false. In medicine, such a practice could have unintended consequences that we all would regret.” Though a reasonable PoV, the overall tone appeared to stand against the principles of open science.

More importantly, both editorials, separated by almost two decades, make one reasonable argument that sadly appears to make sense to the journal only in the context of a wider set of arguments, many of them contemptible. For example, Drazen seems to understand the importance of data being available for studies to be validated but has differing views on different kinds of data. Two days before his editorial was published, another appeared in the same journal, co-authored by 16 medical researchers – Drazen one of them – this time calling for anonymised patient data from clinical trials to be made available to other researchers because it would “increase confidence and trust in the conclusions drawn from clinical trials. It will enable the independent confirmation of results, an essential tenet of the scientific process.”

(At the same time, the editorial also says, “Those using data collected by others should seek collaboration with those who collected the data.”)

For another example, NEJM labours under the impression that the data generated by medical experiments will not ever be perfectly communicable to other researchers who were not involved in the generation of it. One reason it provides is that discrepancies in the data between the original group and a new group could arise because of subtle choices made by the former in the selection of parameters to evaluate. However, the solution doesn’t lie in the data being opaque altogether.

A better way to conduct replication studies

An instructive example played out in May 2014, when the journal Social Psychology published a special issue dedicated to replication studies. The issue contained both successful and failed attempts at replicating some previously published results, and the whole process was designed to eliminate biases as much as possible. For example, the journal’s editors Brian Nosek and Daniel Lakens didn’t curate replication studies but instead registered the studies before they were performed so that their outcomes would be published irrespective of whether they turned out positive or negative. For another, all the replications used the same experimental and statistical techniques as in the original study.

One scientist who came out feeling wronged by the special issue was Simone Schnall, the director of the Embodied Cognition and Emotion Laboratory at Cambridge University. The results of a paper co-authored by Schnall in 2008 had failed to be replicated, but she believed there had been a mistake in the replication that, when corrected, would corroborate her group’s findings. However, her statements were quickly and widely interpreted to mean she was being a “sore loser”. In one blog, her 2008 findings were called an “epic fail” (though the words were later struck out).

This was soon followed by a rebuttal from Schnall, then a counter by the replicators, and then Schnall writing two blog posts (here and here). Over time, the core issue became how replication studies were conducted – who performed the peer review, the level of independence the replicators had, the level of access the original group had, and how journals could be divorced from having a choice about which replication studies to publish. But relevant to the NEJM context, the important thing was the level of transparency maintained by Schnall & co. as well as the replicators, which provided a sheen of honesty and legitimacy to the debate.

The Social Psychology issue was able to take the conversation forward, getting authors to talk about the psychology of research reporting. There have been few other such instances – of incidents exploring the proper mechanisms of replication studies – so if the NEJM editorial had stopped at calling for better organised collaborations between a study’s original performers and its replicators, it would’ve been great. As Longo and Drazen concluded, “How would data sharing work best? We think it should happen symbiotically … Start with a novel idea, one that is not an obvious extension of the reported work. Second, identify potential collaborators whose collected data may be useful in assessing the hypothesis and propose a collaboration. Third, work together to test the new hypothesis. Fourth, report the new findings with relevant coauthorship to acknowledge both the group that proposed the new idea and the investigative group that accrued the data that allowed it to be tested.”

https://twitter.com/significantcont/status/690507462848450560

The mistake lies in thinking anything else would be parasitic. And the attitude affects not just other scientists but some science communicators as well. Any journalist or blogger who has been reporting on a particular beat for a while stands to become a ‘temporary expert’ on the technical contents of that beat. And with exploratory/analytical tools like R – which is easier than you think to pick up – the communicator could dig deeper into the data, teasing out issues more relevant to their readers than what the accompanying paper thinks is the highlight. Sure, NEJM remains apprehensive about how medical results could be misinterpreted to terrible consequence. But the solution there would be for the communicators to be more professional and disciplined, not for the journal to be more opaque.

The Wire
January 24, 2016

2016.01.24
India’s OA policy: Learning from Ioannidis

India’s first Open Access policy was drafted by a committee affiliated with the Departments of Biotechnology and Science & Technology (DBT/DST) in early 2014. It hasn’t been implemented yet. Its first draft accepted comments on its form and function on the DBT website until July 25; the second draft was released last week and is open for comments until November 17, 2014. If it comes into effect, it could really expand the prevalence of healthy research practices in the Indian scientific community at a time when the rest of the world finds it hard to mandate such practices because of economies of scale and complexity.

The policy aspires to set up a national Open Access repository, akin to PubMed for biomedical sciences and arXiv for physical sciences in the West, that will maintain copies of all research funded in part or in full by DBT/DST grants. And in the spirit of Open Access publishing, its contents will be fully accessible free of charge.

According to the policy, if a scientist applies for a grant, he/she must provide proof that previous research conducted with grants has been uploaded to the repository, and the respective grant IDs must be mentioned in the uploads. Moreover, the policy also requires institutions to set up their own institutional repositories, and asks that the contents of all institutional repositories be interoperable.

The benefits of such an installation are many and great. It would solve a host of problems that are becoming more intricately interconnected and are giving rise to a veritable Gordian knot of stakeholder dynamics. India’s relatively small research community can avoid this by implementing a few measures, including the policy.

For one, calls for restructuring the Indian academic hierarchy have already been made. Here, even university faculty appointments are not transparent. The promotion of scientists with mediocre research outputs to top administrative positions stifles better leaders who’ve gone unnoticed, and their protracted tenancy at the helm often stifles new initiatives. As a result, much of scientific research has become the handmaiden of defence research, if not profitability. In the biomedical sector, for example, stakeholders desire reproducible results to determine profitable drug targets but become loth to share data from subsequent stages of the product development cycle because of their investments.

There is also a bottleneck between laboratory prototyping and mass production in the physical sciences because private sector participation has been held at bay by concordats between Indian ministries. In fact, a DST report from 2013 concedes that the government would like to achieve 50-50 investment from private and public sectors only by 2017, while the global norm is already 66-34 in favour of private.

In fact, these concerns have been repeatedly raised by John Ioannidis, the epidemiologist whose landmark paper in 2005 about the unreliability of most published medical findings set off a wave of concern about the efficiency of scientific research worldwide. It criticized scientists’ favouring positive, impactful results even where none could exist in order to secure funding, etc. In doing so, however, they skewed medical literature to paint a more revolutionary picture than prevailed in real life, and wasted an estimated 85% of research resources in the process.

Ioannidis’s paper was provocative not because it proclaimed the uselessness of a lot of medical results but because it exposed the various mechanisms through which researchers could persuade the scientific method to yield more favourable ones.

He has a ‘sequel’ paper published on the 10th anniversary of the Open Access journal PLOS Med on October 19. In this, he goes beyond specific problems – such as small sample sizes, reliance on outdated statistical measures, flexibility in research design, etc. – to showcase what disorganized research can do to undermine itself. The narrative will help scientists and administrators alike design more efficient research methods, and so also help catalyse the broad-scale adoption of some practices that have until now been viewed as desirable only for this or that research area. For India, implementing its Open Access policy could be the first step in this direction.

Making published results – those funded in part or fully by DBT/DST grants – freely accessible has been known to engender practices like post-publication peer-review and sharing of data. Peer-review is the process of getting a paper vetted by a group of experts before publication in a journal. Doing that post-publication is to invite constructive criticism from a wider group of researchers as well as exposing the experimental procedures and statistical analyses. This in turn inculcates a culture of replication – where researchers repeat others’ experiments to see if they can reach the same conclusions – that reduces the prevalence of bias and makes scientific research as a whole more efficient.

Furthermore, requiring multiple institutional repositories to be interoperable will spur the development of standardised definitions and data-sharing protocols. It will also lend itself to effective data-mining for purposes of scientometrics and science communication. In fact, the text and metadata harvester described in the policy is already operational.

Registration of experiments, which is the practice of formally notifying an authority that you’re going to perform an experiment, is also a happy side-effect of having a national Open Access repository because it makes public funds more traceable, something Ioannidis emphasizes. By declaring sources of funding, scientists automatically register their experiments. This could bring as-yet invisible null and negative results to the surface.

A Stanford University research team reported in August 2014 that almost 67% of experiments (funded by the National Science Foundation, USA) that yielded null results don’t see the light of day while only 21% of those sent to journals are published. In contrast, 96% of papers with strong, positive results are written up and 62% are published. As a result, without prior registration of experiments, details of how public funds are used for research can be distorted, to the detriment of a country that actually requires more oversight.
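A back-of-the-envelope sketch shows how such a gap distorts the published record. It assumes, as a simplification, that 62% and 21% are the overall chances of a strong and a null result being published, and that half of all experiments yield strong results; the function name and that 50% share are hypothetical.

```python
def published_share_strong(
    share_strong: float,
    pub_rate_strong: float = 0.62,  # from the figures quoted above
    pub_rate_null: float = 0.21,    # from the figures quoted above
) -> float:
    """Fraction of the published literature reporting strong results,
    given the fraction of all experiments that produced strong results."""
    if not 0.0 <= share_strong <= 1.0:
        raise ValueError("share_strong must lie between 0 and 1")
    strong = share_strong * pub_rate_strong
    null = (1.0 - share_strong) * pub_rate_null
    return strong / (strong + null)


# Assumption for illustration: half of all experiments give strong results.
share = published_share_strong(0.5)
print(f"Share of published papers with strong results: {share:.1%}")
```

Under these assumptions, strong results make up half of the experiments performed but about three-quarters of those published, which is the kind of skew that prior registration is meant to expose.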

It is definitely foolish to assume one policy can be a panacea. Ioannidis’s proposed interventions cover a range of problems in research practices, and they are all difficult to implement at once – even though they ought to be. But to have in hand a part of the solution, one capable of reforming the evaluation system in ways considered beneficial for the credibility of scientific research, and to delay its implementation would be more foolish. Even if the Open Access policy can’t acknowledge institutional nepotism or the hypocrisy of data-sharing in biomedical research, it provides an integrated mechanism to deal with the rest. It helps adopt common definitions and standards; promotes data-sharing and creates incentives for it; and emphasizes the delivery of reproducible results.

2014.10.21
