Evidence Prioritisation Can Benefit Reviews and Their Conduct
The ever-increasing volume of scientific literature is a growing problem for those in the evidence generation business. Researchers, clinicians and manufacturers alike are vying to get their clinical trials, prospective cohorts and case series into the public domain. MEDLINE, the bibliographic database behind PubMed, passed one million new records added in 2021 alone, and by some distance. Tens of thousands of hits returned by perfectly sensible search strategies are now considered par for the course. Increasingly, it is infeasible to consider absolutely everything; reviews need limits. So why does nobody talk about it?
As reported by my colleagues at the 2023 Cochrane Colloquium, an umbrella review of systematic literature reviews (SLRs) canvassing Covid-19 vaccine data demonstrates my point. Of 103 SLRs identified, not a single one mentioned how evidence could be narrowed down beyond the eligibility criteria. Five SLRs noted that quality appraisal (QA) of studies could ostensibly be used to narrow the evidence base prior to extraction; but extraction is the final stage of the review, so that accounts for only the very last step of the winnowing. Where did the rest of those 25,000 abstracts go? Across the 103 SLRs, the median proportion of initially retrieved studies that were ultimately extracted was 1.69%. Taken at face value, that is severe attrition: fewer than two in every hundred records survived. In reviews I have conducted across many disciplines, I would struggle to match that figure without prioritisation strategies following both the title/abstract and full text review stages. Surely these authors are using prioritisation strategies too?
A key tenet of SLRs is that the reported methodology is transparent and reproducible, such that other researchers could conceivably attain a similar result. Whilst systematic reviewers are obliged to specify their methods a priori, I can’t help but feel there is something going on during the conduct of many reviews that the reader is not privy to. How else can one start with a research question as broad as “What is the clinical burden of solid cancers?” and somehow end up with 25 studies at the other end? Perhaps QAs of SLRs should include the question “Did the authors use a justifiable evidence prioritisation strategy?” as part of their reporting checklists.
This brings me to my hypothesis: evidence prioritisation, rather than being a taboo, is not only essential, but can make reviews better. As alluded to above, a sensible research question is no guarantee of a sensible number of hits, nor of a sensible answer to your question. I’d posit that if one could measure the “usefulness” of an SLR’s outputs against the number of studies included, the result would look rather like a bell curve, with “peak usefulness” at, say, 20–40 studies. Utility falls away at the tails: SLRs including too few studies draw conclusions limited by a lack of evidence, while those including too many are limited by evidence too heterogeneous to synthesise. Resisting the urge to pack an SLR with as much data as possible can produce a focused, more impactful message. Even if your research area is awash with high-quality randomised trials of homogeneously-distributed patients, almost all SLRs will be subject to practical strictures. Ultimately, prioritisation allows SLRs to focus on the studies most relevant to the question originally asked.
Furthermore, the larger your review, the longer it will take. This sounds obvious, but it is seldom apparent from published results. Even the most diehard systematic reviewers at Costello Medical will be flagging after six months straight of abstract review, especially if the next stage looks similarly onerous. So what can be done?
The key word in my hypothesis is can, as in: prioritisation can make a review better. To do so, however, it must be proportionate to, and discriminating with, the information available. Removing all records whose authors’ surnames start with vowels would be broad but arbitrary; yet excluding articles that fail to report patients in the fourth or fifth treatment line, based only on what titles and abstracts say, is scarcely an improvement, since that level of detail is rarely reported there. When faced with limited information at the most outsized stage of the review, broadly applicable prioritisation criteria paired with easily identifiable data are key. Date limits are a sensible resort here, especially if they can be tied to a key milestone, such as a change in the standard of care or the emergence of a new consensus. Ten- or even five-year limits have a place in reviews of costs, especially considering how inflation erodes the applicability of cost data collected even a few years ago.
Prioritising by study design or “evidence level” – randomised evidence ahead of merely prospective evidence, ahead of retrospective data – is again sensible, but not a definitive solution. Even with a full publication to hand, a surprising number of studies in the literature are not explicit, or even implicit, as to whether they are interventional or observational, much less whether they were conducted retrospectively or prospectively; and not all studies of a given design are created equal. So care must be taken.
The particular disposition of a population of interest may not become clear until an SLR is well underway, at which point you should be asking: what is the most important aspect of these patients to focus on? Which patient subgroups are most relevant? And, crucially, which outcomes matter most? It is often insisted at the outset that SLRs should capture absolutely everything, but is there really value in extracting a speculatively-measured biomarker or an anecdotal patient-reported outcome? Whilst variety in endpoints is certainly interesting, and may signal the future direction of an area, hard, relatively homogeneous clinical endpoints with precedent, such as response and survival, are more likely to be valued by payers and other key decision-makers. However, the decision to narrow the outcomes in focus may need to wait until after full text review, or else you risk confirmation bias.
Crucially, evidence prioritisation only makes a review better if it is transparently documented and justified. Whilst innovations such as artificial intelligence (AI) and literature-based discovery may reduce the burden of reviewing and yield hypotheses not yet considered, SLRs will continue to require taming to ensure they remain timely and relevant. So let’s embrace evidence prioritisation. Let’s talk about it.