et al [37], which evaluated seven studies using a combination
of T2WI, DWI, and DCE-MRI, the pooled sensitivity and
specificity were 0.74 (95% CI 0.66–0.81) and 0.88 (95% CI
0.82–0.92), respectively. In a more recent meta-analysis by
Hamoen et al [6], which analyzed 14 studies using
PI-RADSv1, the pooled sensitivity and specificity were 0.78
(95% CI 0.70–0.84) and 0.79 (95% CI 0.68–0.86), respectively.
However, setting the results of these three studies side by side
provides only an indirect comparison. To address this
issue, we separately assessed a subgroup of studies using
both PI-RADSv1 and PI-RADSv2. In a head-to-head comparison
between them, PI-RADSv2 demonstrated higher pooled
sensitivity (0.95) compared with PI-RADSv1 (0.88, p = 0.04)
without a statistically significant difference in specificity
(0.73 vs 0.75, p = 0.90). This increase in sensitivity
compared with its predecessor may imply that the revisions
undertaken during the development of PI-RADSv2, includ-
ing the introduction of dominant sequences according to
zonal anatomy, limited contribution of DCE-MRI secondary
to DWI and T2WI, and specific guidelines for deriving an
integrated overall score, were, in fact, on the right track.
In particular, we speculate that the use of dominant
sequences, that is, DWI for the PZ and T2WI for the
TZ, may have been crucial for the improved sensitivity
without a loss in specificity, as suggested by Baur et al [10].
Considering that one of the main intentions for the
generation of PI-RADS was to standardize reporting of
mpMRI in order to decrease variability and bring about
widespread acceptance and implementation in daily
practice, it was promising to find that nearly all (20 of
21) studies used PI-RADSv2 strictly according to published
guidelines [11]. Only one study formed PI-RADSv2 scores
from existing clinical radiological reports that were based
on PI-RADSv1 or an in-house scoring system [29]. This is an
improvement when compared with prior studies conducted
using PI-RADSv1, where investigators used varying meth-
ods in determining the overall score (overall five-point
score or sum of the scores from each modality) [6]. Still,
there is a need for further clarification regarding the cutoff
value for detecting PCa. In the studies included in our meta-
analysis, cutoff values were predefined in only six studies,
while the majority (15/21) were exploratory in nature,
testing multiple criteria. When using a cutoff value of 4,
sensitivity (0.89) and specificity (0.74) were generally good,
whereas using 3 yielded excellent sensitivity (0.95) and
poor specificity (0.47). These results may be taken into
consideration when generating the next updated PI-RADS.
For instance, the former may be adequate for general
use of PI-RADS, whereas the latter could be proposed
when a higher cancer detection rate is clinically
required (eg, persistently high PSA level despite a previously
negative biopsy).
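
To make the trade-off between the two cutoff values concrete, the pooled point estimates quoted above can be converted into positive and negative likelihood ratios. The following is only an illustrative sketch: the likelihood ratios are not reported in the text but follow arithmetically from the pooled sensitivity and specificity, and the calculation ignores the confidence intervals and the correlation between sensitivity and specificity. The function and variable names are hypothetical.

```python
from __future__ import annotations


def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    lr_pos = sensitivity / (1.0 - specificity)   # how much a positive result raises the odds
    lr_neg = (1.0 - sensitivity) / specificity   # how much a negative result lowers the odds
    return lr_pos, lr_neg


# Pooled point estimates quoted in the text, keyed by PI-RADSv2 cutoff value.
# Values are (sensitivity, specificity).
POOLED = {
    4: (0.89, 0.74),
    3: (0.95, 0.47),
}


def summarize() -> dict[int, tuple[float, float]]:
    """Compute (LR+, LR-) for every cutoff in POOLED."""
    return {cutoff: likelihood_ratios(se, sp) for cutoff, (se, sp) in POOLED.items()}


if __name__ == "__main__":
    for cutoff, (lr_pos, lr_neg) in summarize().items():
        print(f"cutoff {cutoff}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

With these point estimates, a cutoff of 4 gives LR+ of about 3.4 and LR- of about 0.15, whereas a cutoff of 3 gives LR+ of about 1.8 and LR- of about 0.11. In other words, lowering the cutoff makes a negative result slightly more reassuring but makes a positive result considerably less informative.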
In the current study, subgroup analyses were performed
to account for differences in outcomes (any cancer vs
clinically significant cancer). There was no significant
difference between the two outcomes, irrespective of whether
a cutoff value of 3 or 4 was used. However, the definition
of clinically significant cancer was different among the
13 studies. Only three studies defined csPCa strictly
according to the PI-RADSv2 guidelines (Gleason score
≥7 [3 + 4], volume ≥0.5 ml, or extraprostatic extension)
[11]. Most others used one or two of the three criteria.
Including only the former three studies may have provided
more robust results; however, including all available studies
was not only pragmatic but also presents a general overview
of the existing literature, as this is the first meta-analysis
of studies dealing with PI-RADSv2.
In this meta-analysis, we looked into the technical
aspects of MRI.
Meta-regression analyses revealed that the
use of endorectal coil was not a statistically significant
factor. Furthermore, although the difference between magnet
strengths of 3 and 1.5 T was statistically significant, it
did not prove to be clinically meaningful (sensitivity of
0.90 vs 0.89, respectively; p = 0.03). Although there had
been debate over these two issues in the past, both 3 and
1.5 T are now well established, and the overall benefit of
using an endorectal coil is not evident [38,39]. The
PI-RADSv2 guidelines currently recommend either usage, and
the results of our study provide additional evidence to
support this.
Regarding the methods of analysis in the studies, there
was significant heterogeneity in the reference standard
and type of analysis. Radical prostatectomy was the
reference standard in five studies, while the majority were
based on a combination of systematic and targeted biopsies.
The possibility of PCa despite negative biopsy results in the
latter group should be kept in mind. In addition, approximately
half of the studies reported outcomes in a per-patient
manner (n = 11) and the other half in a per-lesion manner
(n = 10). Per-lesion
analysis is known to take into account the performance of
localizing the disease; however, this was not shown to be a
significant factor in the
meta-regression analysis.
Our meta-analysis had some limitations. Nearly all
studies were retrospective in study design, resulting in a
high risk of bias for patient selection. It is possible that
pooling data from predominantly retrospective studies may
have led to increased diagnostic sensitivity [40]. However,
a meta-analysis restricted to the three prospective studies
would have been technically unfeasible, and its results would
not have been representative of the existing literature
on PI-RADSv2. Furthermore, we used validated
methods for the systematic review and reported the data
using standard reporting guidelines, including PRISMA and
the guidelines of the Handbook for Diagnostic Test Accuracy
Reviews published by the Cochrane Collaboration [12,41].
Another limitation is considerable heterogeneity
in our pooled analysis, which affected the general applica-
bility of our summary estimates. To explore the heteroge-
neity of our data, we performed
meta-regression and
multiple subgroup analyses. According to the analyses, the
proportion of patients with PCa, the magnetic field strength,
and the reference standard were significant factors affecting
the heterogeneity. In particular, the reference standard
included various methods, including radical prostatectomy
and a combination of systematic and targeted biopsies (ie,
MRI guided, MRI-transrectal ultrasound fusion, or cogni-
tive). Furthermore, the fact that various definitions were
EUROPEAN UROLOGY 72 (2017) 177–188




