In reply
Network meta-analyses are not about a single treatment but about sets of regimens
Maiese, E. M., Ainsworth, C., Le Moine, J., Ahdesmäki, O., Bell, J., & Hawe, E. (2019). In reply: Network meta-analyses are not about a single treatment but about sets of regimens. Clinical Therapeutics, 41(1), 188-190. Advance online publication. https://doi.org/10.1016/j.clinthera.2018.11.011
Abstract
In a seeming effort to advance the doublet of carfilzomib + dexamethasone as the preferred treatment option for relapsed/refractory multiple myeloma, Panjabi and Iskander of Amgen (Amgen, South San Francisco and Thousand Oaks, California) challenge our recent network meta-analysis (NMA) on the comparative efficacy of treatments for previously treated multiple myeloma.1 Considering the paucity and lack of depth of their arguments in addition to the questionable scientific and clinical rationale, we challenge herein Panjabi and Iskander's position that our NMA "should not be used to inform treatment decisions because the analysis was significantly flawed" as well as their innuendo that a carfilzomib-based doublet offers better efficacy than other treatments based on naive, cross-trial comparisons of medians.
Panjabi and Iskander suggest that the trials included in the NMA violated the similarity and consistency assumptions of NMAs, leading to biased treatment effects. In support, they cite part 2 of the publication from the International Society for Pharmacoeconomics and Outcomes Research (ISPOR) Task Force on Indirect Treatment Comparisons Good Research Practices, Conducting Indirect Treatment Comparison/Network Meta-analysis Studies.2 That report indeed addresses the issue of similarity but with a caution, citing Fu et al,3 that "no commonly accepted standard [defines] which studies are 'similar enough.'" Panjabi and Iskander do not specify their criteria for similarity, nor do they offer evidence as to how our NMA violated the similarity assumption. Similarly, the ISPOR guideline emphasizes that consistency is a property of direct and indirect evidence for treatment comparisons, not of individual comparisons. However, Panjabi and Iskander seem to assume the latter in their rather singular focus on carfilzomib. As we noted within the Discussion section of the article,1 we concur with the literature that NMAs have limitations, and so does ours. Despite these limitations, however, we equally concur with the field that there is a distinct need to support clinicians in appraising the comparative efficacy of treatment options based on the available evidence base. The synthesis of trial results through NMA is a well-recognized scientific, clinical, and regulatory approach that enables broader comparative efficacy questions to be addressed when head-to-head evidence is lacking and most likely never will exist. As part of the analysis, heterogeneity and consistency analyses were performed and did not show any significant differences or indicate bias. Due to the small size of the network, adjustment for baseline characteristics was inconclusive.
We agree that it would be preferable if an NMA like ours could additionally consider the comparative efficacy of treatments using overall survival and quality of life. However, as Panjabi and Iskander noted, quality of life has been inconsistently reported in relapsed/refractory multiple myeloma treatment trials, but they fail to acknowledge that this inconsistency makes an NMA evaluating quality of life infeasible. Overall survival data from the CASTOR (Phase 3 Study Comparing Daratumumab, Bortezomib, and Dexamethasone [DVd] vs Bortezomib and Dexamethasone [Vd] in Subjects With Relapsed or Refractory Multiple Myeloma)4 and POLLUX (Phase 3 Study Comparing Daratumumab, Lenalidomide, and Dexamethasone [DRd] vs Lenalidomide and Dexamethasone [Rd] in Subjects With Relapsed or Refractory Multiple Myeloma)5 trials were immature at the time of the analysis, with median overall survival not yet reached—which has been sufficiently convincing to scientists, clinicians, and regulators. Further, without patient-level data, differences in complex treatment pathways postprogression across trials cannot be fully addressed, preventing comparison analyses of overall survival across trials adjusted for differential postprogression treatments.
In addition, while we agree that considering safety profiles between treatments is of interest, the feasibility of including safety is limited to only those adverse events that are the same across treatments. For example, if one treatment is associated with a particular adverse event that is not observed with another treatment, then the odds ratio estimated in the NMA would be infinite. The NMA would then not provide a clear comparison of adverse-event profiles when those profiles differ by treatment. Thus, we agree with Panjabi and Iskander that, in principle and in retrospect, data on quality of life, overall survival, and safety would have been of interest. We disagree, however, that NMAs should be held accountable for what was not or was inconsistently included in the trials. NMAs are about the available evidence, not the desired evidence.
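The point about infinite odds ratios can be illustrated with a minimal sketch. The event counts below are hypothetical and are not taken from any trial in the NMA; the function name `odds_ratio` is likewise only an illustration. The sketch shows that when an adverse event is never observed in one arm, the odds ratio from the raw counts has a zero denominator and is no longer a finite, interpretable estimate.

```python
import math


def odds_ratio(events_a, n_a, events_b, n_b):
    """Odds ratio for an adverse event, arm A versus arm B, from raw counts."""
    a, b = events_a, n_a - events_a  # arm A: events, non-events
    c, d = events_b, n_b - events_b  # arm B: events, non-events
    if b * c == 0:
        # Zero denominator: the estimate is infinite (or undefined if the
        # numerator is also zero), so no finite comparison is available.
        return math.inf if a * d > 0 else math.nan
    return (a * d) / (b * c)


# Hypothetical counts, for illustration only.
print(odds_ratio(12, 200, 5, 200))  # event seen in both arms: finite estimate
print(odds_ratio(12, 200, 0, 200))  # event never seen in arm B: infinite
```

With the first set of counts the estimate is finite; with the second, where arm B has no events, the function returns infinity, which is the situation described above.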
Panjabi and Iskander argue that the UK National Institute for Health and Care Excellence (NICE) committee has stated that "capping treatment duration instead of time to progression would reduce Vd [bortezomib plus dexamethasone] efficacy"; however, the evidence review group commissioned by NICE to independently appraise available treatment options noted that the method used to evaluate the reduction in efficacy (matching-adjusted indirect comparison) provided unreliable results.6 Therefore, the evidence review group presented an analysis in which it assumed no reduction in efficacy while capping treatment duration to 8 cycles (italics ours).
In stating that "relative efficacy (PFS [progression-free survival]) of a triplet regimen daratumumab plus Vd (DVd) was compared with a doublet carfilzomib plus dexamethasone (Kd)," Panjabi and Iskander seem to imply that the number of agents in a regimen should drive the comparison between therapeutic regimens, not the efficacy that regimens, whether including 2 or 3 agents, may have for a given indication. Our NMA investigated the relative efficacy of treatments that were approved by the US Food and Drug Administration, including doublet and triplet regimens. Comparisons between doublets and triplets are common and have been included consistently in both trials and NMAs of such trials.7, 8 The general consensus of these comparisons is that triplets achieve better PFS than do doublets.9, 10, 11 Furthermore, we caution Panjabi and Iskander against making naive comparisons of median PFS across trials. In part 1 of its report, the ISPOR Task Force referenced above emphasizes that relative effects are a more appropriate metric.12 The PFS hazard ratio (95% CI) was 0.31 (0.24–0.39) for DVd versus Vd in CASTOR4 and 0.53 (0.44–0.65) for Kd versus Vd in ENDEAVOR (Carfilzomib and Dexamethasone Versus Bortezomib and Dexamethasone for Patients With Relapsed or Refractory Multiple Myeloma).13
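To make concrete why relative effects, rather than medians, are the natural input for cross-trial comparison, the following is a minimal sketch of an adjusted indirect comparison through the common comparator Vd (the Bucher method). It uses only the two hazard ratios and confidence intervals quoted above. It is an illustration under simplifying assumptions, not the NMA model used in our article: standard errors are back-calculated from the reported 95% CIs assuming normality on the log scale, and the function names are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_hr_and_se(hr, lower, upper):
    """Recover log(HR) and its standard error from a reported 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return math.log(hr), se


def bucher_indirect(hr_ac, ci_ac, hr_bc, ci_bc):
    """Indirect HR for A vs B via common comparator C, with a 95% CI."""
    log_ac, se_ac = log_hr_and_se(hr_ac, *ci_ac)
    log_bc, se_bc = log_hr_and_se(hr_bc, *ci_bc)
    log_ab = log_ac - log_bc                  # relative effects subtract on the log scale
    se_ab = math.sqrt(se_ac**2 + se_bc**2)    # variances of independent trials add
    return (
        math.exp(log_ab),
        math.exp(log_ab - Z_95 * se_ab),
        math.exp(log_ab + Z_95 * se_ab),
    )


# PFS hazard ratios quoted in the text: DVd vs Vd (CASTOR), Kd vs Vd (ENDEAVOR).
hr, lower, upper = bucher_indirect(0.31, (0.24, 0.39), 0.53, (0.44, 0.65))
print(f"Indirect HR, DVd vs Kd: {hr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Because each trial's hazard ratio is measured against its own randomized Vd arm, differences in absolute median PFS between trials cancel out of this calculation, which is not the case for a naive comparison of medians. The output should be read only as an illustration of the arithmetic, not as an estimate reported in the NMA.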
In relation to the toxicity profile of doublet and triplet therapies, in our NMA, carfilzomib was included in the evidence base both as a triplet (carfilzomib, lenalidomide, and dexamethasone [KRd]), using results from the ASPIRE (Phase 3 Study Comparing Carfilzomib, Lenalidomide, and Dexamethasone [KRd] vs Lenalidomide and Dexamethasone [Rd] in Subjects With Relapsed or Refractory Multiple Myeloma) trial,14 and as a doublet, Kd, based on data from the ENDEAVOR trial.13 From the latter, we note that patients in the Kd arm received a carfilzomib dose (56 mg/m²) twice that received in the KRd arm. It would be of interest to investigate whether there is a significant difference between Kd and KRd or other triplets. Furthermore, we note that the percentage of overall adverse events was highly comparable between the Kd and Vd arms in ENDEAVOR (98.3% vs 98.0%, respectively)13 and between the DVd and Vd arms in CASTOR (98.8% vs 95.4%, respectively).4
Finally, Panjabi and Iskander note that there were "differences in the median PFS for Rd across trials." While we acknowledge that there were some differences in median survival, Panjabi and Iskander seemingly fail to recognize that the approach of NMAs is based on relative differences between PFS values. Similarly, Panjabi and Iskander overlook that we compared baseline characteristics, the results of which are discussed within the article.1 Of note, while differences in some baseline characteristics were observed, a between-trial heterogeneity analysis did not indicate any significant differences between trials—thus rendering this argument by Panjabi and Iskander moot as well.
Our analysis and the conclusions we have drawn are robust and well founded, although we also clearly present the limitations of the analysis. We strongly disagree with Panjabi and Iskander when they state that our NMA "should not be used to inform treatment decisions because the analysis was significantly flawed." Our findings may not be as favorable to carfilzomib as the authors may have wished. However, they are significantly more robust—scientifically, statistically, and clinically—than the broadly generalizing approach taken by Panjabi and Iskander.
In summary, our NMA rigorously shows the comparative efficacy of doublet and triplet regimens for previously treated multiple myeloma. Our conclusions are based on this evidence. In contrast, Panjabi and Iskander have presented, at best, a poorly substantiated dismissal of our NMA in support of a treatment regimen with carfilzomib—evidence for which is wanting.