Article type: This is a pre-print. This article has not yet been peer-reviewed.
Authors: David M. Miller1,2, Sophia Z. Shalhout1,2, Farees Saqlain2, Vishal Patel3, Kenneth Y. Tsai4, Ravikumar Komandur5, Bill Louv5 and Michael K. Wong6
1Department of Medicine, Division of Hematology/Oncology and the Department of Dermatology, Massachusetts General Hospital, Boston, MA.
2Harvard Medical School, Boston, MA.
3Department of Dermatology, George Washington School of Medicine & Health Sciences, Washington, DC.
4Department of Anatomic Pathology, Department of Tumor Biology, Donald A. Adam Melanoma and Skin Cancer Center of Excellence, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL.
5Project Data Sphere, Morrisville, NC.
6Department of Melanoma Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, Texas.
*Corresponding author: David M. Miller MD PhD
Massachusetts General Hospital
Boston MA 02114
Funding sources: Project Data Sphere, the American Skin Association and ECOG-ACRIN.
Conflicts of interest: DMM has received honoraria for participating on advisory boards for Checkpoint Therapeutics, EMD Serono, Pfizer, Merck, Regeneron and Sanofi Genzyme. This article reflects the views of the authors only.
Manuscript word count: 2621
Abstract word count: 292
Keywords: Merkel cell carcinoma, tumor registry, real world data
Acknowledgments: We would like to acknowledge Paul Nghiem, Arthur Sober, Manisha Thakuria, Lauren Haydu, Kristina LaChance, Chris Bichakjian, Isaac Brownell, Martha Donoghue, Megan Granda and Johnathan Rine for their participation in Task Force meetings
Abbreviations: CDISC: Clinical Data Interchange Standards Consortium, DAG: data access group, DCI: data collection instrument, EDC: electronic data capture, FDA: U.S. Food and Drug Administration, HITECH: Health Information Technology for Economic and Clinical Health, MCC: Merkel cell carcinoma, MCPyV: Merkel cell polyomavirus, MGB: Mass General Brigham, NCDB: National Cancer Database, NIH: National Institutes of Health, PDS: Project Data Sphere, PFS: progression-free survival, REDCap: Research Electronic Data Capture, RWD: real-world data
The Merkel Cell Carcinoma (MCC) Patient Registry is a national multi-institutional collaborative effort that will prospectively follow and record outcomes and events in MCC patients. MCC is the prototypical rare tumor, and this Registry will pioneer new methodologies that enable multiple investigators to examine real-world outcome data in real time. Deliverables from the Registry include: (i) precise patient stratification into risk categories, (ii) identification of best practices, (iii) real-world data for drug development programs, (iv) insights into the optimal sequencing and combination of therapies, (v) detection of low-incidence toxicities and (vi) generation of novel testable hypotheses. Importantly, the Registry offers a way forward in the yet-unsolved dilemma of drug development for rare tumors, since its design will allow for the creation of highly defined patient-level data that can be used as a robust comparator for single-arm Phase I-II clinical trials. The MCC Task Force comprises members from academic medical centers, the drug industry, the NIH and the FDA. Project Data Sphere, LLC provides a secure, open-access data sharing platform and comprehensive support to optimize research performance and ensure rigorous and timely results. The Registry is currently in development and is based on a REDCap database integrated into the host institution’s electronic medical record. We plan to have the first patient accessioned on Project Data Sphere’s data platform in Q1 of 2022. The Merkel Cell Carcinoma Registry Task Force represents a joint effort of research and clinical investigators from academia, industry and regulatory science to develop the first publicly held MCC registry on Project Data Sphere’s open-access data platform. Our hope is that this shared repository will allow investigators to identify new approaches, improve treatment outcomes, shorten the time from discovery to implementation and, ultimately, improve patient lives.
Although many thoughtful investigators have contributed important epidemiological data to the MCC field, much of the data we have now originates from either well-curated, single-institution databases1,2,3 or large administrative datasets4,5,6,7,8 that lack important clinical details. Consequently, our understanding of the natural history of the disease is limited both by our inability to generalize from single-institution data and by the lack of nuance in the larger administrative datasets. These dilemmas have been discussed at several recent international Merkel Cell Carcinoma Workshops and Symposia9,10,11. Numerous investigators have raised the point that the data currently circulating in the public arena, and thus frequently accessed by patients, do not match their clinical experience of the disease. Of great concern are outcome estimates that project a disease course more morbid than what leading experts in the field believe to be true12. Many at these workshops agreed that the discrepancy between the public information available online and the clinical experience of expert investigators is due to widespread inaccuracies and information gaps in the large administrative databases upon which the general public too often relies for information about MCC. There is clearly a critical need for an accurate, publicly accessible repository of information. This research Task Force represents a multi-institutional collaboration between lead investigators from academia, industry and the government to establish a large, well-curated, publicly accessible data set that more accurately defines the natural history of the disease. The need for an open repository of reliable patient-level data on MCC is urgent: it informs clinical decisions about treatment and helps us identify risk categories for clinical trials and follow-up patient care.
The ability to identify precise explanatory variables that predict clinical outcomes is critical in every malignancy, and is particularly challenging in rare diseases such as MCC. Examples of covariates that may prognosticate clinical course include: immunosuppression13,14, Merkel cell polyomavirus (MCPyV) status15,16,17, PD-L1 expression18 and various molecular changes in the tumor16,19. However, inconsistencies - which many speculate are due to natural variations in small data sets - hamper our ability to interpret and apply this information in clinical practice.
A significant example of this inconsistency is evident in the most recent AJCC MCC staging system, which is based heavily on the National Cancer Database (NCDB)20. Unfortunately, the NCDB does not cover the entire spectrum of the disease and lacks critical outcome information, such as recurrence rates or disease-specific survival. Thus, an MCC registry that more precisely captures the clinical characteristics and outcomes of patients is needed to define predictive and prognostic covariates more accurately.
The lack of a consistent biology, natural course of progression and presentation in MCC makes it difficult to assess the benefits of current neoadjuvant, adjuvant and surveillance maneuvers. A registry of aggregated data sets would help clinician investigators identify the optimal approaches for each of these areas.
MCC possesses the uncommon characteristic of being both extremely radiosensitive and extremely chemosensitive. Unfortunately, the high systemic recurrence rate after radiotherapy and the short progression-free survival (PFS) after chemotherapy limit the effectiveness of both as primary therapies. Registry information would provide investigators the opportunity to ask questions about the role of these maneuvers as preconditioning or salvage options.
An MCC registry captures information across the entire spectrum of cancer care - surgery, radiotherapy, medical interventions and supportive care over time - in ways that a formal clinical trial is not structured to capture. The amalgamation of information across the entire disease trajectory can expose best practices that can be implemented with what is at hand.
Reaping the full benefit of the MCC registry will require active oversight and analysis, similar to the effort required to maintain the PROCLAIM (interleukin-2) and Bone Marrow Transplant registries. Such pre-planned and scheduled tasks have been built into the standard protocol of the MCC registry.
At the time of this writing, there are only two systemic therapies with an FDA-labeled indication for Merkel cell carcinoma: avelumab (Bavencio™) and pembrolizumab (Keytruda®). The JAVELIN Merkel 200 trial, upon which the approval of avelumab was based, demonstrated a 33% overall response rate (ORR), with only 45% of patients still in response at the end of year 1. In KEYNOTE 017, a study of front-line therapy with pembrolizumab, the ORR was 56%, with response durations ranging from 5.9 to 34.5 months. Thus, the need for additional therapeutic modalities is of high priority.
Rare tumors such as MCC present significant challenges to conventional drug development. As discussed in the sections above, understanding the presentation, natural history, and predictive and prognostic covariates of the disease is important. However, the low prevalence of the disease precludes traditional head-to-head comparison trials. Indeed, the two practice-changing trials defining PD-1/PD-L1 therapy in MCC, namely JAVELIN Merkel 200 and KEYNOTE 017, were uncontrolled, single-arm, open-label trials enrolling 88 and 50 patients, respectively.
In order for a clinical trial to be effective, it should be designed with adequate statistical power to render a decision when one intervention is compared against another, and to then draw conclusions with a relative degree of certainty. In the case of rare diseases, such as MCC, it is unlikely that the requisite number of patients needed for robust analysis would enroll within the limited time span of a typical clinical trial.
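The enrollment problem described above can be made concrete with a textbook sample-size approximation. The following is a minimal sketch, not an analysis plan of the Registry: it uses the standard normal-approximation formula for comparing two response rates with equal allocation, and the response rates passed to it (30% and 50%) are purely hypothetical values chosen for illustration, not results from any MCC trial.

```python
from math import ceil, sqrt
from statistics import NormalDist


def patients_per_arm(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate per-arm sample size for a two-sided comparison of two
    response rates (normal approximation, equal allocation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # critical value for the type I error
    z_beta = z.inv_cdf(power)            # quantile matching the desired power
    p_bar = (p_control + p_treatment) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_control * (1 - p_control)
                        + p_treatment * (1 - p_treatment))
    ) ** 2
    return ceil(numerator / (p_treatment - p_control) ** 2)


if __name__ == "__main__":
    # Hypothetical response rates, chosen only to illustrate the calculation.
    n = patients_per_arm(0.30, 0.50)
    print(f"per arm: {n}, total: {2 * n}")
```

Under these assumed rates, the total for a two-arm trial is well above the enrollment of either single-arm trial cited above, and the requirement grows quickly as the assumed difference between arms narrows.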
A well-designed, curated registry with large datasets would offer a solution to this genuine dilemma. Such a registry would serve as a more accurate comparator in analyses of Phase II data, such as those from JAVELIN Merkel 200 and KEYNOTE 017. It could help identify high-risk populations, and may also offer clues to the inherent differences between virus-positive and wild-type MCC. Clues to therapeutic synergies could be inferred from its ability to capture subsequent therapies. Most importantly, the identification of unique rare toxicities, as well as best practices (discussed below), will provide immediate improvement to patient treatment and care.
Real-world data (RWD) is an emerging and promising paradigm of clinical evidence generation that has the potential to augment existing strategies of regulatory science. While uptake of health information technologies has increased significantly since the implementation of the 2009 Health Information Technology for Economic and Clinical Health (HITECH) Act21, current electronic health record (EHR) systems have limitations as sources of RWD: they contain large volumes of unstructured data, and the data that are structured focus on billing and practice management. Thus, collection of clinically relevant variables in structured formats is critical to be able to leverage advances in data science technologies. Furthermore, prospectively executed tumor registries developed to harmonize with data standards required by the US Food and Drug Administration, such as those of the Clinical Data Interchange Standards Consortium (CDISC), are necessary to optimize extraction, transformation and exchange of the collected data.
In addition, prospective tumor registries have the potential to complement conventional clinical trials. Through randomized controlled design, such trials attempt to minimize threats to internal validity; however, because of strict and often narrow eligibility criteria, they often have deficits in external validity. Through purposeful prospective data capture at the point of care for research, the MCC tumor registry has the potential to obtain a much broader representation of MCC cases, including patients with poor performance status and significant co-morbidities. Great care is needed to address the threats to internal validity inherent in real-world studies, such as information, confounding, compliance and selection bias. With thoughtful study design, however, the Project Data Sphere (PDS) MCC registry can be used as a platform for proof-of-concept studies using RWD to support a supplemental indication of a labeled medication or to fulfill a postmarketing requirement of an accelerated approval.
An effective tumor registry needs to have several critical attributes. These should include, but not be limited to, the following: 1) a low economic barrier to entry, 2) research-grade functionality, 3) data security, 4) web-based access, 5) accessibility by multiple providers, 6) customizability, 7) support for mid-study modifications, 8) importation of external data, 9) export of results to common statistical packages, 10) compliance with 21 CFR Part 11, FISMA and HIPAA standards, and 11) the ability to harmonize with Clinical Data Interchange Standards Consortium (CDISC) standards.
Databases that are in wide use and have broad support will allow for growth and expansion to a multitude of institutions and resist obsolescence. A user-friendly Graphical User Interface (GUI) lowers the workload and is essential for rapid adoption and adherence. In this ever-more connected world, the ability to work across multiple electronic devices, from smartphones to desktops, has become a necessity.
While there are a variety of electronic data capture (EDC) platforms available that possess several of the important characteristics listed, our Task Force has determined that the Research Electronic Data Capture (REDCap®) system offers several distinct advantages, making it worthy of selection as the Registry’s primary platform.
REDCap® was created by Vanderbilt University in 2004 and is supported by a consortium of 5266 institutions using REDCap® in 146 countries22,23. To date, there are over 1.8 million end-users24. Importantly, REDCap® software and consortium support are available at no charge for non-profit organizations. In addition, REDCap® has several other features that increase its usability as a Registry platform. For instance, secure Data Access Groups (DAGs) can be created within one project, enabling password-protected access to distinct groups. Furthermore, given that REDCap® is browser based, investigators from institutions without REDCap® support can join existing projects. Therefore, one centralized project can be utilized by independent investigators, each with secure access to their own data. This ensures that multi-institutional investigators are capturing data using the same data collection instruments (DCIs), which provides a unified code and structure to the data and shortens both time-to-analysis and time-to-insight.
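The arrangement described above, one centralized project with site-scoped views and shared instruments, can be sketched in a few lines. This is an illustrative model only and does not call or reproduce the REDCap® software or its API: the `Record` class, the instrument name, the field names and all values are hypothetical placeholders, and only the DAG labels (MGB, GWU, Moffitt) come from the Registry's pilot sites.

```python
from dataclasses import dataclass


@dataclass
class Record:
    record_id: str
    dag: str          # data access group the record belongs to
    instrument: str   # hypothetical DCI name
    values: dict      # field name -> value, as defined by the shared DCI


# Placeholder records; the field names and values are invented for illustration.
RECORDS = [
    Record("001", "MGB", "presentation", {"age_at_dx": 71, "mcpyv_status": "positive"}),
    Record("002", "MGB", "presentation", {"age_at_dx": 64, "mcpyv_status": "negative"}),
    Record("003", "GWU", "presentation", {"age_at_dx": 80, "mcpyv_status": "unknown"}),
    Record("004", "Moffitt", "presentation", {"age_at_dx": 58, "mcpyv_status": "positive"}),
]


def visible_records(records, user_dag):
    """Return only the records that belong to the user's own DAG."""
    return [r for r in records if r.dag == user_dag]


def share_one_schema(records):
    """True if every record of the same instrument uses the same field names."""
    schemas = {}
    for r in records:
        fields = frozenset(r.values)
        if schemas.setdefault(r.instrument, fields) != fields:
            return False
    return True


if __name__ == "__main__":
    print([r.record_id for r in visible_records(RECORDS, "MGB")])
    print(share_one_schema(RECORDS))
```

Because every site fills in the same instrument, the records differ only in their DAG label, which is what allows later pooling without a separate harmonization step.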
Documenting the presentation and outcomes of the disease is the primary task for all tumor registries, as it informs all following tasks and guides decision-making. Examples include documenting MCC presentation, overall survival, disease-specific survival, progression-free survival, and toxicity. The Task Force has agreed that the primary purpose of the Registry is a natural history study in order to: 1) Precisely characterize the presentation and natural history of MCC; 2) Identify best practices in MCC; and 3) Identify prognostic and predictive biomarkers in MCC. Secondary objectives include developing a platform to explore RWD as real-world evidence (RWE) for drug development. Specifically, RWD from the MCC patient registry could be used to: 1) Generate hypotheses to be tested in clinical trials; and 2) Generate RWE to support drug repurposing/label modification. Examples of the latter include: 1) Adding safety information to Section 6 of the product label and/or 2) Providing RWE to fulfill a post-marketing requirement, to support a regulatory decision or to add/modify a labeled indication.
Table 1: MCC Patient Registry Data Collection Instruments
No registry will survive if it cannot be easily integrated into routine clinical practice. Therefore, we designed a multi-step rollout of the Registry, making adjustments based upon user feedback.
The initial rollout is being conducted at a select number of sites in order to assess the robustness of the Registry and to identify and respond to unanticipated problems in real time. In keeping with the recommendation from the FDA’s Draft Guidance “Rare Diseases: Natural History Studies for Drug Development: a Guidance for Industry”26, the rollout plan is commencing with a Pilot Study. The purpose of the Pilot Study is to clarify: 1) What data elements to collect, 2) How to optimally code the data, and 3) How to standardize the information collection in a way that facilitates analysis. The current Pilot Study is being carried out at Mass General Brigham, George Washington University Hospital and Moffitt Cancer Center. A REDCap® Data Access Group (DAG) has been created for each site and designated MGB, GWU and Moffitt, respectively. Following the Pilot Study, we anticipate that additional sites (e.g. MD Anderson (MDA)) will join to increase the sample size of the Registry.
Paramount to successful multi-institutional collaboration is the development of strategies to ensure alignment of incentives for individual stakeholders. A principal consideration for the MCC registry Task Force was the development of a data access model that appropriately incentivizes participants to contribute their captured data. Our model incorporates three tiers of data access (Figure 1). Tier 1 is a closed-access phase in which only investigators within a given DAG (e.g. MGB) are able to view and analyze that DAG's data. If an investigational team within a specific DAG has a novel hypothesis that they wish to test with their single-institution data set, they have the ability to do so. The Task Force appreciated that there are unique deliverables at individual institutions that necessitate a degree of academic freedom. This maximizes a team’s incentive to collect data and communicate results to meet their individualized needs. Nevertheless, important hypotheses in rare disease can rarely be tested effectively with single-institution data sets. Consequently, participants have a strong incentive to aggregate their data with those of other institutions to increase the power of their analyses. Therefore, in Tier 2, de-identified single-institution data are migrated onto the PDS platform and aggregated with data from other DAGs. This compiled data set will be embargoed for an agreed-upon time frame (e.g. 12-24 months). These embargoed data will be available only to those investigators who have contributed data. Given that all the data are collected using the same DCIs, aggregation on the PDS platform does not require harmonization. Following the embargo period, in Tier 3, the data are made freely available to registered users on the PDS platform.
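The three-tier access model described above can be expressed as a small set of rules. The following is a minimal sketch under stated assumptions rather than the PDS platform's actual access-control logic: the class and function names are hypothetical, the embargo is modeled as a whole number of months counted from the migration date, and the 12-month default is simply the lower end of the example range given in the text.

```python
import calendar
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Tier(Enum):
    SITE_ONLY = 1        # Tier 1: data visible only within the originating DAG
    EMBARGOED_POOL = 2   # Tier 2: pooled on the platform, contributors only
    OPEN = 3             # Tier 3: available to all registered users


@dataclass
class Dataset:
    source_dag: str
    migrated_on: Optional[date]  # None until de-identified data are migrated


@dataclass
class User:
    dag: Optional[str]     # None for users outside any contributing site
    has_contributed: bool
    is_registered: bool


def add_months(d, months):
    """Shift a date by whole months, clamping to the last day of the month."""
    index = d.month - 1 + months
    year, month = d.year + index // 12, index % 12 + 1
    return date(year, month, min(d.day, calendar.monthrange(year, month)[1]))


def current_tier(dataset, today, embargo_months=12):
    """Tier of a data set on a given day (embargo length is an assumption)."""
    if dataset.migrated_on is None or today < dataset.migrated_on:
        return Tier.SITE_ONLY
    if today < add_months(dataset.migrated_on, embargo_months):
        return Tier.EMBARGOED_POOL
    return Tier.OPEN


def can_access(user, dataset, today, embargo_months=12):
    tier = current_tier(dataset, today, embargo_months)
    if tier is Tier.SITE_ONLY:
        return user.dag == dataset.source_dag
    if tier is Tier.EMBARGOED_POOL:
        return user.has_contributed
    return user.is_registered
```

For example, a registered user who has not contributed data would be refused access to a data set migrated eleven months earlier, but granted access once the assumed 12-month embargo has elapsed.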
Figure 1: MCC Patient Registry Data Access Model
To maximize insights from individual patient journeys, we created a comprehensive data collection effort. An early insight from the ongoing Pilot Study was the need to better align content delivery with resource allocation. To facilitate the timely capture and transfer of RWD that could contribute novel insights, we developed a phased data release strategy in which investigators initially focus on a subset of DCIs during each data release (Figure 2). This approach permits collection, validation and transfer of high-yield RWD within a concise and agreed-upon time frame.
Figure 2: MCC Patient Registry Phased Data Release Strategy
Multi-institutional, structured RWD is critical for improving the outcomes of patients with rare diseases. Here we present a roadmap for the creation and implementation of an MCC Registry to be made publicly available on the PDS platform. We began the process by identifying the goals and objectives of the Registry, which allowed us to select the electronic data capture system that best suited those needs. We leveraged multi-disciplinary expertise to develop consensus around data collection instruments across the MCC patient journey. In addition, we developed a data access plan and data release strategy to maximize participation and deliverables to the MCC community. Furthermore, we designed the MCC Patient Registry as a modular platform with the intention that it could be adapted and quickly customized by investigators of other rare diseases. Our hope is that this platform will not only help improve the lives of patients with MCC, but also potentially improve the lives of others suffering from rare tumors.