Article type: Application Note. The version is a pre-print. Please see the journal’s peer-reviewed version at https://academic.oup.com/jamiaopen/article/5/1/ooac013/6542419.
Authors: David M. Miller1* MD PhD, Sophia Z. Shalhout1 PhD
1Department of Medicine, Division of Hematology/Oncology and the Department of Dermatology, Massachusetts General Hospital, Boston, MA
*Corresponding author: David M. Miller MD PhD
Massachusetts General Hospital
Boston MA 02114
Funding sources: The Harvard Cancer Center Merkel Cell Carcinoma patient registry is supported by grants from Project Data Sphere, the American Skin Association and ECOG-ACRIN.
Conflicts of interest: None
Manuscript word count: 1910 [max 2000 excluding abstract/tabs/figs/res/suppl]
Abstract word count: 149 [max 150]
References: 22 [unlimited]
Figures: 8 [max 3]
Tables: 5 [max 2]
Keywords: data visualization, cancer, shiny app, REDCap
Abbreviations: CSV: comma-separated values, CTC: clinical tumor characteristics, GLP: geographical lesion profile, ICD-O-3: International Classification of Diseases for Oncology, 3rd edition, IDE: integrated development environment, MCC: Merkel cell carcinoma, RWD: real-world data, REDCap: Research Electronic Data Capture, SNOMED: Systematized Nomenclature of Medicine, UI: user interface.
Objectives: Structured real-world data, such as those found in cancer registries, provide a rich source of information regarding the natural history of cancer. Interactive data visualizations of cancer lesions can provide insights into certain clinical tumor characteristics. Software that can be integrated into an oncological data collection effort and generate anatomical data visualizations of clinical tumor characteristics is limited.
Materials and Methods: We created BodyMapR: an R package and Shiny application that generates anatomical visualizations of cancer lesions from structured data.
Results: BodyMapR is a Shiny application that transposes structured data from REDCap® onto an anatomical map to yield an interactive data visualization.
Conclusions: BodyMapR is freely available under the MIT license and can be obtained from GitHub. BodyMapR is executed in R and deployed as a Shiny application. It can be integrated into an existing cancer research platform and produces an interactive data visualization of clinical tumor characteristics.
Large-scale data collection efforts in rare cancers, such as Merkel cell carcinoma (MCC), are challenging and uncommon1. The dearth of structured data has limited our understanding of the natural history of rare cancers, such as MCC2,3. Consequently, we lack a comprehensive understanding of clinical tumor characteristics (CTC), such as patterns of metastatic spread and biomarkers predictive of treatment response, for most rare tumors. Data collection efforts that incorporate structured data captured during real-world practice (a.k.a. real-world data or RWD) can improve our understanding of CTC.
Depicting RWD, e.g. from a cancer registry, on graphical representations of anatomical structures can provide a user-friendly technique to process information regarding CTC. However, displaying large amounts of RWD on anatomical data visualizations is labor-intensive and time-consuming. While informatic packages that generate modular visualizations of anatograms and tissues are available4, software that fully integrates data collection instruments for real-time anatomical data visualizations of cancer registry data is lacking.
We previously published an overview of a methodology and design of a Research Electronic Data Capture (REDCap®)5-based system to facilitate capture of RWD6,1. That platform incorporates a form entitled the Lesion Information instrument, which provides a structured format for the collection of CTC7. This instrument is freely available and can be incorporated into any existing REDCap® project. It is currently being used by the Project Data Sphere-led Merkel Cell Carcinoma Patient Registry1.
Here we present BodyMapR, an R package with a Shiny application front-end, which generates an interactive data visualization of CTC. The software wrangles and transforms structured data from a REDCap® project and provides graphing functions (Figure 1). BodyMapR is executed in R but is deployed as a Shiny application to enhance the user interface for users with limited programming capabilities. In this manuscript we provide (1) instructions on how to obtain and execute BodyMapR, (2) the R code for the server-side functions to allow for project-specific adaptations, (3) a BioRender-generated PNG file onto which data are overlaid and displayed, and (4) a sample dataset for demonstration purposes.
Figure 1. Schema of BodyMapR. BodyMapR takes data from a REDCap® project that incorporates the Lesion Information and Genomics instruments. The exported CSV file is loaded into the Shiny application, and end users engage BodyMapR via a browser-based interface. Server-side R code executes the functions of BodyMapR to generate an interactive Plotly visualization of clinical tumor characteristic data displayed on an anatomical body map. Anatomical images created with BioRender.com.
BodyMapR is written in R (version 4.0.0), organized using roxygen28, and utilizes the following packages: dplyr9, tidyr10, readr11, stringr12, purrr13, magrittr14, plotly15, shinydashboard16 and Shiny17. For full details, instructions and examples, refer to the video demonstration or the README file, both of which can be viewed on the package GitHub page.
BodyMapR facilitates data visualizations from structured data contained in the Lesion Information instrument stored within a REDCap® project. The data dictionary for this form has been previously published18. BodyMapR also integrates clinico-genomic data from the Genomics data capture instrument, which has been previously described6,19 and is freely available on GitHub (https://github.com/TheMillerLab/genetex/blob/main/data-raw/genomics_data_dictionary.csv).
As depicted in Figure 1, BodyMapR takes data from a REDCap® project that has incorporated the Lesion Information and Genomics instruments as the input. The BodyMapR Shiny application is launched via the function launch_BodyMapR(). This function takes one argument, “Data”, a raw CSV file exported from REDCap®.
launch_BodyMapR() is the only function an end user needs to call to execute and utilize BodyMapR. Once launched, clinical researchers interface with BodyMapR in a web browser. The application’s browser-based user interface (UI) facilitates its use by investigators with limited programming skills.
launch_BodyMapR() has a built-in default data set “BodyMapR_mock_dataset”. If the argument “Data” is not specified by an end user, the default data set will be incorporated into the application for demonstration purposes. “BodyMapR_mock_dataset” is a synthetic data set and contains no protected health information.
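A minimal usage sketch of the two launch modes described above follows. The wrapper launch_body_map() and the file name redcap_export.csv are hypothetical and not part of BodyMapR. The sketch also assumes that “Data” accepts the raw export read into a data frame; the README should be consulted if the installed version expects something else.

```r
# Illustrative wrapper (not part of BodyMapR): launch the app with the
# built-in mock data set, or with a REDCap CSV export if a path is given.
launch_body_map <- function(csv_path = NULL) {
  if (!requireNamespace("BodyMapR", quietly = TRUE)) {
    message("BodyMapR is not installed; see the package GitHub page.")
    return(invisible(FALSE))
  }
  if (is.null(csv_path)) {
    # No Data argument: the default mock data set is used.
    BodyMapR::launch_BodyMapR()
  } else {
    # Assumption: Data takes the raw REDCap export as a data frame.
    redcap_export <- utils::read.csv(csv_path, stringsAsFactors = FALSE)
    BodyMapR::launch_BodyMapR(Data = redcap_export)
  }
}

# Example calls, intended for an interactive R session:
# launch_body_map()                     # demonstration with the mock data set
# launch_body_map("redcap_export.csv")  # hypothetical export file
```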
Figure 2: Browser-Based User Interface. Users control what input is displayed onto the BodyMapR anatomical graphic using the sidebar selectors. Anatomical images created with BioRender.com.
The Body Map includes a skeleton, the anterior and posterior likeness of an androgynous adult, and representations of visceral and lymphatic structures. We designed the Body Map using images from BioRender.com20. Users control what information is displayed on the Body Map via the application’s UI sidebar. Given that an improved understanding of the geographical lesion profile (GLP) of a cancer type may provide insight into patterns of spread, the default settings of BodyMapR display the GLP of the entire cohort, color-coded by tumor morphology (e.g. primary vs. metastasis vs. recurrence). In contrast, a personalized Body Map at the single-subject level can be obtained by selecting a Record ID from the selectizeInput() selector “Filter on Record ID” in the application’s sidebar (Figure 3).
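The cohort-level default and the single-subject view can be illustrated with a short base R sketch. The data frame, its column names, the color palette and the helper filter_by_record() are assumptions made for illustration; they are not the REDCap® field names or the functions used inside BodyMapR.

```r
# Synthetic lesion table; column names are illustrative assumptions.
lesions <- data.frame(
  record_id  = c(1, 1, 2, 3, 3),
  morphology = c("Primary", "Metastasis", "Primary", "Primary", "Recurrence"),
  x = c(25, 30, 22, 70, 75),  # body map coordinates on a 0-100 canvas
  y = c(80, 60, 40, 85, 55),
  stringsAsFactors = FALSE
)

# Return the whole cohort when no record is selected, otherwise the
# lesions of the selected subject(s).
filter_by_record <- function(data, record_id = NULL) {
  if (is.null(record_id) || length(record_id) == 0) {
    return(data)
  }
  data[data$record_id %in% record_id, , drop = FALSE]
}

# One color per morphology category (arbitrary palette).
morphology_colors <- c(
  Primary    = "#1b9e77",
  Metastasis = "#d95f02",
  Recurrence = "#7570b3"
)

cohort  <- filter_by_record(lesions)                 # default: entire cohort
subject <- filter_by_record(lesions, record_id = 1)  # single-subject view

# Attach the marker color that a plotting layer would use.
subject$color <- unname(morphology_colors[subject$morphology])
```

In a Shiny server function, the value of the “Filter on Record ID” input would take the place of the hard-coded record_id = 1.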
Figure 3: Data visualization of clinical tumor characteristics. Depicted is the output of BodyMapR following selection of a specific subject, “Record ID 1”. Anatomical images created with BioRender.com.
Although cancer types have been historically grouped based on the tissue of origin (e.g. “lung cancer” or “pancreatic cancer”), neoplasms originating from the same tissue can have clinically relevant distinctions in pathogenesis. For example, in Merkel cell carcinoma, at least two distinct transforming mechanisms (the Merkel cell polyomavirus and ultraviolet radiation), with distinct underlying mutational landscapes, have been described2. Data demonstrating geographical differences in virus-positive vs. virus-negative MCC have emerged21. Thus, further definition of the relationship between topography and mutational landscape may lead to insights into pathogenesis across cancer types. Therefore, BodyMapR incorporates the Genomics instrument in order to display over 900 genes found in common clinico-genomics platforms6. Users can select which genes are visualized on the Body Map using the “Filter on Gene Mutations” selector (Figure 2).
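The gene-based selection can be sketched in the same spirit. The tables, column names and the helper filter_by_genes() below are hypothetical. The rule that a subject is kept when any selected gene is mutated is an assumption, not a documented behavior of BodyMapR.

```r
# Synthetic data; field names are illustrative assumptions.
lesions <- data.frame(
  record_id = c(1, 2, 3, 4),
  site      = c("Scalp", "Left forearm", "Right cheek", "Left thigh"),
  stringsAsFactors = FALSE
)

# One row per subject-gene pair (long format).
mutations <- data.frame(
  record_id = c(1, 1, 2, 4),
  gene      = c("TP53", "RB1", "TP53", "PIK3CA"),
  stringsAsFactors = FALSE
)

# Keep lesions from subjects carrying a mutation in ANY selected gene;
# an empty selection leaves the cohort unfiltered.
filter_by_genes <- function(lesions, mutations, genes = character(0)) {
  if (length(genes) == 0) {
    return(lesions)
  }
  carriers <- unique(mutations$record_id[mutations$gene %in% genes])
  lesions[lesions$record_id %in% carriers, , drop = FALSE]
}

tp53_lesions <- filter_by_genes(lesions, mutations, genes = "TP53")
all_lesions  <- filter_by_genes(lesions, mutations)
```

Requiring all selected genes to be mutated in the same subject would be an equally plausible design; that variant would replace the %in% test with a per-subject count of matching genes.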
Behind launch_BodyMapR(), BodyMapR provides a set of functions that wrangle, transform and graph CTC data from a REDCap® project (Figure 1). Table 1 summarizes the package’s functions and their respective actions.
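library(ggplot2)  # ggplot(), annotation_raster(), geom_point(), theme()
library(plotly)   # ggplotly()

# Body map image object from the BodyMapR package
body_map.png <- BodyMapR::BodyMapR_biorender.png

# Every integer coordinate on a 100 x 100 grid (10,000 points)
grid.2 <- data.frame(
  x = rep(seq(from = 1, to = 100, by = 1), each = 100),
  y = rep(seq(from = 1, to = 100, by = 1), times = 100)
)

# Empty canvas spanning 0-100 on both axes, with all axis decoration removed
df.grid <- data.frame()
grid.plotly <- ggplot(df.grid) +
  xlim(0, 100) +
  ylim(0, 100) +
  theme_bw() +
  theme(panel.border = element_blank(),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        axis.line = element_blank()) +
  theme(axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks = element_blank(),
        axis.title = element_blank()) +
  # Draw the body map as the background, then overlay the grid points
  annotation_raster(body_map.png, ymin = 0, xmin = 0, xmax = 100, ymax = 100) +
  geom_point(data = grid.2, aes(x = x, y = y))

# Interactive version: hovering over a point reveals its x and y coordinates
ggplotly(grid.plotly)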
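One plausible use of such a reference grid is to read off the coordinates of each anatomical site and store them in a lookup table that is then joined to the lesion data. The sketch below illustrates that idea only. The site names, coordinates and column names are hypothetical, and this section does not state how BodyMapR itself assigns coordinates.

```r
# Hypothetical lookup table: coordinates read off the 100 x 100 reference grid.
site_coordinates <- data.frame(
  site = c("Scalp", "Left forearm", "Right thigh"),
  x    = c(25, 38, 21),
  y    = c(96, 55, 35),
  stringsAsFactors = FALSE
)

# Synthetic lesion records; "Liver" has no entry in the lookup table.
lesions <- data.frame(
  record_id = c(1, 2, 2),
  site      = c("Scalp", "Right thigh", "Liver"),
  stringsAsFactors = FALSE
)

# Left join: every lesion is kept, and sites without coordinates get NA.
mapped <- merge(lesions, site_coordinates, by = "site", all.x = TRUE)

# Lesions that cannot yet be drawn on the body map.
unmapped <- mapped[is.na(mapped$x) | is.na(mapped$y), , drop = FALSE]
```

Listing the unmapped rows makes missing lookup entries visible instead of silently dropping those lesions from the plot.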