StoryboardR - An R Package and Shiny Application Designed to Visualize Real-World Data From Clinical Patient Registries

2023
Clinical Informatics
R
REDCap
Author

David M. Miller, Sophia Z. Shalhout

Published

January 5, 2023

Doi
Abstract
We created StoryboardR, an R package and Shiny application that facilitates the visualization of real-world data from tumor registries captured in REDCap®.
Article and Authorship Details

Article type: Application Note. This is the preprint. This article was accepted after peer review on December 8th 2022 by JAMIA Open. Please see the article on the journal’s website: https://academic.oup.com/jamiaopen/article/6/1/ooac109/6972620?searchresult=1.

Authors: David M. Miller1* MD PhD, Sophia Z. Shalhout1 PhD

1Department of Medicine, Division of Hematology/Oncology and the Department of Dermatology, Massachusetts General Hospital, Boston, MA

*Corresponding author: David M. Miller MD PhD
Massachusetts General Hospital
Email:

Funding sources: This work was supported by a grant from Project Data Sphere.

Conflicts of interest: None

Manuscript word count: 1869
Abstract word count: 128
References: 19
Figures: 9
Tables: 4

Keywords: data visualization, patient registries, shiny app, REDCap, clinical informatics

Abbreviations: AWD: alive with disease, CSV: comma-separated values, CTCAE: common terminology criteria for adverse events, DCIs: data collection instruments, EDA: exploratory data analysis, EDC: electronic data capture, NED: no evidence of disease, PCT: primary cutaneous tumor, PDS: project data sphere, REDCap: research electronic data capture, RWD: real-world data, UI: user interface.

ABSTRACT

Objectives: Tumor registries are a rich source of real-world data that can be used to test important hypotheses that inform clinical care. Exploratory data analysis at the level of individual subjects, when enhanced by interactive data visualizations, has the potential to provide novel insights and generate new hypotheses.
Materials and Methods: We created StoryboardR: An R package and Shiny application designed to visualize real-world data from tumor registries.
Results: StoryboardR facilitates the data visualization of real-world data from tumor registries captured in REDCap®.
Conclusions: StoryboardR is freely available under the Massachusetts Institute of Technology license and can be obtained from GitHub. StoryboardR is executed in R and deployed as a Shiny application for non-R users. It produces data visualizations of patient journeys from tumor registries.

Objectives

Tumor registries are a rich source of patient-level data that can lead to important clinical insights. When optimally executed, tumor registries capture highly structured real-world data (RWD), which shortens time-to-analysis and time-to-insight. There is significant variability in the scope and depth of data captured within various tumor registries, depending on which elements of the patient journey are targeted for capture. For example, the Project Data Sphere-led Merkel Cell Carcinoma Patient Registry1,2 currently captures over two thousand data elements across 11 forms (Table 1).

Table 1: StoryboardR Data Collection Instruments (DCIs)
Form | Description of Data Captured
Subject Status | Vital status, condition trend, performance status
Patient Characteristics | Details of initial diagnosis, demographic data, family history, cancer history, pregnancy history, immunosuppression history
Presentation and Initial Staging | Physical exam findings of initial tumor, radiographic staging at presentation, pathological evaluation at presentation, AJCC staging
Lesion Information | Date lesion detected, clinical description of lesion, photographic evidence of lesion, response to treatment
Pathology | Tissue specimen details, immunohistochemical details, histological features, gross pathology details
Surgery | Surgical details (e.g., surgical margins), surgical outcomes (e.g., R0, R1, R2)
Radiotherapy | Radiotherapy details (e.g., dose, fractions)
Systemic Antineoplastic Therapy | Agent, dose amount, doses, treatment outcomes (e.g., best overall response)
Adverse Events | CTCAE System organ class, therapeutic attribution, actions taken for AE, outcome of AE
Lab Results | Complete blood count, comprehensive metabolic panel, immunological/inflammatory markers
Genomics | Lesion assessed, platform used, tumor mutation burden, mismatch repair status, gene name, nucleotide and amino acid variation, copy number variants
Table 1: StoryboardR Data Collection Instruments (DCIs). The above DCIs are contained in the Merkel Cell Carcinoma Patient Registry and have been incorporated in the StoryboardR package.

While tumor registries can provide large data sets to test important hypotheses, exploratory data analysis (EDA) at the level of individual subjects can lead to novel insights and hypothesis generation. Visualizing patient-level data is a critical part of EDA. Good data visualizations can facilitate the digestion of complex information. Ideal data visualizations leverage superior data, function and design, and are thus simple to generate, easy to understand, informative, and visually appealing.

Here we present StoryboardR, an R package with a Shiny application front-end, which facilitates the visualization of real-world data from clinical registries collected in a Research Electronic Data Capture (REDCap®)-based project. REDCap® is a web-based electronic data capture (EDC) system utilized by investigators to capture structured data3. The functions of StoryboardR wrangle and transform data from REDCap®-based tumor registries to produce an interactive data visualization of the patient journey (Figure 1). StoryboardR is executed in R; however, the application is deployed via Shiny to enhance the user interface for non-R users.

In this manuscript we provide: (1) the data dictionary to allow users to adopt the MCC Patient Registry platform; (2) the StoryboardR R-package with installation instructions and examples which can be viewed on the package GitHub page (http://github.com/TheMillerLab/StoryboardR); and (3) a sample data set for demonstration purposes, which is embedded in the R-package. Importantly, these resources may be adopted by other clinical investigators to facilitate development of a variety of disease-specific registries.

Figure 1. Schema of StoryboardR. The StoryboardR package takes a CSV file of clinical registry data stored in a REDCap® project as input. End users then select a subject to generate an interactive data visualization. Once a subject is selected, the package executes a series of server-side functions that wrangle and transform the data and generate an interactive data visualization of the patient journey.

Methods

Software Dependencies

StoryboardR is written in R (version 4.0.0), organized using roxygen24, and utilizes the following packages: dplyr5, tidyr6, readr7, stringr8, TimeWarp9, magrittr10, plotly11, splitstackshape12, shinydashboard13, and shiny14. For full details, instructions and examples, refer to the video demonstration (https://github.com/TheMillerLab/StoryboardR/blob/main/Video_Demo.md) or the README file (https://github.com/TheMillerLab/StoryboardR/blob/main/README.md), both of which can be viewed on the package GitHub page.

Clinical Informatics Dependencies

StoryboardR facilitates data visualizations of patient data from the Merkel Cell Carcinoma Tumor Registry electronic data capture system, a REDCap®-based EDC. The data dictionary for this platform is available on the package GitHub page (https://github.com/TheMillerLab/StoryboardR/blob/main/data-raw/StoryboardR_DataDictionary.csv). While this platform is currently being used by the Merkel Cell Carcinoma Tumor Registry, the fields are generalizable to most solid tumors. Potential customizations of the platform are described below.

Results

StoryboardR Inputs

As shown in Figure 1, StoryboardR takes data from a REDCap® project that has incorporated the instruments from Table 1. The StoryboardR Shiny application is launched via the function launch_StoryboardR(). This function takes two arguments: “Data” and “DateShift”. The “Data” argument is a data frame that contains the raw data from the desired REDCap® project. “DateShift”, which defaults to FALSE, generates a random and uniform shift of all the dates in the data frame when set to TRUE (this is described in more detail below). launch_StoryboardR() is the only function required to execute and utilize StoryboardR. Once launch_StoryboardR() is called, end users interface with StoryboardR in a web browser.
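As an illustration of the call described above, the following is a minimal sketch rather than the package's documented workflow. The helpers read_registry_export() and launch_if_available() and the file name in the usage comment are hypothetical; only launch_StoryboardR() and its “Data” and “DateShift” arguments come from the text, and the sketch assumes the function is exported by the package under that name.

```r
# Hypothetical helper (not part of StoryboardR): read a REDCap CSV export
# into a plain data frame, keeping the column names exactly as exported.
read_registry_export <- function(path) {
  stopifnot(file.exists(path))
  utils::read.csv(path, stringsAsFactors = FALSE, check.names = FALSE)
}

# Hypothetical wrapper: launch the app only in an interactive session
# where the StoryboardR package is installed.
launch_if_available <- function(data, date_shift = FALSE) {
  can_launch <- interactive() &&
    requireNamespace("StoryboardR", quietly = TRUE)
  if (can_launch) {
    # Argument names "Data" and "DateShift" follow the description in the text.
    StoryboardR::launch_StoryboardR(Data = data, DateShift = date_shift)
  }
  invisible(can_launch)
}

# Usage (file name is a placeholder):
# registry <- read_registry_export("registry_export.csv")
# launch_if_available(registry, date_shift = TRUE)
```

Guarding the launch with interactive() keeps the sketch safe to source in scripts, since a Shiny application needs a live session and a browser.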

StoryboardR UI

The Shiny application, displayed in a web browser, incorporates a streamlined user interface (UI) with one user input - the subject’s “Record ID” - to maximize usability (Figure 2). The Shiny UI utilizes functions from the shinydashboard package to generate a fully customizable, centralized, easy-to-view dashboard of high-yield clinical information. Patient Characteristics and Initial Staging, Burden of Disease, Genomic Analysis and Therapeutic Interventions are pulled directly from the tumor registry and displayed for the viewer.

Figure 2. User Interface of StoryboardR. Depicted is the UI of StoryboardR. End users select a subject’s Record ID from the list auto-populated from the input data frame. Once selected, the server-side applications of StoryboardR will generate both a Subject Dashboard and a Storyboard. To view the Storyboard, users click on “Storyboard” in the left column sidebar.

StoryboardR Server Side Functions

The Shiny application’s server side contains the executable code of StoryboardR. The package contains, in addition to launch_StoryboardR(), a set of functions that wrangle data from the tumor registry’s structured forms, transform that information into data frames with key information from the patient journey, and graph the output as an interactive storyboard (Figure 1). Table 2 summarizes core functions and their respective actions.
Table 2: Functions utilized by StoryboardR


Function | Action
diagnosis() | wrangles data from a tumor registry regarding date of initial histological confirmation of the diagnosis, which can then be incorporated into a Patient Storyboard
ss() | wrangles data from the Subject Status DCI to produce a data frame of details about the Subject Status of subjects
clinical_staging() | wrangles data from the Presentation and Initial Staging DCI to produce a data frame of details about the initial clinical staging, which can then be incorporated into a Patient Storyboard
pathological_staging() | wrangles data from the Presentation and Initial Staging DCI to produce a data frame of details about the initial pathological staging, which can then be incorporated into a Patient Storyboard
lesion() | wrangles data from the Lesion DCI to produce a data frame of details about the individual tumors, which can then be incorporated into a Patient Storyboard
surgery() | wrangles data from the Surgery DCI to produce a data frame of details about surgical therapy, which can then be incorporated into a Patient Storyboard
xrt() | wrangles data from the Radiotherapy DCI to produce a data frame of details about radiation therapy, which can then be incorporated into a Patient Storyboard
systemic_therapy() | wrangles data from the Systemic Antineoplastic Therapy DCI to produce a data frame of details about systemic therapy, which can then be incorporated into a Patient Storyboard
genomics() | wrangles data from the Genomics DCI to produce a data frame of details about genomic data from tumors or blood, which can then be incorporated into a Patient Storyboard
adverse_events() | wrangles data from the Adverse Events DCI to produce a data frame of details about adverse events of systemic therapy, which can then be incorporated into a Patient Storyboard
combine_storyboard_dfs() | integrates the various storyboard data frames across the patient journey into one final data frame
storyboard_plot() | takes the aggregated data frame from combine_storyboard_dfs() to produce a plotly data visualization of a patient journey
date.shift.df() | shifts all dates by a single random number of weeks, between 1 and 52, either forward or backward
launch_StoryboardR() | launches the StoryboardR Shiny application
Table 2: Functions Utilized by StoryboardR. Abbreviations, DCI: data collection instrument.
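The date shift listed for date.shift.df() can be illustrated with a short base R sketch. The function shift_dates_sketch() and its arguments are hypothetical, and this is not the package's implementation; it only assumes, as described in Table 2, that every date is moved by the same random number of weeks (1 to 52) in a single direction, so that the intervals between events are preserved.

```r
# Hypothetical sketch of a uniform random date shift (not the package code).
# df:        a data frame
# date_cols: names of the columns holding dates
# seed:      optional seed, used here only to make the example reproducible
shift_dates_sketch <- function(df, date_cols, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  weeks <- sample(1:52, 1)           # one shift size for the whole data frame
  direction <- sample(c(-1, 1), 1)   # backward or forward
  offset_days <- direction * weeks * 7
  for (col in date_cols) {
    df[[col]] <- as.Date(df[[col]]) + offset_days
  }
  df
}

# Example with two dates taken from the simulated sample data
events <- data.frame(
  record_id = c("Simulated Patient 1", "Simulated Patient 1"),
  date = as.Date(c("2017-04-13", "2017-10-03")),
  stringsAsFactors = FALSE
)
shifted <- shift_dates_sketch(events, "date", seed = 1)

# The gap between the two events is unchanged by the shift
diff(shifted$date) == diff(events$date)
```

Because a single offset is applied to every date, the relative timing of events along the patient journey is unaffected, which is what makes the shifted data still useful for visualization.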

Multi-level Functional Processing
In an effort to effectively and efficiently capture clinical data, tumor registries routinely make use of distinct case report forms (CRFs) or data collection instruments (DCIs) for particular elements of a subject’s clinical course. For example, information related to genomic analysis may be captured in a DCI separate from data regarding therapeutic interventions. Furthermore, data regarding multiple therapeutic interventions - e.g., first-line chemotherapy vs. second-line and beyond - may be captured in repeating DCIs. To generate an integrated data visualization of the patient journey, StoryboardR uses a schema that we refer to as “multi-level functional processing” (Figure 3).

Figure 3. Multi-level Functional Processing Approach of StoryboardR. StoryboardR utilizes a series of multi-level functions to wrangle and transform data from individual data collection instruments in order to construct patient-level data visualizations. At the base, the DCI-level functions (ss(), clinical_staging(), pathological_staging(), lesion(), surgery(), systemic_therapy(), adverse_events(), xrt() and genomics()) select fields from their respective case report forms and re-map them to a five-vector data frame containing the variables “record_id”, “description”, “value”, “date”, and “hover”. The mid-level function combine_storyboard_dfs() combines the output of each of the base-level functions into one principal data frame using the function rbind(). The data frame that results from combine_storyboard_dfs() can then be filtered by a subject Record ID to produce a patient-specific interactive data visualization by calling the function storyboard_plot().

At the base of this workflow, individual DCI-level functions (e.g., ss(), surgery(), genomics()) select fields from their respective case report forms (e.g., Subject Status, Surgery, Genomics) and re-map them to a five-vector data frame containing the variables “record_id”, “description”, “value”, “date”, and “hover”. The vector “record_id” holds the individual subject identifier. The vector “description” designates the function from which the data was processed (e.g., genomics). The vector “value” holds the transformed data from the tumor registry that will be displayed on the patient storyboard, for example the name of the therapeutic agent captured (e.g., “Avelumab”) or the type of lesion detected (e.g., “Metastasis”). The vector “date” holds the date associated with each event. The vector “hover” contains additional data from the registry that will be displayed on the storyboard as hover text (e.g., the number of Gray of radiation used for treatment). An example of the output of one of the base-level functions, lesion(), when applied to the sample data set embedded in the StoryboardR package, is shown in Table 3.
Table 3: Output of the Function lesion()
record_id description value date hover
Simulated Patient 2 lesion PCT 2016-09-11 <b>Lesion:</b> Right Buttock Primary<br><b>Date of First Detection:</b> 2016-09-11<br><b>Histologically Confirmed:</b> Yes
Simulated Patient 1 lesion PCT 2017-04-13 <b>Lesion:</b> Left Ankle Skin Primary<br><b>Date of First Detection:</b> 2017-04-13<br><b>Histologically Confirmed:</b> Yes
Simulated Patient 1 lesion Metastasis 2017-10-03 <b>Lesion:</b> Left Calf Shin In-Transit Metastasis<br><b>Date of First Detection:</b> 2017-10-03<br><b>Histologically Confirmed:</b> Yes
Simulated Patient 3 lesion PCT 2017-12-18 <b>Lesion:</b> Left Antecubital Fossa Skin Primary<br><b>Date of First Detection:</b> 2017-12-18<br><b>Histologically Confirmed:</b> Yes
Simulated Patient 3 lesion Metastasis 2017-12-30 <b>Lesion:</b> Left Axillary Lymph Node Metastasis<br><b>Date of First Detection:</b> 2017-12-30<br><b>Histologically Confirmed:</b> Yes
Simulated Patient 1 lesion Metastasis 2018-06-04 <b>Lesion:</b> Left Groin Lymph Node Metastasis<br><b>Date of First Detection:</b> 2018-06-04<br><b>Histologically Confirmed:</b> Yes
Simulated Patient 4 lesion PCT 2018-10-05 <b>Lesion:</b> Right Eyebrow Primary Lesion<br><b>Date of First Detection:</b> 2018-10-05<br><b>Histologically Confirmed:</b> Yes
Simulated Patient 4 lesion Metastasis 2018-11-11 <b>Lesion:</b> Right Parotid Lymph Node Metastases<br><b>Date of First Detection:</b> 2018-11-11<br><b>Histologically Confirmed:</b> Yes
Simulated Patient 5 lesion PCT 2018-12-12 <b>Lesion:</b> Left Leg Primary Lesion<br><b>Date of First Detection:</b> 2018-12-12<br><b>Histologically Confirmed:</b> Yes
Simulated Patient 5 lesion Metastasis 2018-12-15 <b>Lesion:</b> Left Inguinal Lymph Node Metastasis<br><b>Date of First Detection:</b> 2018-12-15<br><b>Histologically Confirmed:</b> Yes
Simulated Patient 5 lesion Metastasis 2019-02-22 <b>Lesion:</b> Left External Iliac Lymph Node Metastasis<br><b>Date of First Detection:</b> 2019-02-22<br><b>Histologically Confirmed:</b> No
Simulated Patient 5 lesion Metastasis 2019-03-24 <b>Lesion:</b> Left Axillary Lymph Node Metastasis<br><b>Date of First Detection:</b> 2019-03-24<br><b>Histologically Confirmed:</b> Yes
Simulated Patient 5 lesion Metastasis 2019-06-04 <b>Lesion:</b> Right Axillary Lymph Node Metastasis<br><b>Date of First Detection:</b> 2019-06-04<br><b>Histologically Confirmed:</b> No
Table 3: Output of the function lesion(). Depicted is the output of the base-level function lesion() after it is applied to the sample data set embedded in the StoryboardR package. Calling lesion() produces a data frame with five vectors: record_id, description, value, date and hover. The data frame can then be combined with the output of other base-level functions using the mid-level function combine_storyboard_dfs().
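To make the re-mapping step concrete, the sketch below shows how a base-level function might turn raw lesion fields into the five-vector structure seen in Table 3. It is an illustration only: lesion_sketch() and the raw field names (lesion_name, lesion_type, detected_on, confirmed) are hypothetical and do not reflect the registry's data dictionary or the package's lesion() source code.

```r
# Hypothetical sketch of a base-level function (not the package's lesion()).
# raw: a data frame with assumed columns record_id, lesion_name,
#      lesion_type, detected_on (YYYY-MM-DD text) and confirmed.
lesion_sketch <- function(raw) {
  data.frame(
    record_id = raw$record_id,
    description = rep("lesion", nrow(raw)),
    value = raw$lesion_type,
    date = as.Date(raw$detected_on),
    # Hover text uses simple HTML tags, mirroring the format in Table 3
    hover = paste0(
      "<b>Lesion:</b> ", raw$lesion_name,
      "<br><b>Date of First Detection:</b> ", raw$detected_on,
      "<br><b>Histologically Confirmed:</b> ", raw$confirmed
    ),
    stringsAsFactors = FALSE
  )
}

# One simulated record, using values that appear in Table 3
raw_lesions <- data.frame(
  record_id = "Simulated Patient 2",
  lesion_name = "Right Buttock Primary",
  lesion_type = "PCT",
  detected_on = "2016-09-11",
  confirmed = "Yes",
  stringsAsFactors = FALSE
)
lesion_rows <- lesion_sketch(raw_lesions)
names(lesion_rows)
```

Keeping the display label (“value”) separate from the detail string (“hover”) lets the plot show a short marker label while the fuller registry detail appears only on mouse-over.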

Each base-level function returns the same structured data frame, with the five vectors “record_id”, “description”, “value”, “date”, and “hover”. This uniformity allows the outputs to be aggregated by the mid-level function, combine_storyboard_dfs() (Table 4).
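The aggregation step can be sketched in base R as follows. The functions combine_sketch() and subject_rows() are hypothetical stand-ins, not the package's combine_storyboard_dfs(); the use of rbind() follows the Figure 3 caption, while ordering the result by date is an assumption made here for readability.

```r
# Hypothetical sketch of the mid-level step (not the package code):
# stack any number of five-vector data frames and order them by date.
combine_sketch <- function(...) {
  combined <- do.call(rbind, list(...))
  combined <- combined[order(combined$date), ]
  rownames(combined) <- NULL
  combined
}

# Hypothetical helper: keep the rows for one subject's Record ID.
subject_rows <- function(combined, id) {
  combined[combined$record_id == id, , drop = FALSE]
}

# Two small base-level outputs with identical columns
dx <- data.frame(
  record_id = c("Simulated Patient 2", "Simulated Patient 1"),
  description = "dx",
  value = "Initial Histological Diagnosis",
  date = as.Date(c("2016-09-11", "2017-05-09")),
  hover = "<b>Initial Histological Diagnosis</b>",
  stringsAsFactors = FALSE
)
lesions <- data.frame(
  record_id = c("Simulated Patient 2", "Simulated Patient 1"),
  description = "lesion",
  value = "PCT",
  date = as.Date(c("2016-09-11", "2017-04-13")),
  hover = "<b>Lesion:</b> primary",
  stringsAsFactors = FALSE
)

storyboard_df <- combine_sketch(dx, lesions)
subject_rows(storyboard_df, "Simulated Patient 1")
```

Because every input shares the same five columns, rbind() needs no column matching or type coercion, which is the practical benefit of the uniform structure.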
Table 4: Output of the Function combine_storyboard_dfs()
record_id description value date hover
Simulated Patient 2 dx Initial Histological Diagnosis 2016-09-11 <b>Initial Histological Diagnosis</b><br><b>Date:</b> 2016-09-11
Simulated Patient 2 ss AWD 2016-09-11 <b>Subject Status:</b> Alive With Disease<br><b>Clinical Trend:</b> Indeterminate<br><b>Trend Details:</b> Not Reported<br><b>Performance Status:</b> ECOG Grade 1: Symptomatic but completely ambulatory<br><b>Date:</b> 2016-09-11
Simulated Patient 2 lesion PCT 2016-09-11 <b>Lesion:</b> Right Buttock Primary<br><b>Date of First Detection:</b> 2016-09-11<br><b>Histologically Confirmed:</b> Yes
Simulated Patient 2 cStage Initial Clinical Stage 2016-09-17 <b>Initial Clinical Stage:</b> IIA<br><b>Date:</b> 2016-09-17
Simulated Patient 2 ss NED 2016-09-24 <b>Subject Status:</b> No Evidence of Disease<br><b>Clinical Trend:</b> Stable<br><b>Trend Details:</b> Not Reported<br><b>Performance Status:</b> ECOG Grade 0: Asymptomatic<br><b>Date:</b> 2016-09-24
Simulated Patient 2 pStage Initial Pathological Stage 2016-09-24 <b>Initial Pathological Stage:</b> IIA<br><b>Date:</b> 2016-09-24
Simulated Patient 2 surgery Surgery 2016-09-24 <b>Lesion Surgerized:</b> Right Buttock Primary<br><b>Type of Sugery:</b> Excision<br><b>Surgical Margins:</b> 1 cm<br><b>Surgical Outcome:</b> RO<br><b>Date:</b> 2016-09-24
Simulated Patient 2 genomics Genomics 2016-09-24 <b>Lesion Assessed:</b> Right Buttock Primary<br><b>Platform:</b> MGH SNaPshot<br><b>Genetic Alterations:</b> None Detected<br><b>Date:</b> 2016-09-24
Simulated Patient 2 radiotherapy Initiating<br>Adjuvant Radiotherapy 2016-11-21 <b>Setting of Radiotherapy</b> Adjuvant Radiotherapy<br><b>Target of Radiotherapy:</b> Right Buttock Primary<br><b>Start Date:</b> 2016-11-21
Simulated Patient 2 radiotherapy Completed<br>Adjuvant Radiotherapy 2016-12-24 <b>Setting of Radiotherapy</b> Adjuvant Radiotherapy<br><b>Target of Radiotherapy:</b> Right Buttock Primary<br><b>Number of Fractions:</b> 20<br><b>Total Dose Delivered:</b> 4500 cGy<br><b>End Date:</b> 2016-12-24
Simulated Patient 1 lesion PCT 2017-04-13 <b>Lesion:</b> Left Ankle Skin Primary<br><b>Date of First Detection:</b> 2017-04-13<br><b>Histologically Confirmed:</b> Yes
Simulated Patient 1 dx Initial Histological Diagnosis 2017-05-09 <b>Initial Histological Diagnosis</b><br><b>Date:</b> 2017-05-09
Simulated Patient 1 ss AWD 2017-06-08 <b>Subject Status:</b> Alive With Disease<br><b>Clinical Trend:</b> Indeterminate<br><b>Trend Details:</b> Not Reported<br><b>Performance Status:</b> ECOG Grade 0: Asymptomatic<br><b>Date:</b> 2017-06-08
Simulated Patient 1 pStage Initial Pathological Stage 2017-06-08 <b>Initial Pathological Stage:</b> I<br><b>Date:</b> 2017-06-08
Simulated Patient 1 genomics Genomics 2017-06-08 <b>Lesion Assessed:</b> Left Ankle Skin Primary<br><b>Platform:</b> MGH SNaPshot<br><b>Genetic Alterations:</b> TP53<br><b>Date:</b> 2017-06-08
Simulated Patient 1 surgery Surgery 2017-07-10 <b>Lesion Surgerized:</b> Left Ankle Skin Primary<br><b>Type of Sugery:</b> Excision<br><b>Surgical Margins:</b> 1 cm<br><b>Surgical Outcome:</b> RO<br><b>Date:</b> 2017-07-10
Simulated Patient 1 ss NED 2017-07-16 <b>Subject Status:</b> No Evidence of Disease<br><b>Clinical Trend:</b> Improving<br><b>Trend Details:</b> Not Reported<br><b>Performance Status:</b> ECOG Grade 0: Asymptomatic<br><b>Date:</b> 2017-07-16
Simulated Patient 1 cStage Initial Clinical Stage 2017-08-21 <b>Initial Clinical Stage:</b> I<br><b>Date:</b> 2017-08-21
Simulated Patient 1 ss AWD 2017-10-03 <b>Subject Status:</b> Alive With Disease<br><b>Clinical Trend:</b> Worsening<br><b>Trend Details:</b> Not Reported<br><b>Performance Status:</b> ECOG Grade 1: Symptomatic but completely ambulatory<br><b>Date:</b> 2017-10-03
Simulated Patient 1 lesion Metastasis 2017-10-03 <b>Lesion:</b> Left Calf Shin In-Transit Metastasis<br><b>Date of First Detection:</b> 2017-10-03<br><b>Histologically Confirmed:</b> Yes
Simulated Patient 1 ss AWD 2017-10-18 <b>Subject Status:</b> Alive With Disease<br><b>Clinical Trend:</b> Stable<br><b>Trend Details:</b> Not Reported<br><b>Performance Status:</b> ECOG Grade 0: Asymptomatic<br><b>Date:</b> 2017-10-18
Simulated Patient 1 radiotherapy Initiating<br>Definitive Radiotherapy 2017-11-02 <b>Setting of Radiotherapy</b> Definitive Radiotherapy<br><b>Target of Radiotherapy:</b> Left Calf Shin In-Transit Metastasis<br><b>Start Date:</b> 2017-11-02
Simulated Patient 1 radiotherapy Completed<br>Definitive Radiotherapy 2017-12-09 <b>Setting of Radiotherapy</b> Definitive Radiotherapy<br><b>Target of Radiotherapy:</b> Left Calf Shin In-Transit Metastasis<br><b>Number of Fractions:</b> 30<br><b>Total Dose Delivered:</b> 6000 cGy<br><b>End Date:</b> 2017-12-09
Simulated Patient 3 dx Initial Histological Diagnosis 2017-12-18 <b>Initial Histological Diagnosis</b><br><b>Date:</b> 2017-12-18
Simulated Patient 3 ss AWD 2017-12-18 <b>Subject Status:</b> Alive With Disease<br><b>Clinical Trend:</b> Indeterminate<br><b>Trend Details:</b> Not Reported<br><b>Performance Status:</b> Not Reported<br><b>Date:</b> 2017-12-18
Simulated Patient 3 lesion PCT 2017-12-18 <b>Lesion:</b> Left Antecubital Fossa Skin Primary<br><b>Date of First Detection:</b> 2017-12-18<br><b>Histologically Confirmed:</b> Yes
Simulated Patient 3 cStage Initial Clinical Stage 2017-12-30 <b>Initial Clinical Stage:</b> III<br><b>Date:</b> 2017-12-30
Simulated Patient 3 lesion Metastasis 2017-12-30 <b>Lesion:</b> Left Axillary Lymph Node Metastasis<br><b>Date of First Detection:</b> 2017-12-30<br><b>Histologically Confirmed:</b> Yes
Simulated Patient 3 ss AWD 2018-01-08 <b>Subject Status:</b> Alive With Disease<br><b>Clinical Trend:</b> Worsening<br><b>Trend Details:</b> Not Reported<br><b>Performance Status:</b> Not Reported<br><b>Date:</b> 2018-01-08
Simulated Patient 3 ss NED 2018-01-21 <b>Subject Status:</b> No Evidence of Disease<br><b>Clinical Trend:</b> Improving<br><b>Trend Details:</b> Not Reported<br><b>Performance Status:</b> Not Reported<br><b>Date:</b> 2018-01-21
Simulated Patient 3 pStage Initial Pathological Stage 2018-01-21 <b>Initial Pathological Stage:</b> IIIB<br><b>Date:</b> 2018-01-21
Simulated Patient 3 surgery Surgery 2018-01-21 <b>Lesion Surgerized:</b> Left Antecubital Fossa Skin Primary<br><b>Type of Sugery:</b> Excision<br><b>Surgical Margins:</b> 2 cm<br><b>Surgical Outcome:</b> RO<br><b>Date:</b> 2018-01-21
Simulated Patient 3 radiotherapy Initiating<br>Adjuvant Radiotherapy 2018-02-17 <b>Setting of Radiotherapy</b> Adjuvant Radiotherapy<br><b>Target of Radiotherapy:</b> Left Antecubital Fossa Skin Primary, Left Axillary Lymph Node Metastasis<br><b>Start Date:</b> 2018-02-17
Simulated Patient 3 radiotherapy Completed<br>Adjuvant Radiotherapy 2018-03-26 <b>Setting of Radiotherapy</b> Adjuvant Radiotherapy<br><b>Target of Radiotherapy:</b> Left Antecubital Fossa Skin Primary, Left Axillary Lymph Node Metastasis<br><b>Number of Fractions:</b> 23<br><b>Total Dose Delivered:</b> 4600 cGy<br><b>End Date:</b> 2018-03-26
Simulated Patient 3 ss NED 2018-04-13 <b>Subject Status:</b> No Evidence of Disease<br><b>Clinical Trend:</b> Stable<br><b>Trend Details:</b> Not Reported<br><b>Performance Status:</b> Not Reported<br><b>Date:</b> 2018-04-13
Simulated Patient 3 genomics Genomics 2018-04-30 <b>Lesion Assessed:</b> Liquid Biopsy<br><b>Platform:</b> Guardant 360<br><b>Genetic Alterations:</b> HRAS, STK11, TP53, TP53, TP53<br><b>Date:</b> 2018-04-30
Simulated Patient 1 lesion Metastasis 2018-06-04 <b>Lesion:</b> Left Groin Lymph Node Metastasis<br><b>Date of First Detection:</b> 2018-06-04<br><b>Histologically Confirmed:</b> Yes
Simulated Patient 1 Systemic Therapy Avelumab 2018-06-27 <b>Therapeutic:</b> Avelumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 1<br><b>Date Administered:</b> 2018-06-27
Simulated Patient 1 Systemic Therapy Avelumab 2018-07-11 <b>Therapeutic:</b> Avelumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 2<br><b>Date Administered:</b> 2018-07-11
Simulated Patient 1 Systemic Therapy Avelumab 2018-07-25 <b>Therapeutic:</b> Avelumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 3<br><b>Date Administered:</b> 2018-07-25
Simulated Patient 1 Systemic Therapy Avelumab 2018-08-08 <b>Therapeutic:</b> Avelumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 4<br><b>Date Administered:</b> 2018-08-08
Simulated Patient 1 Systemic Therapy Avelumab 2018-08-22 <b>Therapeutic:</b> Avelumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 5<br><b>Date Administered:</b> 2018-08-22
Simulated Patient 1 Systemic Therapy Avelumab 2018-09-05 <b>Therapeutic:</b> Avelumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 6<br><b>Date Administered:</b> 2018-09-05
Simulated Patient 1 ss AWD 2018-09-06 <b>Subject Status:</b> Alive With Disease<br><b>Clinical Trend:</b> Improving<br><b>Trend Details:</b> Not Reported<br><b>Performance Status:</b> ECOG Grade 1: Symptomatic but completely ambulatory<br><b>Date:</b> 2018-09-06
Simulated Patient 1 Systemic Therapy Avelumab 2018-09-19 <b>Therapeutic:</b> Avelumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 7<br><b>Date Administered:</b> 2018-09-19
Simulated Patient 1 Systemic Therapy Avelumab 2018-10-03 <b>Therapeutic:</b> Avelumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 8<br><b>Date Administered:</b> 2018-10-03
Simulated Patient 4 dx Initial Histological Diagnosis 2018-10-05 <b>Initial Histological Diagnosis</b><br><b>Date:</b> 2018-10-05
Simulated Patient 4 ss AWD 2018-10-05 <b>Subject Status:</b> Alive With Disease<br><b>Clinical Trend:</b> Indeterminate<br><b>Trend Details:</b> Not Reported<br><b>Performance Status:</b> Not Reported<br><b>Date:</b> 2018-10-05
Simulated Patient 4 lesion PCT 2018-10-05 <b>Lesion:</b> Right Eyebrow Primary Lesion<br><b>Date of First Detection:</b> 2018-10-05<br><b>Histologically Confirmed:</b> Yes
Simulated Patient 1 Systemic Therapy Avelumab 2018-10-17 <b>Therapeutic:</b> Avelumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 9<br><b>Date Administered:</b> 2018-10-17
Simulated Patient 4 cStage Initial Clinical Stage 2018-10-21 <b>Initial Clinical Stage:</b> IIB<br><b>Date:</b> 2018-10-21
Simulated Patient 1 Systemic Therapy Avelumab 2018-10-31 <b>Therapeutic:</b> Avelumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 10<br><b>Date Administered:</b> 2018-10-31
Simulated Patient 4 ss NED 2018-11-11 <b>Subject Status:</b> No Evidence of Disease<br><b>Clinical Trend:</b> Improving<br><b>Trend Details:</b> Not Reported<br><b>Performance Status:</b> ECOG Grade 0: Asymptomatic<br><b>Date:</b> 2018-11-11
Simulated Patient 4 pStage Initial Pathological Stage 2018-11-11 <b>Initial Pathological Stage:</b> IIIA<br><b>Date:</b> 2018-11-11
Simulated Patient 4 lesion Metastasis 2018-11-11 <b>Lesion:</b> Right Parotid Lymph Node Metastases<br><b>Date of First Detection:</b> 2018-11-11<br><b>Histologically Confirmed:</b> Yes
Simulated Patient 4 surgery Surgery 2018-11-11 <b>Lesion Surgerized:</b> Right Eyebrow Primary Lesion<br><b>Type of Sugery:</b> Excision<br><b>Surgical Margins:</b> 1 cm<br><b>Surgical Outcome:</b> RO<br><b>Date:</b> 2018-11-11
Simulated Patient 4 genomics Genomics 2018-11-11 <b>Lesion Assessed:</b> Right Eyebrow Primary Lesion<br><b>Platform:</b> MGH SNaPshot<br><b>Genetic Alterations:</b> ALK, AR, ARID1A, BRAF, BRCA2, CHEK2, CTNNB1, DDR2, FANCA, GNA11, MPL, MTOR, NOTCH1, NOTCH1, NTRK1, NTRK2, PALB2, PIK3CA, PTEN, PTEN, RB1, TP53<br><b>Date:</b> 2018-11-11
Simulated Patient 1 Systemic Therapy Avelumab 2018-11-14 <b>Therapeutic:</b> Avelumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 11<br><b>Date Administered:</b> 2018-11-14
Simulated Patient 1 Systemic Therapy Avelumab 2018-11-28 <b>Therapeutic:</b> Avelumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 12<br><b>Date Administered:</b> 2018-11-28
Simulated Patient 4 radiotherapy Initiating<br>Adjuvant Radiotherapy 2018-12-09 <b>Setting of Radiotherapy</b> Adjuvant Radiotherapy<br><b>Target of Radiotherapy:</b> Right Eyebrow Primary Lesion, Right Parotid Lymph Node Metastases<br><b>Start Date:</b> 2018-12-09
Simulated Patient 5 dx Initial Histological Diagnosis 2018-12-12 <b>Initial Histological Diagnosis</b><br><b>Date:</b> 2018-12-12
Simulated Patient 5 ss AWD 2018-12-12 <b>Subject Status:</b> Alive With Disease<br><b>Clinical Trend:</b> Indeterminate<br><b>Trend Details:</b> Not Reported<br><b>Performance Status:</b> Not Reported<br><b>Date:</b> 2018-12-12
Simulated Patient 5 lesion PCT 2018-12-12 <b>Lesion:</b> Left Leg Primary Lesion<br><b>Date of First Detection:</b> 2018-12-12<br><b>Histologically Confirmed:</b> Yes
Simulated Patient 1 Systemic Therapy Avelumab 2018-12-12 <b>Therapeutic:</b> Avelumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 13<br><b>Date Administered:</b> 2018-12-12
Simulated Patient 5 cStage Initial Clinical Stage 2018-12-15 <b>Initial Clinical Stage:</b> III<br><b>Date:</b> 2018-12-15
Simulated Patient 5 lesion Metastasis 2018-12-15 <b>Lesion:</b> Left Inguinal Lymph Node Metastasis<br><b>Date of First Detection:</b> 2018-12-15<br><b>Histologically Confirmed:</b> Yes
Simulated Patient 5 pStage Initial Pathological Stage 2018-12-21 <b>Initial Pathological Stage:</b> IIIB<br><b>Date:</b> 2018-12-21
Simulated Patient 5 surgery Surgery 2018-12-21 <b>Lesion Surgerized:</b> Left Leg Primary Lesion<br><b>Type of Sugery:</b> Excision<br><b>Surgical Margins:</b> 1 cm<br><b>Surgical Outcome:</b> RO<br><b>Date:</b> 2018-12-21
Simulated Patient 4 ss NED 2018-12-23 <b>Subject Status:</b> No Evidence of Disease<br><b>Clinical Trend:</b> Stable<br><b>Trend Details:</b> Not Reported<br><b>Performance Status:</b> Not Reported<br><b>Date:</b> 2018-12-23
Simulated Patient 1 Systemic Therapy Avelumab 2018-12-26 <b>Therapeutic:</b> Avelumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 14<br><b>Date Administered:</b> 2018-12-26
Simulated Patient 1 Systemic Therapy Avelumab 2019-01-09 <b>Therapeutic:</b> Avelumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 15<br><b>Date Administered:</b> 2019-01-09
Simulated Patient 4 radiotherapy Completed<br>Adjuvant Radiotherapy 2019-01-12 <b>Setting of Radiotherapy</b> Adjuvant Radiotherapy<br><b>Target of Radiotherapy:</b> Right Eyebrow Primary Lesion, Right Parotid Lymph Node Metastases<br><b>Number of Fractions:</b> 25<br><b>Total Dose Delivered:</b> 5000 cGy<br><b>End Date:</b> 2019-01-12
Simulated Patient 1 Systemic Therapy Avelumab 2019-01-23 <b>Therapeutic:</b> Avelumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 16<br><b>Date Administered:</b> 2019-01-23
Simulated Patient 1 Systemic Therapy Avelumab 2019-02-06 <b>Therapeutic:</b> Avelumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 17<br><b>Date Administered:</b> 2019-02-06
Simulated Patient 1 Systemic Therapy Avelumab 2019-02-20 <b>Therapeutic:</b> Avelumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 18<br><b>Date Administered:</b> 2019-02-20
Simulated Patient 5 ss AWD 2019-02-22 <b>Subject Status:</b> Alive With Disease<br><b>Clinical Trend:</b> Worsening<br><b>Trend Details:</b> Not Reported<br><b>Performance Status:</b> Not Reported<br><b>Date:</b> 2019-02-22
Simulated Patient 5 lesion Metastasis 2019-02-22 <b>Lesion:</b> Left External Iliac Lymph Node Metastasis<br><b>Date of First Detection:</b> 2019-02-22<br><b>Histologically Confirmed:</b> No
Simulated Patient 5 Systemic Therapy Pembrolizumab 2019-03-02 <b>Therapeutic:</b> Pembrolizumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 1<br><b>Date Administered:</b> 2019-03-02
Simulated Patient 1 Systemic Therapy Avelumab 2019-03-06 <b>Therapeutic:</b> Avelumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 19<br><b>Date Administered:</b> 2019-03-06
Simulated Patient 1 ss NED 2019-03-12 <b>Subject Status:</b> No Evidence of Disease<br><b>Clinical Trend:</b> Stable<br><b>Trend Details:</b> Not Reported<br><b>Performance Status:</b> ECOG Grade 1: Symptomatic but completely ambulatory<br><b>Date:</b> 2019-03-12
Simulated Patient 5 Systemic Therapy Pembrolizumab 2019-03-19 <b>Therapeutic:</b> Pembrolizumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 2<br><b>Date Administered:</b> 2019-03-19
Simulated Patient 1 Systemic Therapy Avelumab 2019-03-20 <b>Therapeutic:</b> Avelumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 20<br><b>Date Administered:</b> 2019-03-20
Simulated Patient 5 lesion Metastasis 2019-03-24 <b>Lesion:</b> Left Axillary Lymph Node Metastasis<br><b>Date of First Detection:</b> 2019-03-24<br><b>Histologically Confirmed:</b> Yes
Simulated Patient 1 Systemic Therapy Avelumab 2019-04-03 <b>Therapeutic:</b> Avelumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 21<br><b>Date Administered:</b> 2019-04-03
Simulated Patient 5 Systemic Therapy Pembrolizumab 2019-04-08 <b>Therapeutic:</b> Pembrolizumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 3<br><b>Date Administered:</b> 2019-04-08
Simulated Patient 1 Systemic Therapy Avelumab 2019-04-17 <b>Therapeutic:</b> Avelumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 22<br><b>Date Administered:</b> 2019-04-17
Simulated Patient 5 Systemic Therapy Pembrolizumab 2019-04-24 <b>Therapeutic:</b> Pembrolizumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 4<br><b>Date Administered:</b> 2019-04-24
Simulated Patient 1 Systemic Therapy Avelumab 2019-05-01 <b>Therapeutic:</b> Avelumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 23<br><b>Date Administered:</b> 2019-05-01
Simulated Patient 1 Systemic Therapy Avelumab 2019-05-15 <b>Therapeutic:</b> Avelumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 24<br><b>Date Administered:</b> 2019-05-15
Simulated Patient 5 Systemic Therapy Pembrolizumab 2019-05-18 <b>Therapeutic:</b> Pembrolizumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 5<br><b>Date Administered:</b> 2019-05-18
Simulated Patient 1 Systemic Therapy Avelumab 2019-05-29 <b>Therapeutic:</b> Avelumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 25<br><b>Date Administered:</b> 2019-05-29
Simulated Patient 5 lesion Metastasis 2019-06-04 <b>Lesion:</b> Right Axillary Lymph Node Metastasis<br><b>Date of First Detection:</b> 2019-06-04<br><b>Histologically Confirmed:</b> No
Simulated Patient 5 Systemic Therapy Pembrolizumab 2019-06-06 <b>Therapeutic:</b> Pembrolizumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 6<br><b>Date Administered:</b> 2019-06-06
Simulated Patient 1 Systemic Therapy Avelumab 2019-06-12 <b>Therapeutic:</b> Avelumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 26<br><b>Date Administered:</b> 2019-06-12
Simulated Patient 5 Systemic Therapy Pembrolizumab 2019-06-23 <b>Therapeutic:</b> Pembrolizumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 7<br><b>Date Administered:</b> 2019-06-23
Simulated Patient 5 Systemic Therapy Pembrolizumab 2019-07-14 <b>Therapeutic:</b> Pembrolizumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 8<br><b>Date Administered:</b> 2019-07-14
Simulated Patient 5 Systemic Therapy Pembrolizumab 2019-07-30 <b>Therapeutic:</b> Pembrolizumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 9<br><b>Date Administered:</b> 2019-07-30
Simulated Patient 5 Systemic Therapy Pembrolizumab 2019-08-17 <b>Therapeutic:</b> Pembrolizumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 10<br><b>Date Administered:</b> 2019-08-17
Simulated Patient 5 ss AWD 2019-08-19 <b>Subject Status:</b> Alive With Disease<br><b>Clinical Trend:</b> Stable<br><b>Trend Details:</b> Not Reported<br><b>Performance Status:</b> Not Reported<br><b>Date:</b> 2019-08-19
Simulated Patient 5 Systemic Therapy Pembrolizumab 2019-09-07 <b>Therapeutic:</b> Pembrolizumab<br><b>Treatment Setting:</b> Primary Therapy<br><b>Dose Number:</b> 11<br><b>Date Administered:</b> 2019-09-07
Simulated Patient 5 ss Deceased 2019-09-25 <b>Subject Status:</b> Deceased<br><b>Clinical Trend:</b> Worsening<br><b>Trend Details:</b> Not Reported<br><b>Performance Status:</b> Not Reported<br><b>Date:</b> 2019-09-25
Table 4: Output of the Function combine_storyboard_dfs(). Depicted is the output of combine_storyboard_dfs() after it is applied to the sample data set embedded in the StoryboardR package. The function combine_storyboard_dfs() takes one argument, “data”, a data frame that contains the raw data from the desired REDCap® project. Contained within the mid-level function combine_storyboard_dfs() are all of the base-level functions (e.g., lesion()). Calling combine_storyboard_dfs() aggregates the output of all of the base-level functions via rbind() to produce one principal data frame with five vectors: record_id, description, value, date and hover. The output of combine_storyboard_dfs() can then be filtered by a subject Record ID to produce a patient-specific interactive data visualization by calling the function storyboard_plot().

Finally, the top-level function storyboard_plot() generates an interactive plotly storyboard (see next section). Importantly, the multi-level functional processing design permits a high degree of customization and therefore maximizes extensibility. Developers can either modify base-level functions or create new ones to incorporate additional data from their clinical registry project.
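
The multi-level design can be illustrated with a minimal sketch in base R. The two base-level functions below (lesion_events() and status_events()), the wrapper combine_sketch() and the raw column names (lesion_name, lesion_date, status, status_date) are hypothetical stand-ins and not the package's actual functions or REDCap® field names. Only the five-column output shape and the use of rbind() follow the description above; sorting by date is an assumption inferred from the ordering seen in Table 4.

```r
# Hypothetical raw export: one row per subject (column names are illustrative only).
raw <- data.frame(
  record_id   = c("Simulated Patient A", "Simulated Patient B"),
  lesion_name = c("Left Leg Primary Lesion", "Right Arm Primary Lesion"),
  lesion_date = as.Date(c("2018-12-12", "2019-01-05")),
  status      = c("AWD", "NED"),
  status_date = as.Date(c("2018-12-12", "2019-02-01")),
  stringsAsFactors = FALSE
)

# Base-level sketch 1: one storyboard row per lesion.
lesion_events <- function(data) {
  data.frame(
    record_id   = data$record_id,
    description = "lesion",
    value       = "PCT",
    date        = data$lesion_date,
    hover       = paste0("<b>Lesion:</b> ", data$lesion_name,
                         "<br><b>Date of First Detection:</b> ", data$lesion_date),
    stringsAsFactors = FALSE
  )
}

# Base-level sketch 2: one storyboard row per subject status entry.
status_events <- function(data) {
  data.frame(
    record_id   = data$record_id,
    description = "ss",
    value       = data$status,
    date        = data$status_date,
    hover       = paste0("<b>Subject Status:</b> ", data$status,
                         "<br><b>Date:</b> ", data$status_date),
    stringsAsFactors = FALSE
  )
}

# Mid-level sketch: stack the base-level outputs into one principal data frame.
combine_sketch <- function(data) {
  combined <- rbind(lesion_events(data), status_events(data))
  combined <- combined[order(combined$date), ]  # assumption: chronological order
  rownames(combined) <- NULL
  combined
}

storyboard_df <- combine_sketch(raw)

# Filtering by Record ID yields the patient-specific input for a plotting function.
patient_a <- storyboard_df[storyboard_df$record_id == "Simulated Patient A", ]
```

Because every base-level function returns the same five columns, adding a new data source in this sketch only requires writing one more function and adding it to the rbind() call.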

Outputs

Patient Overview

The StoryboardR Shiny application returns two outputs: a subject Dashboard and Storyboard. As stated above, the subject Dashboard centralizes high-yield data from the tumor registry in tabular form (Figure 4). This provides an important overview of patient-level information and is fully customizable by the end user.

Figure 4. Subject Dashboard for Simulated Patient 1. Depicted is the Subject Dashboard for Simulated Patient 1. This dashboard provides an overview of salient patient-level data from the EDC in tabular form.

Patient Storyboard

To visualize the temporal relationship between patient-level data elements, StoryboardR generates an interactive timeline. This creates a method of EDA to allow for a visual interpretation of the relationship between certain potential prognostic and/or predictive biomarkers (e.g., tumor genetics) and outcomes (e.g., overall survival, response to therapy) (Figure 5).
Figure 5. Interactive Subject Storyboard. The centerpiece of StoryboardR is an interactive patient storyboard. Depicted is a timeline of patient-level data captured in a REDCap®-based cancer registry. Using plotly functionality, salient data elements are stored as hover text. Abbreviations: AWD: alive with disease, NED: no evidence of disease, PCT: primary cutaneous tumor.

StoryboardR was designed to incorporate patient journeys of varying length and complexity and produce a uniform-appearing data visualization with a consistent structure. This is accomplished via the main graphing function, storyboard_plot(), which takes one argument, “data”. This argument is a data frame downstream of the function combine_storyboard_dfs(), which, as described above, compiles desired data elements from a tumor registry’s DCIs into a uniform five-vector data frame. When a specific Record ID is selected in the StoryboardR UI of the web browser, the data frame downstream of combine_storyboard_dfs() is filtered by the dplyr function filter() for the Record ID of interest. This record_id-filtered data frame is then used as the “data” argument for storyboard_plot(). The first operation of storyboard_plot() is to build the scaffold of the timeline for the interactive plot (Figure 6).
Figure 6 - Source Code to Create a Uniform Timeline


storyboard_plot <- function(data){
#################################################################################################################
# Load Data
#################################################################################################################
timeline <- data

#################################################################################################################
# Create the Line Segments that will populate the Storyboard
#################################################################################################################
# Each row in the data frame created by `combine_storyboard_dfs` is an event that will populate the Storyboard. 
## Thus, we need to generate a line segment that will attach that event to the center line of the Storyboard. 
### We need a method that allows for a different number of line segments that is completely dependent on the 
#### number of observations that are found in each individual patient story.
##### Therefore, create an R object that is the length of the data frame once it has been filtered by record_id
###### (since this will vary from patient to patient). This filtered data frame has been titled "timeline"
timeline.x <- seq_len(nrow(timeline))

# Create an empty integer object that we will fill with a for loop
out <- vector(mode = "integer")
# Use a for loop to populate the above integer object that will serve as the number of repeating line segments
for(i in 1:4) {  # thus there will be four line segments
  out[i] <- i*11  ## that are spaced out by a factor of 11 
                  ### (the actual length of the line segments will be determined later)
}
# Now that there is an object "out", repeat each of its elements twice, as these will serve as the segments above 
## and below the central line of the storyboard
### Create an R object "timeline.rep" that repeats each element of the vector "out" twice and recycles the result 
#### out to the length of the subject's timeline "timeline.x"
timeline.rep <- base::rep(x = out,
                          each = 2,
                          length.out = length(timeline.x))

# Because the objective is to have lines of the same length on either side of the central storyboard line to 
## establish an aesthetically appealing graph, create a numeric R object "timeline.neg_rep" of -1 and 1 that 
### is the length of the subject's timeline "timeline.x" 
timeline.neg_rep <- rep_len(c(-1,1),
                            length.out = length(timeline.x))

# Create a new R object "vec" that is a df combining "timeline.x", "timeline.rep", "timeline.neg_rep"
vec <- data.frame(timeline.c = timeline.x,
                  timeline.d = timeline.rep,
                  timeline.e = timeline.neg_rep)

# Add a vector to "vec" that is the product of timeline.rep and timeline.neg_rep which will be used as a base for 
## the length of the segments above and below the central storyboard line
vec$timeline.f <- timeline.rep*timeline.neg_rep

# Add another vector to "vec" that now takes the vector just made "timeline.f" and adds 50 to it, as 50 
## is the center of the 0-100 y-axis of the plot. This should only have positive numbers at this point (here 6-94).
vec$timeline.line.coord <- vec$timeline.f+50

# Add the above vector "timeline.line.coord", which has the coordinates of where the lines will go, to the 
## "timeline" dataframe
timeline$y <- vec$timeline.line.coord

#################################################################################################################
# Create the Coordinates for the text labels for the Storyboard
#################################################################################################################
# Make the text of each storyboard offset a small amount from the line segment
## Create a new vector that is a multiple of "timeline.e" by some amount (here 3.6)
vec$timeline.text.coord.pre <- 3.6*vec$timeline.e

# Add vec$timeline.text.coord.pre  with vec$timeline.line.coord to get the position on the storyboard where 
## the text will go
vec$timeline.text.coord <- vec$timeline.text.coord.pre + vec$timeline.line.coord

# Add this vector "timeline.text.coord" to the "timeline" data frame and call it "label"
timeline$label <- vec$timeline.text.coord

#################################################################################################################
# Develop the "x-axis" of the timeline
#################################################################################################################
# Define the start of the timeline
start_date.a <- timeline %>% slice_head()

# Build in a 20 day buffer before the start date
start_date <- start_date.a$date - 20

# Define the end of the timeline
end_date.a <- timeline %>% slice_tail()

# Build in a 20 day buffer after the end date
end_date <- end_date.a$date + 20

# Create an R object that is a Date vector that spans the start_date to the end_date by "days"
dateVec <- seq(from = start_date,
               to = end_date,
               by = "days")

# Create an R object that is a Date vector that spans the start_date to the end_date by "4 months" which will 
## serve as the labels for the graph
x_axis_label <- seq(from = start_date,
                    to = end_date,
                    by = "4 months")
Figure 6 - Source Code to Create a Uniform Timeline. Shown is the portion of the annotated source code of the graphing function storyboard_plot() that generates the scaffold of the interactive timeline. The executable code (black font) creates a uniform coordinate system in which a central horizontal line forms the core scaffold of the interactive timeline. Individual data elements from the patient journey are mapped onto the horizontal line with vertical lines that alternate above and below the central line to minimize overlapping data points. Specific design considerations of each line of code are embedded in the script as non-executable notes (demarcated by the “#” symbol).
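
As a worked illustration of the scaffold arithmetic in Figure 6, the following sketch reproduces the coordinate calculation for a timeline with a given number of events. The helper name scaffold_coords() is hypothetical and is not part of StoryboardR; the constants (segment heights in multiples of 11, a midline at 50 and a text offset of 3.6) are taken from the source code shown above.

```r
# Sketch of the scaffold arithmetic for a timeline with n events.
# scaffold_coords() is a hypothetical helper, not a StoryboardR function.
scaffold_coords <- function(n) {
  heights <- (1:4) * 11                                # 11, 22, 33, 44
  offset  <- rep(heights, each = 2, length.out = n)    # 11, 11, 22, 22, ... recycled to n
  side    <- rep_len(c(-1, 1), length.out = n)         # alternate below / above the midline
  y       <- 50 + offset * side                        # end point of each vertical segment
  label   <- y + 3.6 * side                            # text sits just beyond the segment end
  data.frame(event = seq_len(n), y = y, label = label)
}

coords <- scaffold_coords(10)
coords$y
# 39 61 28 72 17 83 6 94 39 61
```

Consecutive events alternate sides of the midline, and each below/above pair steps outward to the next of four heights before the pattern repeats every eight events, which is what keeps neighboring labels from colliding.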

Subsequently, storyboard_plot() creates a gg object “plot.storyboard” by calling the ggplot2 function ggplot(). Important design aspects of “plot.storyboard” include incorporating hover text to display information, which minimizes data visualization clutter and permits a high degree of customizability. In addition, grouped data elements (e.g., lab results or systemic anti-neoplastic therapies) are color-coded to aid in identification. Lastly, storyboard_plot() creates a plotly object by calling the plotly function ggplotly() (Figure 7).
Figure 7 - Source Code for Interactive Storyboard
#################################################################################################################
# Create the Text for the Subtitle of the Graph
#################################################################################################################
timeline$subject_id.label <- paste("<b>Subject:</b>",
                                     timeline$record_id)

#################################################################################################################
# Create the Storyboard Plot
#################################################################################################################
# Create a blank data frame that will serve as the substrate for the graph
plot.storyboard.blank <- data.frame()
# Build the Storyboard plot using ggplot2
plot.storyboard <- ggplot(plot.storyboard.blank) +
  ##### Define the x and y boundaries as 0 - 100
  xlim(0, 100) +
  ylim(0, 100) +
  theme_bw() +
  theme(panel.border = element_blank(),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        axis.line.y = element_blank(),
        axis.line.x = element_blank())+
  theme(axis.text.x = element_text(face = "bold", size = 12),
        axis.text.y = element_blank(),
        axis.ticks.x = element_line(),
        axis.ticks.y = element_blank(),
        axis.title = element_blank()) +
  ##### Set a horizontal line at 50, midway through the plot which serves as the timeline
  geom_hline(yintercept = 50) +
  ##### Graph the vertical segments that span the midline to the text label
  geom_segment(data = timeline,
               aes(x = date,
                   xend = date,
                   y = y,
                   yend = 50
               )) +
  ##### Graph the text labels
  geom_text(data = timeline,
            aes(x = date,
                y = label,
                label = value,
                text = hover,
                color = description),
            size = 4.0,
            position = position_nudge()) +
  ##### Format the X Axis
  scale_x_date(breaks = x_axis_label, # the vector created above to set the x-axis breaks; otherwise R would use a default setting
               labels = date_format("%m/%Y"), # controls the format in which the above vector is displayed; "%Y" is a 4 digit year
               limits = c(min(dateVec), max(dateVec)))  # uses the separate vector "dateVec" to set the limits on the x-axis
#################################################################################################################
# Convert the ggplot object to a plotly graph
#################################################################################################################
plot.storyboard.plotly <- plotly::ggplotly(p = plot.storyboard,
         tooltip = "text") %>%
  layout(title = list(text = paste0("<b>Patient Storyboard</b>",
                                    "<br>",
                                    "<sup>",
                                    timeline$subject_id.label,
                                    "</sup>")),
         titlefont = list(size = 28,
                          color = "black",
                          family = "Arial")) %>%
  layout(subtitle = "testing") %>%
  layout(showlegend = FALSE) %>%
  layout(hoverlabel = list(font=list(size=18))) %>%
  layout(xaxis = list(tickfont = list(size = 16))) %>%
  layout(margin = list(
    l = 10,
    r = 10,
    b = 10,
    t = 50))
#################################################################################################################
# Return the plotly graph of the Storyboard
#################################################################################################################
return(plot.storyboard.plotly)
Figure 7 - Source Code for Interactive Storyboard. Depicted is the portion of the annotated source code of the graphing function storyboard_plot() that generates the ggplot and ggplotly interactive storyboard. The executable code (black font) creates a ggplot object that incorporates the central horizontal timeline created by the code shown in Figure 6. Hover text functionality is produced by the ggplotly function to maximize the amount of information contained in the data visualization, while minimizing data clutter. Specific design considerations of each line of code are embedded in the script as non-executable notes (demarcated by the “#” symbol).

StoryboardR Customizations

Date Shifting

Given that dates of events such as diagnoses and treatments are potential patient identifiers, StoryboardR contains a date shifting function, date.shift.df(). This function takes one argument, “data”, a data frame downstream of the function combine_storyboard_dfs(). date.shift.df() uses the dateShift() function from the TimeWarp package to shift all of the dates in the date vector by a single random number of weeks, between 1 and 52, either forward or backward. date.shift.df() is found on the server side of the Shiny application. Users can activate this function by setting the DateShift argument of the launch_StoryboardR() function to TRUE.
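
The idea behind a unified date shift can be sketched in a few lines. The example below deliberately uses base R Date arithmetic rather than TimeWarp's dateShift(), so it should be read as an illustration of the concept and not as the implementation of date.shift.df(). The function name shift_dates_sketch() and its seed argument are hypothetical.

```r
# Sketch of a unified date shift: every date in the data frame moves by the
# same random number of weeks (1 to 52), either forward or backward.
# Base R Date arithmetic is swapped in here for TimeWarp::dateShift().
shift_dates_sketch <- function(data, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)          # optional, for reproducible demos
  weeks     <- sample(1:52, size = 1)         # one draw shared by all rows
  direction <- sample(c(-1, 1), size = 1)     # -1 = backward, 1 = forward
  data$date <- data$date + direction * weeks * 7
  data
}

events <- data.frame(
  record_id = "Simulated Patient A",
  value     = c("Initial Histological Diagnosis", "Surgery", "NED"),
  date      = as.Date(c("2018-12-12", "2018-12-21", "2019-03-12")),
  stringsAsFactors = FALSE
)

shifted <- shift_dates_sketch(events, seed = 1)

# The intervals between events are unchanged, so the shape of the timeline is preserved.
diff(events$date)
diff(shifted$date)
```

Drawing the offset once per data frame, rather than once per row, is what keeps the relative timing of events intact while obscuring the true calendar dates.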

Project-Specific Alterations
The source code for StoryboardR is available on GitHub to provide users the ability to customize the UI and server-side functions to accommodate heterogeneity in research needs. For example, StoryboardR has been incorporated in the clinical informatics pipeline of the Project Data Sphere-supported MCC patient registry1,2. Therefore, modifications have been made to the source code to incorporate specific parameters salient to the clinical practice of caring for patients with MCC. As an example, the MCC StoryboardR application has an additional base-level function, amerk(), which wrangles data from the Lab Results form2 to produce a data frame of details about the AMERK Merkel Cell Polyoma Virus Antibody Test15, which can then be incorporated into a Patient Storyboard (Figure 8).
Figure 8. Interactive patient journey with customizations specific for the MCC patient registry. Depicted is a patient storyboard that has been customized to the Merkel Cell Carcinoma patient registry. Given the multi-level functional processing aspect of StoryboardR, additional base-level functions can be incorporated to highlight data elements specific for a given research project. In this graph, a function called amerk() was created to extract elements from the Lab Results instrument that pertain to the laboratory test AMERK, which is used in clinical practice as a tool for active surveillance for disease recurrence. By adding amerk() to the mid-level function combine_storyboard_dfs(), these data are incorporated into the principal data frame that is then used as the “data” argument for the graphing function storyboard_plot().
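
To make the extension pattern concrete, the sketch below shows what a project-specific base-level function for a laboratory test might look like. It is not the source of amerk(): the function name lab_events_sketch(), the long-format input and its column names (lab_test, lab_value, lab_date) and the example values are all hypothetical. The only element taken from the text is the five-column output that the principal data frame expects.

```r
# Hypothetical long-format lab export; column names and values are illustrative only.
labs <- data.frame(
  record_id = rep("Simulated Patient A", 3),
  lab_test  = c("AMERK", "Hemoglobin", "AMERK"),
  lab_value = c("4500", "13.1", "900"),
  lab_date  = as.Date(c("2019-01-10", "2019-01-10", "2019-04-15")),
  stringsAsFactors = FALSE
)

# Base-level sketch: keep one lab test and reshape it into the five storyboard columns.
lab_events_sketch <- function(data, test_name) {
  keep <- data[data$lab_test == test_name, ]
  data.frame(
    record_id   = keep$record_id,
    description = rep(test_name, nrow(keep)),
    value       = sprintf("%s %s", test_name, keep$lab_value),
    date        = keep$lab_date,
    hover       = sprintf("<b>Lab Test:</b> %s<br><b>Result:</b> %s<br><b>Date:</b> %s",
                          test_name, keep$lab_value, as.character(keep$lab_date)),
    stringsAsFactors = FALSE
  )
}

amerk_rows <- lab_events_sketch(labs, "AMERK")
```

Rows produced this way have the same shape as the other base-level outputs, so they could be appended to the principal data frame with rbind() before plotting.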

StoryboardR as a Stand-Alone R Package

Furthermore, because StoryboardR can function as a free-standing R package (i.e., outside of a Shiny application), end users are able to make additional customizations inside an integrated development environment (IDE), such as RStudio®16. For example, if an investigator desired a more pared-down Patient Storyboard without any Systemic Therapy data, they could run the following code inside an IDE:

library(magrittr)  # provides the %>% pipe operator used below

StoryboardR::storyboard_dataset %>% 
  dplyr::filter(record_id == "Simulated Patient 1") %>% 
  StoryboardR::combine_storyboard_dfs() %>% 
  dplyr::filter(description != "Systemic Therapy") %>% 
  StoryboardR::storyboard_plot()
The above code will generate a Storyboard for data visualization within an IDE (Figure 9).
Figure 9. Patient Storyboard without Systemic Therapy. Shown above is a patient Storyboard that has undergone a user-specified customization. The data elements corresponding to Systemic Therapy have been removed by using a combination of StoryboardR and dplyr functions. By executing StoryboardR functions in an integrated development environment outside of a Shiny application, users can customize the output of storyboard_plot() to fit their data visualization needs.

As demonstrated above, the core StoryboardR functions combine_storyboard_dfs() and storyboard_plot() are compatible with tidyverse syntax and use of the magrittr pipe operator10.

Limitations and Solutions

StoryboardR is designed to be used in conjunction with a specific EDC (e.g., the PDS MCC registry) and thus depends on that series of forms being installed into a REDCap® project. As a solution, we have made the data dictionary freely available so that others may adopt a similar platform. Importantly, our platform can be modified by other developers to customize both their EDC forms and the corresponding R code for the StoryboardR package.

Conclusions

Tumor registries are an important form of RWD and the data they contain can be used for both hypothesis generation and hypothesis testing. Exploratory data analysis is instrumental in developing new insights and generating novel hypotheses. StoryboardR is an R package with a Shiny application front-end that produces an interactive data visualization of patient-level data. When built around a core data capture system such as a cancer registry, R-based packages like GENETEX17, eLAB18, BodyMapR19 and StoryboardR can combine to form a powerful data informatics ecosystem that both augments data abstraction and facilitates data analysis, with the goal of accelerating time-to-action for patients with rare tumors.

References

1.
Project data sphere. https://www.projectdatasphere.org/research/programs/rare-tumor-registries/merkel-cell-carcinoma.
2.
3.
4.
Wickham, H., Danenberg, P. & Eugster, M. Roxygen2: In-source documentation for r. R package Version 7.1.1, https://cran.r-project.org/package=roxygen2 (2013).
5.
Wickham, H., Francois, R., Henry, L. & Muller, K. Dplyr: A grammar of data manipulation. R Package Version 1.0.5, (2021).
6.
Wickham, H. Tidyr: Tidy messy data. R package Version 1.1.3, (2013).
7.
Wickham, H., Hester, J. & Francois, R. Readr: Read rectangular text data. R package Version 1.4.0, (2020).
8.
Wickham, H. Stringr: Simple, consistent wrappers for common string. R package Version 1.4.0, (2019).
9.
Plate, T., Horner, J. & Hansen, L. TimeWarp: Date Calculations and Manipulation. (2016).
10.
Bache, SM., Wickham, H. & Henry, L. Magrittr: A forward-pipe operator for r. R package Version 2.0.1, (2020).
11.
Sievert, C., Parmer, C., Plotly Technologies Inc & etal. Plotly: Create interactive web graphics via ’plotly.js’. R Package Version 4.9.4.1, (2021).
12.
Mahto, A. Splitstackshape: Stack and reshape datasets after splitting concatenated values. R package Version 1.4.8, (2019).
13.
Chang, W., Ribeiro, B. B., RStudio, Studio, A. & Incorporated, A. S. Shinydashboard: Create dashboards with ’shiny’. R Package Version 1.0.5, (2021).
14.
Chang, W., Cheng, J., Allaire, JJ., Xie, Y. & McPherson, J. Shiny: Web application framework for r. R package Version 1.6.0, (2018).
15.
16.
RStudio Team. RStudio: Integrated Development Environment for r. (RStudio, PBC., Boston, MA, 2020).
17.
18.
Shalhout, S. Z., Saqlain, F., Wright, K., Akinyemi, O. & Miller, D. M. Generalizable EHR-R-REDCap pipeline for a national multi-institutional rare tumor patient registry. JAMIA Open 5, (2022).
19.