Visualizing Tumor Response using Waterfall Charts with R

Key points

  • This monograph provides a reference for creating waterfall plots in R, which are useful for depicting each patient's tumor response to treatment, for example among patients enrolled in an oncology clinical trial.
    • We provide an overview of how to create waterfall plots in R for publications and presentations.
    • We outline how to present patients in a deliberate (non-random) descending order, from the worst response on the left to the best response on the right, so that the plot reads as a waterfall.
    • In oncology, a waterfall plot shows the response of each subject, with the x-axis set at the baseline measurement and one vertical bar per subject: a bar above the horizontal axis represents an increase in tumor burden, and a bar below it represents a decrease in tumor burden.
  • Skill Level: Intermediate
    • This post assumes that readers have some familiarity with basic R.

Let’s load the packages we will use.

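library(tidyverse) # attaches ggplot2, dplyr, tidyr, and the other core tidyverse packages
library(dplyr)     # already attached by tidyverse; kept here for explicitness
library(knitr)     # provides kable() for printing the data set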

Merkel Cell Carcinoma: Example Oncology Clinical Trial Data Set

  • Let’s first create an “example” data set for demonstration purposes: subjects with locally advanced or distantly metastatic Merkel Cell Carcinoma (MCC) enrolled in a clinical trial that tests the effect of treatment A, given at two different doses, on tumor response.
# We will first create a data set of patients with locally advanced/unresectable MCC (laMCC) or metastatic MCC (metMCC).

# Waterfall plots are displayed in descending order, from the worst tumor response (relative to baseline) on the left to the best response on the right side of the plot
## Therefore, we will create data for 56 subjects and sort them in decreasing order, since a positive value here represents an increase in tumor size from baseline and a negative value represents a decrease in size.

# We will randomly assign the two doses, 80 mg or 150 mg, to the 56 subjects (28 subjects per dose)
Merkel <- data.frame(
  id=c(1:56), 
  type = sample((rep(c("laMCC", "metMCC"), times =28))), 
  response = c(30, sort(runif(n=53,min=-10,max=19), decreasing=TRUE),-25,-31), 
  dose= sample(rep(c(80, 150), 28)))

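# Let's assign Best Overall Response (BOR): the first subject (+30%) has PD, the last subject (-31%) has PR, and the 54 subjects in between have SD
Merkel$BOR <- c("PD", rep("SD", times = 54), "PR")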
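
Because sample() and runif() draw random numbers, each run of the code above produces a different data set. A minimal sketch of one way to make the example reproducible is to call set.seed() first. The wrapper function make_merkel() and the seed value 2020 are assumptions made for illustration; they are not the values used to generate the table shown in this post.

```r
# Hypothetical helper: wraps the data-generation step so it can be re-run
make_merkel <- function(seed) {
  set.seed(seed)  # fix the random number stream; the seed value itself is arbitrary
  data.frame(
    id       = 1:56,
    type     = sample(rep(c("laMCC", "metMCC"), times = 28)),
    response = c(30, sort(runif(n = 53, min = -10, max = 19), decreasing = TRUE), -25, -31),
    dose     = sample(rep(c(80, 150), times = 28)),
    BOR      = c("PD", rep("SD", times = 54), "PR")
  )
}

first_run  <- make_merkel(seed = 2020)
second_run <- make_merkel(seed = 2020)

# The same seed gives the same data set on every run
identical(first_run, second_run)
```

With a different seed the disease types, doses, and the 53 simulated responses change, but the structure of the data set stays the same.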

Let’s view the data set

kable(Merkel)
id type response dose BOR
1 laMCC 30.0000000 150 PD
2 metMCC 17.6961745 150 SD
3 laMCC 17.2015960 80 SD
4 laMCC 16.7730635 150 SD
5 laMCC 15.5085249 80 SD
6 laMCC 15.3942027 80 SD
7 metMCC 14.0977012 150 SD
8 metMCC 13.4637892 150 SD
9 metMCC 12.4359153 150 SD
10 laMCC 12.1690936 80 SD
11 metMCC 11.4677425 80 SD
12 metMCC 10.6797382 80 SD
13 laMCC 10.5087281 150 SD
14 metMCC 10.1770751 150 SD
15 laMCC 10.0543297 80 SD
16 laMCC 9.2379874 150 SD
17 metMCC 9.0836325 80 SD
18 laMCC 8.2809651 150 SD
19 metMCC 7.4873860 80 SD
20 metMCC 7.2668907 80 SD
21 laMCC 6.9725678 80 SD
22 laMCC 5.9926356 150 SD
23 metMCC 4.6037429 150 SD
24 metMCC 4.2029589 150 SD
25 metMCC 2.9150319 150 SD
26 laMCC 2.8063051 150 SD
27 metMCC 2.7887468 150 SD
28 metMCC 1.4311800 150 SD
29 laMCC 0.7588718 80 SD
30 metMCC 0.5579626 80 SD
31 metMCC -0.9722140 80 SD
32 laMCC -1.0056756 80 SD
33 metMCC -1.0244492 150 SD
34 laMCC -1.5689199 150 SD
35 metMCC -1.6348179 150 SD
36 metMCC -2.0239850 80 SD
37 laMCC -2.3932789 80 SD
38 laMCC -2.6160127 80 SD
39 metMCC -2.6792397 80 SD
40 metMCC -2.9282342 150 SD
41 laMCC -3.7178805 80 SD
42 laMCC -3.7718234 80 SD
43 laMCC -4.2864507 80 SD
44 laMCC -4.5529480 150 SD
45 laMCC -4.9128444 150 SD
46 metMCC -5.0150386 150 SD
47 metMCC -5.0685028 80 SD
48 metMCC -6.0561962 80 SD
49 laMCC -6.5596004 80 SD
50 metMCC -6.9792274 80 SD
51 laMCC -7.3095199 80 SD
52 laMCC -8.2953187 150 SD
53 metMCC -8.6023783 150 SD
54 laMCC -9.0655416 150 SD
55 metMCC -25.0000000 150 SD
56 laMCC -31.0000000 80 PR
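
The example data set was built already sorted, which is why the plot below comes out as a waterfall. Data from a real trial will rarely arrive in that order. The following is a minimal sketch, using a small hypothetical data frame named trial, of how the rows could be sorted with order() before plotting.

```r
# Hypothetical, unsorted best percent changes for six subjects
trial <- data.frame(
  id       = c("A", "B", "C", "D", "E", "F"),
  response = c(-12, 25, -40, 3, -5, 10)
)

# order() returns the row indices that sort response from largest to smallest
trial_sorted <- trial[order(trial$response, decreasing = TRUE), ]

# Worst response (largest increase) on the left, best response on the right
barplot(trial_sorted$response,
        names.arg = trial_sorted$id,
        ylim = c(-50, 50),
        ylab = "Change from baseline (%)")
```

Sorting the whole data frame, rather than only the response column, keeps each subject's dose, disease type, and BOR aligned with the correct bar.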

Let’s create the waterfall plot.

MCC<- barplot(Merkel$response, 
              col="blue", 
              border="blue", 
              space=0.5, ylim=c(-50,50), 
              main = "Waterfall plot for Target Lesion Tumor Size", 
              ylab="Change from baseline (%)",
              cex.axis=1.5, cex.lab=1.5)

We can now add color by Dose and a legend

col <- ifelse(Merkel$dose == 80, 
              "steelblue", # if dose = 80 mg, then the color will be steel blue
              "cadetblue") # if dose != 80 mg (i.e. 150 mg here), then the color will be cadet blue

MCC<- barplot(Merkel$response, 
              col=col, 
              border=col, 
              space=0.5, 
              ylim=c(-50,50),
              main = "Waterfall plot for Target Lesion Tumor Size", 
              ylab="Change from baseline (%)",
              cex.axis=1.5, 
              legend.text= c( "80mg", "150mg"),
              args.legend=list(title="Treatment Dose", fill=c("steelblue", "cadetblue"), border=NA, cex=0.9))

Or by disease type…

col <- ifelse(Merkel$type == "laMCC", 
              "#BC5A42", # if the disease type is locally advanced MCC (laMCC), the color will be #BC5A42 (deep red)
              "#009296") # otherwise (i.e. metastatic MCC here), the color will be #009296 (greenish-blue)

MCC<- barplot(Merkel$response, 
              col=col, 
              border=col, 
              space=0.5, 
              ylim=c(-50,50),
              main = "Waterfall plot for Target Lesion Tumor Size", 
              ylab="Change from baseline (%)",
              cex.axis=1.5, 
              legend.text= c( "Locally advanced MCC", "Metastatic MCC"),
              args.legend=list(title="Disease", fill=c("#BC5A42", "#009296"), border=NA, cex=0.9))

Or by Best overall response and provide a legend

col <- ifelse(Merkel$BOR == "CR", 
              "green", # if a subject had a CR, the bar will be green; otherwise...
              ifelse(Merkel$BOR == "PR", 
                     "steelblue", # if a subject had a PR, the bar will be steel blue; otherwise...
                     ifelse(Merkel$BOR == "PD", 
                            "red", # if a subject had PD, the bar will be red; otherwise...
                            ifelse(Merkel$BOR == "SD", 
                                   "cadetblue", # if a subject had SD, the bar will be cadet blue; otherwise...
                                   "") # the color is left blank (this should not occur, because every subject has a CR, PR, PD, or SD)
                            )))

              
MCC<- barplot(Merkel$response, 
              col=col, 
              border=col, 
              space=0.5, 
              ylim=c(-50,50),
              main = "Waterfall plot for Target Lesion Tumor Size", ylab="Change from baseline (%)",
              cex.axis=1.5, 
              legend.text= c( "CR: Complete Response", "PR: Partial Response", "SD: Stable Disease", "PD: Progressive Disease"),
              args.legend=list(title="Best Overall Response", fill=c("green","steelblue",  "cadetblue", "red"), border=NA, cex=0.9))
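
The nested ifelse() calls above work, but they get harder to read as categories are added. One alternative, sketched here with hypothetical stand-in vectors (response, bor) rather than the Merkel data, is to keep the colors in a named vector and look them up by BOR code. The name bor_colors is an assumption made for illustration.

```r
# Named vector: the names are BOR categories, the values are bar colors
bor_colors <- c(CR = "green", PR = "steelblue", SD = "cadetblue", PD = "red")

# Hypothetical responses and BOR codes standing in for the Merkel columns
response <- c(30, 10, -5, -31, -100)
bor      <- c("PD", "SD", "SD", "PR", "CR")

# Indexing by name returns one color per subject; unname() drops the names
col <- unname(bor_colors[bor])

barplot(response, col = col, border = col, space = 0.5, ylim = c(-100, 50),
        ylab = "Change from baseline (%)")

# Reusing bor_colors for the legend keeps labels and fills in the same order
legend("topright", title = "Best Overall Response",
       legend = names(bor_colors), fill = unname(bor_colors),
       border = NA, cex = 0.9)
```

A BOR code that is missing from bor_colors comes back as NA, so a quick anyNA(col) check will flag unexpected categories.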

Let’s add horizontal lines at 20% and -30%

  • The majority of clinical trials use the Response Evaluation Criteria in Solid Tumors (RECIST) to assess tumor responses to a therapeutic intervention.
  • Per RECIST, tumor responses are adjudicated based on the following observations:
    • Complete Response (CR): disappearance of all target lesions (sum of all target lesions = 0)
      • Of note, any pathological lymph nodes (whether target or non-target) must have a reduction in short axis to < 10 mm
    • Partial Response (PR): >= 30% decrease (vs baseline) in the sum of the dimensions of all target lesions
    • Progressive Disease (PD): new lesions, or >= 20% increase in the sum of target lesions (vs the smallest sum of target lesions, or nadir)
    • Stable Disease (SD): the sum of all target lesions does not qualify for CR, PR, or PD
  • Thus, it is often useful to draw lines denoting “Progressive Disease” and “Partial Response” on waterfall plots
    • Bars that fall between the PD and PR lines denote “Stable Disease”
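
The thresholds above can be sketched as a small classification function. This is a deliberately simplified illustration, not a RECIST implementation: classify_response() is a hypothetical helper, it looks only at percent change in the sum of target lesions, it assumes the baseline is also the nadir, and it ignores new lesions, lymph node size, and non-target lesions.

```r
# Hypothetical helper, not part of any package: maps percent change in the
# sum of target lesions to a simplified response category
classify_response <- function(pct_change) {
  ifelse(pct_change <= -100, "CR",      # all target lesions gone
  ifelse(pct_change <= -30,  "PR",      # at least a 30% decrease
  ifelse(pct_change >=  20,  "PD",      # at least a 20% increase
                             "SD")))    # everything in between
}

# Example values: +30%, +17.7%, -25%, -31%, and -100%
classify_response(c(30, 17.7, -25, -31, -100))
# returns "PD" "SD" "SD" "PR" "CR"
```

Note that a subject at -25% is classified as SD here, which matches subject 55 in the example data set.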
col <- ifelse(Merkel$BOR == "CR", 
              "green",
              ifelse(Merkel$BOR == "PR", 
                     "steelblue",
                     ifelse(Merkel$BOR == "PD", 
                            "red", 
                            ifelse(Merkel$BOR == "SD", 
                                   "cadetblue", ""))))

MCC<- barplot(Merkel$response, 
              col=col, 
              border=col, 
              space=0.5, 
              ylim=c(-50,50),
              main = "Waterfall plot for Target Lesion Tumor Size",
              ylab="Change from baseline (%)",
              cex.axis=1.5, 
              legend.text= c( "CR: Complete Response", "PR: Partial Response", "SD: Stable Disease", "PD: Progressive Disease"),
              args.legend=list(title="Best Overall Response", fill=c("green","steelblue",  "cadetblue", "red"), border=NA, cex=1.0))

# Use the abline() function
## The abline() function is a simple way to add lines in R
### It takes the arguments: abline(a=NULL, b=NULL, h=NULL, v=NULL, ...)
#### a, b : single values denoting the intercept and the slope of the line
#### h : the y-value(s) for horizontal line(s)
#### v : the x-value(s) for vertical line(s)

abline(h=20, col = "black", lwd=0.5) # The "PD" line
abline(h=-30, col = "black", lwd=0.5) # The "PR" line
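
For publications and presentations the finished plot usually needs to be written to a file. Below is a minimal sketch using the base R pdf() device, with the two threshold lines labeled. The output path, the stand-in response vector, the dashed line style, and the figure size are all assumptions made for illustration.

```r
# Hypothetical output path; replace with a real file name such as "waterfall.pdf"
out_file <- file.path(tempdir(), "waterfall.pdf")

# Hypothetical sorted percent changes standing in for Merkel$response
response <- c(30, 12, 4, -3, -9, -25, -31)

pdf(out_file, width = 8, height = 6)    # open a PDF device (size in inches)
barplot(response,
        col = "steelblue", border = "steelblue",
        space = 0.5, ylim = c(-50, 50),
        main = "Waterfall plot for Target Lesion Tumor Size",
        ylab = "Change from baseline (%)")
abline(h = 20,  lty = 2)                # dashed "PD" line
abline(h = -30, lty = 2)                # dashed "PR" line
# Label each line at the right edge of the plot, just above the line
text(x = par("usr")[2], y = c(20, -30),
     labels = c("PD: +20%", "PR: -30%"), adj = c(1, -0.3), cex = 0.8)
dev.off()                               # close the device so the file is written

file.exists(out_file)
```

Everything drawn between pdf() and dev.off() goes to the file instead of the screen, so the abline() and text() calls must come before dev.off().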

Take Home Points

  • High-quality data visualization of individual patient tumor responses to a study treatment can be displayed effectively for the cohort using waterfall plots.

As always, please reach out to us with thoughts and feedback

Session Info

sessionInfo()
## R version 4.0.0 (2020-04-24)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Mojave 10.14.6
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] knitr_1.28      forcats_0.5.0   stringr_1.4.0   dplyr_1.0.0    
##  [5] purrr_0.3.4     readr_1.3.1     tidyr_1.1.0     tibble_3.0.1   
##  [9] ggplot2_3.3.1   tidyverse_1.3.0
## 
## loaded via a namespace (and not attached):
##  [1] tidyselect_1.1.0 xfun_0.13        haven_2.2.0      lattice_0.20-41 
##  [5] colorspace_1.4-1 vctrs_0.3.1      generics_0.0.2   htmltools_0.4.0 
##  [9] yaml_2.2.1       rlang_0.4.6      pillar_1.4.4     glue_1.4.1      
## [13] withr_2.2.0      DBI_1.1.0        dbplyr_1.4.3     modelr_0.1.7    
## [17] readxl_1.3.1     lifecycle_0.2.0  munsell_0.5.0    blogdown_0.18   
## [21] gtable_0.3.0     cellranger_1.1.0 rvest_0.3.5      evaluate_0.14   
## [25] fansi_0.4.1      highr_0.8        broom_0.5.6      Rcpp_1.0.4.6    
## [29] scales_1.1.1     backports_1.1.7  jsonlite_1.6.1   fs_1.4.1        
## [33] hms_0.5.3        digest_0.6.25    stringi_1.4.6    bookdown_0.18   
## [37] grid_4.0.0       cli_2.0.2        tools_4.0.0      magrittr_1.5    
## [41] crayon_1.3.4     pkgconfig_2.0.3  ellipsis_0.3.1   xml2_1.3.2      
## [45] reprex_0.3.0     lubridate_1.7.8  assertthat_0.2.1 rmarkdown_2.1   
## [49] httr_1.4.1       rstudioapi_0.11  R6_2.4.1         nlme_3.1-147    
## [53] compiler_4.0.0
Sophia Shalhout
Cutaneous Oncology Research Fellow

My research interests include clinical and translational research in advanced skin cancers.
