Visualizing Tumor Response using Waterfall Charts with R

Key points

  • This monograph is a reference for creating waterfall plots in R, which are useful for depicting each patient's tumor response to treatment, for example in an oncology clinical trial.
    • We provide an overview of how to create waterfall plots in R for publications and presentations.
    • We outline the steps for presenting patients in descending order, from the worst response to the best response, rather than in random order, so that the waterfall plot reads correctly.
    • In oncology, a waterfall plot presents the response of each subject with the horizontal axis set at the baseline measurement. One vertical bar per subject extends above the axis for an increase in tumor burden or below it for a decrease in tumor burden.
  • Skill Level: Intermediate
    • This post assumes that readers have some familiarity with basic R.

Let’s load the packages we will use.

library(tidyverse)
library(dplyr)
library(knitr)

Merkel Cell Carcinoma: Example Oncology Clinical Trial Data Set

  • Let’s first create an “example” data set for demonstrative purposes for subjects with locally advanced or distantly metastatic Merkel Cell Carcinoma (MCC) enrolled on a clinical trial that aims to test the effects of treatment A on tumor response at two different doses.
# We will first create a data set that specifies patients with locally advanced/unresectable MCC and metastatic disease.

# Waterfall plots are displayed in descending order from the worst tumor response from baseline value on the left to the best value on the right side of the plot
## Therefore, we will create data for 56 subjects and sort them in decreasing order, since a positive value here represents an increase in tumor size from baseline and a negative value represents a decrease in size.

# We will randomly assign the two doses, 80 mg or 150 mg, to the 56 subjects
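# sample() and runif() are random, so the values differ between runs unless a seed is set with set.seed()
Merkel <- data.frame(
  id = 1:56,
  # 28 subjects of each disease type, shuffled
  type = sample(rep(c("laMCC", "metMCC"), times = 28)),
  # One fixed value of 30, then 53 random values sorted from largest to smallest, then -25 and -31
  response = c(30, sort(runif(n = 53, min = -10, max = 19), decreasing = TRUE), -25, -31),
  # 28 subjects at each dose, shuffled
  dose = sample(rep(c(80, 150), times = 28))
)

# Let's assign Best Overall Response (BOR): subject 1 has PD, subjects 2 to 55 have SD, and subject 56 has PR
Merkel$BOR <- c("PD", rep("SD", times = 54), "PR")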

Let’s view the data set

kable(Merkel)
id type response dose BOR
1 laMCC 30.0000000 150 PD
2 laMCC 17.5671225 80 SD
3 metMCC 16.1331415 150 SD
4 metMCC 15.9186232 150 SD
5 metMCC 13.9119951 150 SD
6 metMCC 13.7867230 150 SD
7 metMCC 13.5341242 150 SD
8 laMCC 13.5172076 150 SD
9 laMCC 13.0014255 150 SD
10 laMCC 12.4950237 80 SD
11 metMCC 12.3037785 150 SD
12 metMCC 11.8891326 150 SD
13 laMCC 11.3015469 150 SD
14 metMCC 10.6927948 80 SD
15 laMCC 10.6829551 80 SD
16 metMCC 10.6190570 80 SD
17 metMCC 9.5838351 150 SD
18 metMCC 9.0666457 150 SD
19 metMCC 8.6616828 80 SD
20 laMCC 8.5517749 80 SD
21 laMCC 7.6797107 150 SD
22 metMCC 7.6509149 80 SD
23 metMCC 7.6301331 80 SD
24 laMCC 6.7955936 80 SD
25 laMCC 6.7245764 80 SD
26 laMCC 6.0454854 150 SD
27 metMCC 5.8234628 150 SD
28 metMCC 4.0487224 80 SD
29 laMCC 3.6643637 150 SD
30 laMCC 3.6567846 150 SD
31 metMCC 3.4124079 80 SD
32 laMCC 2.9359065 80 SD
33 metMCC 2.6350576 80 SD
34 metMCC 2.5579252 80 SD
35 laMCC 2.2091889 80 SD
36 laMCC 2.1006811 150 SD
37 laMCC 2.0541793 80 SD
38 metMCC 1.1781217 80 SD
39 laMCC 1.0067931 150 SD
40 laMCC 0.7738708 150 SD
41 metMCC -0.1876224 80 SD
42 metMCC -0.5236366 80 SD
43 laMCC -1.1229957 80 SD
44 metMCC -3.2156236 150 SD
45 metMCC -4.0398077 150 SD
46 metMCC -4.1842316 150 SD
47 laMCC -5.6627451 80 SD
48 laMCC -5.7195241 150 SD
49 laMCC -5.9504499 80 SD
50 laMCC -6.1187903 150 SD
51 laMCC -6.9270395 80 SD
52 laMCC -8.1002454 80 SD
53 metMCC -8.8201170 80 SD
54 metMCC -9.6850823 80 SD
55 metMCC -25.0000000 150 SD
56 laMCC -31.0000000 150 PR
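
The example data above are built already sorted. Real trial data usually are not, so the ordering step may need to be done explicitly. Below is a minimal sketch in base R, assuming a hypothetical unsorted data frame named `trial` (not part of the original example) with `id` and `response` columns.

```r
# Hypothetical unsorted data: percent change from baseline for five subjects
trial <- data.frame(
  id       = 1:5,
  response = c(-12, 25, 3, -40, 8)
)

# order() with decreasing = TRUE puts the largest increase (worst response) first
trial_sorted <- trial[order(trial$response, decreasing = TRUE), ]

trial_sorted$response
```

Passing `trial_sorted$response` to `barplot()` would then draw the bars from worst to best response, left to right.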

Let’s create the waterfall plot.

MCC<- barplot(Merkel$response, 
              col="blue", 
              border="blue", 
              space=0.5, ylim=c(-50,50), 
              main = "Waterfall plot for Target Lesion Tumor Size", 
              ylab="Change from baseline (%)",
              cex.axis=1.5, cex.lab=1.5)
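
For publications and presentations, the plot usually needs to be written to a file. One possible approach, sketched here with base R's `pdf()` graphics device, is to open the device, draw the plot, and close the device. The `response` vector and the temporary file path are illustrative assumptions standing in for `Merkel$response` and a real output path.

```r
# Hypothetical values standing in for Merkel$response
response <- sort(c(30, 12, 5, -4, -18, -31), decreasing = TRUE)

# Write to a temporary PDF; replace out_file with a real path as needed
out_file <- tempfile(fileext = ".pdf")
pdf(out_file, width = 8, height = 5)  # width and height are in inches

barplot(response,
        col = "blue", border = "blue", space = 0.5, ylim = c(-50, 50),
        main = "Waterfall plot for Target Lesion Tumor Size",
        ylab = "Change from baseline (%)")

# Close the device so the file is actually written to disk
invisible(dev.off())
```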

We can now add color by Dose and a legend

col <- ifelse(Merkel$dose == 80, 
              "steelblue", # if dose = 80 mg, then the color will be steel blue
              "cadetblue") # if dose != 80 mg (i.e. 150 mg here), then the color will be cadet blue

MCC<- barplot(Merkel$response, 
              col=col, 
              border=col, 
              space=0.5, 
              ylim=c(-50,50),
              main = "Waterfall plot for Target Lesion Tumor Size", 
              ylab="Change from baseline (%)",
              cex.axis=1.5, 
              legend.text= c( "80mg", "150mg"),
              args.legend=list(title="Treatment Dose", fill=c("steelblue", "cadetblue"), border=NA, cex=0.9))

Or by disease type…

col <- ifelse(Merkel$type == "laMCC", 
              "#BC5A42", # if the disease type is locally advanced MCC (laMCC), the color will be #BC5A42 (deep red)
              "#009296") # otherwise (i.e. metastatic MCC, metMCC), the color will be #009296 (greenish-blue)

MCC<- barplot(Merkel$response, 
              col=col, 
              border=col, 
              space=0.5, 
              ylim=c(-50,50),
              main = "Waterfall plot for Target Lesion Tumor Size", 
              ylab="Change from baseline (%)",
              cex.axis=1.5, 
              legend.text= c( "Locally advanced MCC", "Metastatic MCC"),
              args.legend=list(title="Disease", fill=c("#BC5A42", "#009296"), border=NA, cex=0.9))

Or by Best overall response and provide a legend

col <- ifelse(Merkel$BOR == "CR", 
              "green", # if a subject had a CR, the bar will be green; otherwise...
              ifelse(Merkel$BOR == "PR", 
                     "steelblue", # if a subject had a PR, the bar will be steel blue; otherwise...
                     ifelse(Merkel$BOR == "PD", 
                            "red", # if a subject had PD, the bar will be red; otherwise...
                            ifelse(Merkel$BOR == "SD", 
                                   "cadetblue", # the subject must have had SD, so the bar will be cadet blue
                                   "") # fallback value, never reached here because every subject has a CR, PR, PD, or SD
                            )))

              
MCC<- barplot(Merkel$response, 
              col=col, 
              border=col, 
              space=0.5, 
              ylim=c(-50,50),
              main = "Waterfall plot for Target Lesion Tumor Size",
              ylab="Change from baseline (%)",
              cex.axis=1.5, 
              legend.text= c( "CR: Complete Response", "PR: Partial Response", "SD: Stable Disease", "PD: Progressive Disease"),
              args.legend=list(title="Best Overall Response", fill=c("green","steelblue",  "cadetblue", "red"), border=NA, cex=0.9))
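
The nested `ifelse()` calls above can be hard to read as categories are added. One alternative, sketched below, is a named vector used as a lookup table. The names `bor_colors` and `bor` are illustrative assumptions; `bor` stands in for `Merkel$BOR`.

```r
# Named vector mapping each BOR category to a bar color (same colors as above)
bor_colors <- c(CR = "green", PR = "steelblue", SD = "cadetblue", PD = "red")

# Hypothetical BOR values standing in for Merkel$BOR
bor <- c("PD", "SD", "SD", "PR")

# Index the lookup by name; unname() drops the category names from the result
col <- unname(bor_colors[bor])

col
```

A category that is missing from the lookup yields `NA` rather than a color, so it is worth checking `anyNA(col)` before plotting. The same vector could also supply the legend's `fill` colors, which keeps the bars and the legend in agreement.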

Let’s add horizontal lines at 20% and -30%

  • The majority of clinical trials use the Response Evaluation Criteria in Solid Tumors (RECIST) to assess tumor responses to a therapeutic intervention.
  • Per RECIST, tumor responses are adjudicated based on the following observations:
    • Complete Response (CR): disappearance of all target lesions (sum of all target lesions = 0)
      • Of note, any pathological lymph nodes (whether target or non-target) must have a reduction in size to < 10 mm
    • Partial Response (PR): >= 30% decrease (vs baseline) in the sum of all target lesion dimensions
    • Progressive Disease (PD): new lesions or >= 20% increase in the sum of target lesions (vs the smallest sum of target lesions, or nadir)
    • Stable Disease (SD): the sum of all target lesions does not qualify for CR, PR, or PD
  • Thus, it is often useful to draw lines denoting “Progressive Disease” and “Partial Response” on waterfall plots.
    • Bars that fall between the PD and PR lines denote “Stable Disease”.
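
The percentage thresholds listed above can be sketched as a small classifier. This is a deliberately simplified illustration and not a RECIST implementation: it uses only percent change from baseline, so it ignores new lesions, the lymph node criterion, and the nadir reference for PD. The function name `classify_change` is a hypothetical helper introduced here.

```r
# Simplified, illustrative classifier based only on percent change from baseline.
# Assumption: a change of -100% stands in for "sum of all target lesions = 0".
classify_change <- function(pct_change) {
  out <- rep("SD", length(pct_change))   # default: stable disease
  out[pct_change >= 20]   <- "PD"        # >= 20% increase
  out[pct_change <= -30]  <- "PR"        # >= 30% decrease
  out[pct_change <= -100] <- "CR"        # target lesions gone
  out
}

classify_change(c(30, 17.6, -25, -31, -100))
```

Applied to the example data, `classify_change(Merkel$response)` would label subject 1 as PD, subject 56 as PR, and the rest as SD, matching the hand-assigned `BOR` column.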
col <- ifelse(Merkel$BOR == "CR", 
              "green",
              ifelse(Merkel$BOR == "PR", 
                     "steelblue",
                     ifelse(Merkel$BOR == "PD", 
                            "red", 
                            ifelse(Merkel$BOR == "SD", 
                                   "cadetblue", ""))))

MCC<- barplot(Merkel$response, 
              col=col, 
              border=col, 
              space=0.5, 
              ylim=c(-50,50),
              main = "Waterfall plot for Target Lesion Tumor Size",
              ylab="Change from baseline (%)",
              cex.axis=1.5, 
              legend.text= c( "CR: Complete Response", "PR: Partial Response", "SD: Stable Disease", "PD: Progressive Disease"),
              args.legend=list(title="Best Overall Response", fill=c("green","steelblue",  "cadetblue", "red"), border=NA, cex=1.0))

# Use the abline() function
## The abline() function is a simple way to add lines in R
### It takes the arguments: abline(a=NULL, b=NULL, h=NULL, v=NULL, ...)
#### a, b : single values denoting the intercept and the slope of the line
#### h : the y-value(s) for horizontal line(s)
#### v : the x-value(s) for vertical line(s)

abline(h=20, col = "black", lwd=0.5) # The "PD" line
abline(h=-30, col = "black", lwd=0.5) # The "PR" line
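
The code above stores the return value of `barplot()` in `MCC`. That value holds the x-coordinates of the bar midpoints, which can be used to position annotations. The sketch below labels the two threshold lines and marks the bars that cross them. The `response` values, the object name `mids`, and the asterisk markers are illustrative assumptions, and the plot is drawn to a temporary PDF so that the example also runs in a non-interactive session.

```r
# Hypothetical values standing in for Merkel$response
response <- c(30, 12, 5, -4, -18, -31)

pdf(tempfile(fileext = ".pdf"))
mids <- barplot(response, space = 0.5, ylim = c(-50, 50),
                ylab = "Change from baseline (%)")

# Dashed reference lines at the PD (+20%) and PR (-30%) thresholds
abline(h = c(20, -30), lty = 2)

# Label each line just above it, left-aligned at the left edge of the plot region
text(x = par("usr")[1], y = c(20, -30),
     labels = c("PD (+20%)", "PR (-30%)"),
     adj = c(0, -0.5), cex = 0.8)

# Use the bar midpoints to place an asterisk beyond the end of each bar
# that reaches a threshold
flag <- response >= 20 | response <= -30
text(x = mids[flag], y = response[flag] + sign(response[flag]) * 4, labels = "*")

invisible(dev.off())
```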

Take Home Points

  • High-quality data visualization of individual patient tumor responses to a study treatment can be displayed effectively for the cohort using waterfall plots.

As always, please reach out to us with thoughts and feedback

Session Info

sessionInfo()
## R version 4.0.0 (2020-04-24)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Mojave 10.14.6
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] knitr_1.28      forcats_0.5.0   stringr_1.4.0   dplyr_1.0.0    
##  [5] purrr_0.3.4     readr_1.3.1     tidyr_1.1.0     tibble_3.0.1   
##  [9] ggplot2_3.3.1   tidyverse_1.3.0
## 
## loaded via a namespace (and not attached):
##  [1] tidyselect_1.1.0 xfun_0.13        haven_2.2.0      lattice_0.20-41 
##  [5] colorspace_1.4-1 vctrs_0.3.1      generics_0.0.2   htmltools_0.4.0 
##  [9] yaml_2.2.1       rlang_0.4.6      pillar_1.4.4     glue_1.4.1      
## [13] withr_2.2.0      DBI_1.1.0        dbplyr_1.4.3     modelr_0.1.7    
## [17] readxl_1.3.1     lifecycle_0.2.0  munsell_0.5.0    blogdown_0.18   
## [21] gtable_0.3.0     cellranger_1.1.0 rvest_0.3.5      evaluate_0.14   
## [25] fansi_0.4.1      highr_0.8        broom_0.5.6      Rcpp_1.0.4.6    
## [29] scales_1.1.1     backports_1.1.7  jsonlite_1.6.1   fs_1.4.1        
## [33] hms_0.5.3        digest_0.6.25    stringi_1.4.6    bookdown_0.18   
## [37] grid_4.0.0       cli_2.0.2        tools_4.0.0      magrittr_1.5    
## [41] crayon_1.3.4     pkgconfig_2.0.3  ellipsis_0.3.1   xml2_1.3.2      
## [45] reprex_0.3.0     lubridate_1.7.8  assertthat_0.2.1 rmarkdown_2.1   
## [49] httr_1.4.1       rstudioapi_0.11  R6_2.4.1         nlme_3.1-147    
## [53] compiler_4.0.0
Sophia Shalhout
Cutaneous Oncology Research Fellow

My research interests include clinical and translational research in advanced skin cancers.
