Visualizing Tumor Response using Waterfall Charts with R

Key points

  • This monograph is a reference for creating waterfall plots in R, which are useful for depicting each patient's tumor response to treatment, for example among patients enrolled in an oncology clinical trial.
    • We provide an overview of how to create waterfall plots in R for publications and presentations.
    • We outline the steps for presenting patients on the plot in descending order, from the worst response to the best response, rather than in random order.
    • In oncology, a waterfall plot presents the response of each subject with the x-axis set at the baseline measurement; a vertical bar is drawn per subject above or below the horizontal axis, representing an increase or a decrease in tumor burden, respectively.
  • Skill Level: Intermediate
    • This post assumes that readers have some familiarity with basic R.
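
The ordering step described in the key points can be sketched in base R as follows. The data frame `trial` and its values are made up for illustration and are not part of the example data set used in this post.

```r
# Hypothetical example data: percent change from baseline for five patients
trial <- data.frame(
  id = 1:5,
  response = c(-12, 25, -40, 3, 8)
)

# Sort from the worst response (largest increase) to the best (largest decrease)
trial_sorted <- trial[order(trial$response, decreasing = TRUE), ]

# Patient ids in the order they would appear on the plot, left to right
trial_sorted$id
```

Sorting the whole data frame, rather than only the response column, keeps each patient's other variables (dose, disease type, and so on) aligned with the correct bar.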

Let’s load the packages we will use.

library(tidyverse)
library(dplyr)
library(knitr)

Merkel Cell Carcinoma: Example Oncology Clinical Trial Data Set

  • Let’s first create an “example” data set, for demonstration purposes, of subjects with locally advanced or distantly metastatic Merkel Cell Carcinoma (MCC) enrolled in a clinical trial that tests the effect of treatment A on tumor response at two different doses.

# We will first create a data set of patients with locally advanced/unresectable MCC or metastatic MCC.

# Waterfall plots are displayed in descending order, from the worst tumor response (relative to baseline) on the left to the best response on the right side of the plot
## Therefore, we will create data for 56 subjects and sort them in decreasing order: a positive value represents an increase in tumor size from baseline and a negative value represents a decrease in size.

# We will randomly assign the two doses, 80 mg or 150 mg, to the 56 subjects
Merkel <- data.frame(
  id=c(1:56), 
  type = sample((rep(c("laMCC", "metMCC"), times =28))), 
  response = c(30, sort(runif(n=53,min=-10,max=19), decreasing=TRUE),-25,-31), 
  dose= sample(rep(c(80, 150), 28)))

# Let's assign Best Overall Response (BOR)
Merkel$BOR= (c("PD", rep(c("SD"), times =54),"PR"))
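
The code above calls `sample()` and `runif()` without fixing a seed, so each run produces different values. A minimal sketch of a reproducible variant is shown here, assuming an arbitrary seed of 2020 (not a value from the original post); with this seed the numbers will differ from the table printed below.

```r
# Fix the random number generator so the simulated data are identical on every run.
# The seed value 2020 is an arbitrary assumption, not one used in the original post.
set.seed(2020)

Merkel <- data.frame(
  id = 1:56,
  type = sample(rep(c("laMCC", "metMCC"), times = 28)),
  response = c(30, sort(runif(n = 53, min = -10, max = 19), decreasing = TRUE), -25, -31),
  dose = sample(rep(c(80, 150), times = 28))
)
Merkel$BOR <- c("PD", rep("SD", times = 54), "PR")

# Quick sanity checks on the simulated data
nrow(Merkel)                        # number of subjects
table(Merkel$dose)                  # subjects per dose level
!is.unsorted(rev(Merkel$response))  # TRUE when responses are in descending order
```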

Let’s view the data set.

kable(Merkel)
id type response dose BOR
1 laMCC 30.0000000 150 PD
2 metMCC 18.4949522 80 SD
3 laMCC 18.1117082 150 SD
4 laMCC 17.6435155 150 SD
5 laMCC 17.3488051 150 SD
6 metMCC 16.8720296 150 SD
7 laMCC 16.6972961 80 SD
8 laMCC 16.2042303 150 SD
9 metMCC 15.9671342 150 SD
10 metMCC 15.5740679 150 SD
11 laMCC 13.7480282 80 SD
12 metMCC 12.5132693 80 SD
13 laMCC 12.4162928 80 SD
14 metMCC 12.1226071 80 SD
15 metMCC 11.7312235 150 SD
16 laMCC 10.5044538 80 SD
17 metMCC 10.2268747 150 SD
18 laMCC 8.2162247 80 SD
19 laMCC 7.4702930 150 SD
20 metMCC 7.1305159 80 SD
21 laMCC 7.1166470 150 SD
22 metMCC 6.4560227 80 SD
23 metMCC 6.4278444 150 SD
24 metMCC 5.4569569 150 SD
25 metMCC 5.2830644 80 SD
26 metMCC 5.1467644 80 SD
27 laMCC 5.0340250 80 SD
28 metMCC 4.7315164 80 SD
29 metMCC 4.1866879 150 SD
30 laMCC 3.7576881 80 SD
31 laMCC 3.4478367 150 SD
32 laMCC 3.1045804 150 SD
33 laMCC 3.0494558 80 SD
34 laMCC 2.9017910 80 SD
35 metMCC 2.4763337 150 SD
36 laMCC 2.1888214 150 SD
37 metMCC 1.2820381 80 SD
38 laMCC 0.3143123 80 SD
39 metMCC 0.2622020 150 SD
40 laMCC -0.1403696 150 SD
41 metMCC -1.1212082 80 SD
42 laMCC -1.4494472 150 SD
43 metMCC -1.4965893 80 SD
44 metMCC -2.3950758 80 SD
45 metMCC -2.8177287 150 SD
46 laMCC -3.0089303 150 SD
47 metMCC -3.0363226 80 SD
48 laMCC -3.3496043 150 SD
49 metMCC -4.3581363 80 SD
50 laMCC -4.4612100 80 SD
51 metMCC -6.6431801 80 SD
52 laMCC -7.3363308 80 SD
53 metMCC -8.2237557 150 SD
54 laMCC -9.7372845 150 SD
55 metMCC -25.0000000 80 SD
56 laMCC -31.0000000 150 PR

Let’s create the waterfall plot.

MCC<- barplot(Merkel$response, 
              col="blue", 
              border="blue", 
              space=0.5, ylim=c(-50,50), 
              main = "Waterfall plot for Target Lesion Tumor Size", 
              ylab="Change from baseline (%)",
              cex.axis=1.5, cex.lab=1.5)
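
`barplot()` returns the x-coordinates of the bar midpoints, which is why its result is assigned to `MCC` above. As a rough sketch, those midpoints can be used to place a label on each bar. The vector `resp` and the 4-unit label offset are made up for illustration.

```r
# Hypothetical responses (percent change from baseline), already sorted
resp <- c(30, 12, 4, -6, -31)

# barplot() returns the x-coordinates of the bar midpoints
mids <- barplot(resp,
                col = "blue",
                border = "blue",
                space = 0.5,
                ylim = c(-50, 50),
                ylab = "Change from baseline (%)")

# Print each value just beyond the end of its bar
text(x = mids,
     y = resp + ifelse(resp >= 0, 4, -4),
     labels = resp,
     cex = 0.8)
```

The same midpoints could carry patient identifiers or other per-subject annotations instead of the response values.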

We can now color the bars by dose and add a legend.

col <- ifelse(Merkel$dose == 80, 
              "steelblue", # if dose = 80 mg, then the color will be steel blue
              "cadetblue") # if dose != 80 mg (i.e. 150 mg here), then the color will be cadet blue

MCC<- barplot(Merkel$response, 
              col=col, 
              border=col, 
              space=0.5, 
              ylim=c(-50,50),
              main = "Waterfall plot for Target Lesion Tumor Size", 
              ylab="Change from baseline (%)",
              cex.axis=1.5, 
              legend.text= c( "80mg", "150mg"),
              args.legend=list(title="Treatment Dose", fill=c("steelblue", "cadetblue"), border=NA, cex=0.9))

Or by disease type…

col <- ifelse(Merkel$type == "laMCC", 
              "#BC5A42", # if the disease type is locally advanced MCC, the color will be #BC5A42 (deep red)
              "#009296") # otherwise (i.e. metastatic MCC here), the color will be #009296 (greenish-blue)

MCC<- barplot(Merkel$response, 
              col=col, 
              border=col, 
              space=0.5, 
              ylim=c(-50,50),
              main = "Waterfall plot for Target Lesion Tumor Size", 
              ylab="Change from baseline (%)",
              cex.axis=1.5, 
              legend.text= c( "locally advanced MCC", "Metastatic MCC"),
              args.legend=list(title="Disease", fill=c("#BC5A42", "#009296"), border=NA, cex=0.9))

Or by best overall response (BOR), with a legend

col <- ifelse(Merkel$BOR == "CR", 
              "green", # if a subject had a CR, the bar will be green; otherwise...
              ifelse(Merkel$BOR == "PR", 
                     "steelblue", # if a subject had a PR, the bar will be steel blue; otherwise...
                     ifelse(Merkel$BOR == "PD", 
                            "red", # if a subject had a PD, the bar will be red; otherwise...
                            ifelse(Merkel$BOR == "SD", 
                                   "cadetblue", # the subject must have had an SD, so the bar will be cadet blue
                                   "") # fallback value; not expected, because every subject must have a CR, PR, PD or SD
                            )))

              
MCC<- barplot(Merkel$response, 
              col=col, 
              border=col, 
              space=0.5, 
              ylim=c(-50,50),
              main = "Waterfall plot for Target Lesion Tumor Size", ylab="Change from baseline (%)",
              cex.axis=1.5, 
              legend.text= c( "CR: Complete Response", "PR: Partial Response", "SD: Stable Disease", "PD: Progressive Disease"),
              args.legend=list(title="Best Overall Response", fill=c("green","steelblue",  "cadetblue", "red"), border=NA, cex=0.9))
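
As an alternative to nested `ifelse()` calls, the category-to-color mapping can be written as a named vector and indexed by category. This is a sketch of that idea, not the approach used in the post; the names `bor_colors`, `bor`, and `bar_col` are hypothetical.

```r
# Named vector mapping each response category to its bar color
bor_colors <- c(CR = "green", PR = "steelblue", SD = "cadetblue", PD = "red")

# Hypothetical best overall responses for five subjects
bor <- c("PD", "SD", "SD", "PR", "CR")

# Index the named vector by category; unname() drops the category labels
bar_col <- unname(bor_colors[bor])
bar_col
```

The same vector can supply the legend colors, so the bars and the legend cannot drift out of sync. A category that is missing from the vector yields `NA` rather than an empty string.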

Let’s add horizontal lines at 20% and -30%

  • The majority of clinical trials use the Response Evaluation Criteria in Solid Tumors (RECIST) to assess tumor responses to a therapeutic intervention.
  • Per RECIST, tumor responses are adjudicated based on the following observations:
    • Complete Response (CR): disappearance of all target lesions (sum of all target lesions = 0)
      • Of note, any pathological lymph nodes (whether target or non-target) must have a reduction in size to < 10 mm
    • Partial Response (PR): >= 30% decrease (vs baseline) in the sum of the dimensions of all target lesions
    • Progressive Disease (PD): new lesions or >= 20% increase (vs the smallest sum of target lesions, or nadir)
    • Stable Disease (SD): the sum of all target lesions does not qualify for CR, PR, or PD
  • Thus, it is often useful to have lines to denote “Progressive Disease” and “Partial Response” on waterfall plots
    • Bars between the PD and PR lines denote “Stable Disease”
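
The thresholds above can be sketched as a small classification function. This is a deliberate simplification: it looks only at the percent change in the sum of target lesions, measures PD from baseline rather than from the nadir, and ignores new lesions and lymph node rules. The function name `classify_response` is hypothetical.

```r
# Simplified sketch: classify by percent change in the sum of target lesions only.
# Assumption: PD is judged against baseline here, whereas RECIST uses the nadir.
classify_response <- function(pct_change) {
  ifelse(pct_change <= -100, "CR",
         ifelse(pct_change <= -30, "PR",
                ifelse(pct_change >= 20, "PD", "SD")))
}

classify_response(c(30, 18.5, -9.7, -25, -31, -100))
```

Under these assumptions, a subject with a 25% decrease is still classified as SD, which matches subject 55 in the example data set.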
col <- ifelse(Merkel$BOR == "CR", 
              "green",
              ifelse(Merkel$BOR == "PR", 
                     "steelblue",
                     ifelse(Merkel$BOR == "PD", 
                            "red", 
                            ifelse(Merkel$BOR == "SD", 
                                   "cadetblue", ""))))

MCC<- barplot(Merkel$response, 
              col=col, 
              border=col, 
              space=0.5, 
              ylim=c(-50,50),
              main = "Waterfall plot for Target Lesion Tumor Size",
              ylab="Change from baseline (%)",
              cex.axis=1.5, 
              legend.text= c( "CR: Complete Response", "PR: Partial Response", "SD: Stable Disease", "PD: Progressive Disease"),
              args.legend=list(title="Best Overall Response", fill=c("green","steelblue",  "cadetblue", "red"), border=NA, cex=1.0))

# Use the abline() function
## The abline() function is a simple way to add lines in R
### It takes the arguments: abline(a=NULL, b=NULL, h=NULL, v=NULL, ...)
#### a, b : single values denoting the intercept and the slope of the line
#### h : the y-value(s) for horizontal line(s)
#### v : the x-value(s) for vertical line(s)

abline(h=20, col = "black", lwd=0.5) # The "PD" line
abline(h=-30, col = "black", lwd=0.5) # The "PR" line
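
For publications and presentations, the finished figure usually needs to be written to a file. Here is a minimal sketch using the base `pdf()` device, with the reference lines labeled via `text()`. The vector `resp`, the temporary output file, the figure size, and the label wording are all assumptions made for illustration.

```r
# Hypothetical sorted responses (percent change from baseline)
resp <- c(30, 15, 6, -4, -18, -31)

# Write the figure to a PDF; a temporary file is used here only for illustration
out_file <- tempfile(fileext = ".pdf")
pdf(out_file, width = 7, height = 5)

barplot(resp, col = "cadetblue", border = "cadetblue", space = 0.5,
        ylim = c(-50, 50), ylab = "Change from baseline (%)")
abline(h = 20, lty = 2)   # the "PD" line
abline(h = -30, lty = 2)  # the "PR" line

# Label each reference line just above it, right-aligned at the plot edge
text(x = par("usr")[2], y = c(20, -30),
     labels = c("PD (+20%)", "PR (-30%)"),
     adj = c(1, -0.4), cex = 0.8)

# Close the device so the file is written to disk
dev.off()
```

In practice, `out_file` would be replaced by a path in the project directory.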

Take Home Points

  • High-quality data visualization of individual patient tumor responses to a study treatment can be displayed effectively for the cohort using waterfall plots.

As always, please reach out to us with thoughts and feedback.

Session Info

sessionInfo()
## R version 4.0.0 (2020-04-24)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Mojave 10.14.6
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] knitr_1.28      forcats_0.5.0   stringr_1.4.0   dplyr_0.8.5    
##  [5] purrr_0.3.4     readr_1.3.1     tidyr_1.0.3     tibble_3.0.1   
##  [9] ggplot2_3.3.0   tidyverse_1.3.0
## 
## loaded via a namespace (and not attached):
##  [1] tidyselect_1.0.0 xfun_0.13        haven_2.2.0      lattice_0.20-41 
##  [5] colorspace_1.4-1 vctrs_0.2.4      generics_0.0.2   htmltools_0.4.0 
##  [9] yaml_2.2.1       rlang_0.4.6      pillar_1.4.4     glue_1.4.0      
## [13] withr_2.2.0      DBI_1.1.0        dbplyr_1.4.3     modelr_0.1.7    
## [17] readxl_1.3.1     lifecycle_0.2.0  munsell_0.5.0    blogdown_0.18   
## [21] gtable_0.3.0     cellranger_1.1.0 rvest_0.3.5      evaluate_0.14   
## [25] fansi_0.4.1      highr_0.8        broom_0.5.6      Rcpp_1.0.4.6    
## [29] scales_1.1.0     backports_1.1.6  jsonlite_1.6.1   fs_1.4.1        
## [33] hms_0.5.3        digest_0.6.25    stringi_1.4.6    bookdown_0.18   
## [37] grid_4.0.0       cli_2.0.2        tools_4.0.0      magrittr_1.5    
## [41] crayon_1.3.4     pkgconfig_2.0.3  ellipsis_0.3.0   xml2_1.3.2      
## [45] reprex_0.3.0     lubridate_1.7.8  assertthat_0.2.1 rmarkdown_2.1   
## [49] httr_1.4.1       rstudioapi_0.11  R6_2.4.1         nlme_3.1-147    
## [53] compiler_4.0.0
Sophia Shalhout
Cutaneous Oncology Research Fellow

My research interests include clinical and translational research in advanced skin cancers.
