Visualizing Tumor Response using Waterfall Charts with R

Key points

  • This monograph provides a reference resource for creating waterfall plots in R, which are useful for depicting each patient's tumor response to treatment, for example among patients enrolled in an oncology clinical trial.
    • We provide an overview of how to create waterfall plots in R for publications and presentations.
    • We outline the steps for presenting the patients on the plot in descending order, from the worst response to the best response, rather than in random order.
    • In oncology, a waterfall plot presents the response of each subject, with the x-axis set at the baseline measurement and one vertical bar per subject drawn above or below the horizontal axis to represent an increase or a decrease in tumor burden, respectively.
  • Skill Level: Intermediate
    • This post assumes that readers have some familiarity with basic R.

Let’s load the packages we will use.

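library(tidyverse) # attaches dplyr, ggplot2, and the other core tidyverse packages
library(knitr)     # provides kable() for printing the data set as a table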

Merkel Cell Carcinoma: Example Oncology Clinical Trial Data Set

  • Let’s first create an “example” data set, for demonstration purposes, of subjects with locally advanced or distantly metastatic Merkel Cell Carcinoma (MCC) enrolled in a clinical trial that tests the effect of treatment A on tumor response at two different doses.
# We will first create a data set of patients with locally advanced/unresectable MCC (laMCC) or metastatic MCC (metMCC).

# Waterfall plots are displayed in descending order, from the worst tumor response relative to baseline on the left to the best on the right side of the plot.
## Therefore, we will create data for 56 subjects and sort them in decreasing order, since a positive value here represents an increase in tumor size from baseline and a negative value represents a decrease.

# We will randomly assign the two doses, 80 mg or 150 mg, to the 56 subjects
# Note: no seed is set, so sample() and runif() return different values on each run; call set.seed() first for reproducible output.
Merkel <- data.frame(
  id = 1:56,
  type = sample(rep(c("laMCC", "metMCC"), times = 28)),
  response = c(30, sort(runif(n = 53, min = -10, max = 19), decreasing = TRUE), -25, -31),
  dose = sample(rep(c(80, 150), times = 28)))

# Let's assign Best Overall Response (BOR): the first subject has PD, the last has PR, and the 54 in between have SD
Merkel$BOR <- c("PD", rep("SD", times = 54), "PR")
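
The simulated responses above are created already sorted. With real trial data, the percent change usually has to be computed from raw measurements and then ordered before plotting. The following is a minimal sketch of that step, assuming a hypothetical data frame `raw` with illustrative columns `baseline_mm` and `best_mm` (sum of target lesion diameters at baseline and at the best post-baseline assessment); these names and values are not part of the example data set.

```r
# Hypothetical raw measurements (illustrative values only)
raw <- data.frame(
  id          = 1:5,
  baseline_mm = c(50, 40, 80, 60, 100),
  best_mm     = c(65, 38, 48, 60, 20)
)

# Percent change from baseline: positive = growth, negative = shrinkage
raw$response <- 100 * (raw$best_mm - raw$baseline_mm) / raw$baseline_mm

# Sort from worst (largest increase) to best (largest decrease)
raw_sorted <- raw[order(raw$response, decreasing = TRUE), ]
raw_sorted$response
```

Passing `raw_sorted$response` to `barplot()` would then draw the bars from worst to best response, left to right.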

Let’s view the data set

kable(Merkel)
id type response dose BOR
1 laMCC 30.0000000 150 PD
2 metMCC 18.4702450 150 SD
3 laMCC 18.1390020 150 SD
4 metMCC 17.6328437 80 SD
5 metMCC 17.4456027 80 SD
6 laMCC 17.4441127 150 SD
7 laMCC 17.4011409 80 SD
8 laMCC 17.2975253 150 SD
9 laMCC 17.1518897 80 SD
10 laMCC 16.9891219 80 SD
11 laMCC 16.8570823 80 SD
12 metMCC 15.5276250 150 SD
13 metMCC 15.1340933 150 SD
14 metMCC 13.2778613 80 SD
15 laMCC 13.1581340 80 SD
16 metMCC 12.7728309 150 SD
17 laMCC 12.4202991 150 SD
18 laMCC 12.0675820 80 SD
19 laMCC 11.5727669 150 SD
20 laMCC 11.1376327 150 SD
21 laMCC 10.9558932 150 SD
22 laMCC 10.9473952 150 SD
23 laMCC 10.4559432 150 SD
24 metMCC 9.9577314 80 SD
25 metMCC 9.7534430 150 SD
26 laMCC 9.5086571 150 SD
27 laMCC 8.4196814 80 SD
28 laMCC 5.7579361 80 SD
29 laMCC 5.7349792 150 SD
30 laMCC 4.2068320 80 SD
31 laMCC 4.0614000 150 SD
32 metMCC 4.0419571 150 SD
33 metMCC 2.3991257 80 SD
34 laMCC 0.9919350 80 SD
35 laMCC 0.5858858 150 SD
36 laMCC 0.3775504 80 SD
37 metMCC -0.5185630 80 SD
38 metMCC -0.6167274 80 SD
39 metMCC -0.6737597 80 SD
40 laMCC -1.1156273 150 SD
41 metMCC -2.4151064 80 SD
42 metMCC -3.2322551 150 SD
43 metMCC -3.3595556 150 SD
44 metMCC -3.5507582 150 SD
45 metMCC -3.6740398 80 SD
46 metMCC -3.6754359 80 SD
47 metMCC -3.6849109 150 SD
48 metMCC -3.7168413 80 SD
49 metMCC -4.5279277 80 SD
50 laMCC -6.3070991 80 SD
51 metMCC -6.3390935 80 SD
52 laMCC -7.3252436 150 SD
53 metMCC -9.0093121 150 SD
54 metMCC -9.0156899 80 SD
55 metMCC -25.0000000 150 SD
56 metMCC -31.0000000 80 PR

Let’s create the waterfall plot.

MCC<- barplot(Merkel$response, 
              col="blue", 
              border="blue", 
              space=0.5, ylim=c(-50,50), 
              main = "Waterfall plot for Target Lesion Tumor Size", 
              ylab="Change from baseline (%)",
              cex.axis=1.5, cex.lab=1.5)
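
For publications and presentations, the plot usually needs to be written to an image file. One possible approach, sketched here with base R graphics devices, is to open a `png()` device, draw the plot, and close the device with `dev.off()`. The output path, dimensions, and the short `response` vector standing in for `Merkel$response` are assumptions for illustration.

```r
# Small stand-in for Merkel$response, already sorted from worst to best
response <- c(30, 12, 4, -3, -9, -31)

# Hypothetical output path; replace with a real file name for a manuscript
out_file <- file.path(tempdir(), "waterfall.png")

# Open a PNG device (7 x 5 inches at 300 dpi), draw the plot, then close it
png(out_file, width = 7, height = 5, units = "in", res = 300)
barplot(response,
        col = "blue", border = "blue", space = 0.5, ylim = c(-50, 50),
        main = "Waterfall plot for Target Lesion Tumor Size",
        ylab = "Change from baseline (%)")
dev.off()

file.exists(out_file)
```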

We can now add color by Dose and a legend

col <- ifelse(Merkel$dose == 80, 
              "steelblue", # if dose = 80 mg, then the color will be steel blue
              "cadetblue") # if dose != 80 mg (i.e. 150 mg here), then the color will be cadet blue

MCC<- barplot(Merkel$response, 
              col=col, 
              border=col, 
              space=0.5, 
              ylim=c(-50,50),
              main = "Waterfall plot for Target Lesion Tumor Size", 
              ylab="Change from baseline (%)",
              cex.axis=1.5, 
              legend.text= c( "80mg", "150mg"),
              args.legend=list(title="Treatment Dose", fill=c("steelblue", "cadetblue"), border=NA, cex=0.9))

Or by disease type…

col <- ifelse(Merkel$type == "laMCC", 
              "#BC5A42", # if type of disease = locally advanced MCC, then the color will be #BC5A42 (deep red)
              "#009296") # if type of disease != locally advanced MCC (i.e. metMCC), then the color will be #009296 (greenish-blue)

MCC<- barplot(Merkel$response, 
              col=col, 
              border=col, 
              space=0.5, 
              ylim=c(-50,50),
              main = "Waterfall plot for Target Lesion Tumor Size", 
              ylab="Change from baseline (%)",
              cex.axis=1.5, 
              legend.text= c( "locally advanced MCC", "Metastatic MCC"),
              args.legend=list(title="Disease", fill=c("#BC5A42", "#009296"), border=NA, cex=0.9))
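
Since the tidyverse is already loaded, the same kind of plot can also be drawn with ggplot2. This is only a sketch of one alternative to the base `barplot()` calls used in this post; the small data frame `df` is a stand-in with illustrative values, and the bar order is fixed by turning `id` into a factor whose levels follow the sorted responses.

```r
library(ggplot2)

# Small stand-in for the Merkel data frame (values are illustrative only)
df <- data.frame(
  id       = 1:6,
  type     = c("laMCC", "metMCC", "laMCC", "metMCC", "laMCC", "metMCC"),
  response = c(30, 12, 4, -3, -9, -31)
)

# Fix the bar order: worst response on the left, best on the right
df$id <- factor(df$id, levels = df$id[order(df$response, decreasing = TRUE)])

p <- ggplot(df, aes(x = id, y = response, fill = type)) +
  geom_col(width = 0.7) +
  scale_fill_manual(name   = "Disease",
                    breaks = c("laMCC", "metMCC"),
                    labels = c("Locally advanced MCC", "Metastatic MCC"),
                    values = c(laMCC = "#BC5A42", metMCC = "#009296")) +
  coord_cartesian(ylim = c(-50, 50)) +
  labs(title = "Waterfall plot for Target Lesion Tumor Size",
       x = NULL, y = "Change from baseline (%)") +
  theme_classic() +
  theme(axis.text.x = element_blank(), axis.ticks.x = element_blank())

p
```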

Or by best overall response, again with a legend

col <- ifelse(Merkel$BOR == "CR", 
              "green", # if a subject had a CR, the bar will be green; otherwise...
              ifelse(Merkel$BOR == "PR", 
                     "steelblue", # if a subject had a PR, the bar will be steel blue; otherwise...
                     ifelse(Merkel$BOR == "PD", 
                            "red", # if a subject had a PD, the bar will be red; otherwise...
                            ifelse(Merkel$BOR == "SD", 
                                   "cadetblue", # if a subject had a SD, the bar will be cadet blue; otherwise...
                                   NA) # no fill (should not occur, because every subject must have a CR, PR, PD or SD); NA is used because "" is not a valid color name
                            )))

              
MCC<- barplot(Merkel$response, 
              col=col, 
              border=col, 
              space=0.5, 
              ylim=c(-50,50),
              main = "Waterfall plot for Target Lesion Tumor Size",
              ylab="Change from baseline (%)",
              cex.axis=1.5, 
              legend.text= c( "CR: Complete Response", "PR: Partial Response", "SD: Stable Disease", "PD: Progressive Disease"),
              args.legend=list(title="Best Overall Response", fill=c("green","steelblue",  "cadetblue", "red"), border=NA, cex=0.9))
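
With four categories, the nested `ifelse()` calls become hard to read. A named character vector used as a lookup table is one possible alternative; the sketch below uses a short stand-in vector `bor` in place of `Merkel$BOR`, and the name `bor_colors` is introduced here only for illustration.

```r
# Small stand-in for Merkel$BOR (values are illustrative only)
bor <- c("PD", "SD", "SD", "CR", "PR")

# One named vector holds the whole category-to-color mapping
bor_colors <- c(CR = "green", PR = "steelblue", SD = "cadetblue", PD = "red")

# Index by name; unname() drops the names so only the colors remain.
# Any category missing from bor_colors would come back as NA.
col <- unname(bor_colors[bor])
col
```

The same vector can feed the legend (for example `fill = bor_colors`), so bar colors and legend colors cannot drift apart.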

Let’s add horizontal lines at 20% and -30%

  • The majority of clinical trials use the Response Evaluation Criteria in Solid Tumors (RECIST) to assess tumor responses to a therapeutic intervention.
  • Per RECIST, tumor responses are adjudicated based on the following observations:
    • Complete Response (CR): disappearance of all target lesions (sum of all target lesions = 0)
      • Of note, any pathological lymph nodes (whether target or non-target) must have a reduction in short axis to < 10 mm
    • Partial Response (PR): >= 30% decrease (vs baseline) in the sum of the diameters of all target lesions
    • Progressive Disease (PD): new lesions, or >= 20% increase in the sum of the target lesion diameters (vs the smallest sum recorded, i.e. the nadir)
    • Stable Disease (SD): the sum of all target lesions does not qualify for CR, PR, or PD
  • Thus, it is often useful to draw lines denoting “Progressive Disease” and “Partial Response” on waterfall plots
    • Bars that fall between the PD and PR lines denote “Stable Disease”
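
The percentage thresholds above can be turned into a small helper that labels each bar. This is a deliberately simplified sketch, not a RECIST implementation: it looks only at percent change from baseline, and it ignores new lesions, progression measured against the nadir, and CR. The function name `classify_change` is hypothetical.

```r
# Simplified sketch: classify by percent change from baseline only.
# Assumptions: new lesions, nadir-based progression, and CR are ignored here.
classify_change <- function(pct_change) {
  ifelse(pct_change >= 20, "PD",
         ifelse(pct_change <= -30, "PR", "SD"))
}

classify_change(c(30, 18.5, -25, -31))
```

Applied to the example data, this labels the +30% subject as PD, the -31% subject as PR, and everyone in between as SD, which agrees with the BOR column assigned earlier.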
col <- ifelse(Merkel$BOR == "CR", 
              "green",
              ifelse(Merkel$BOR == "PR", 
                     "steelblue",
                     ifelse(Merkel$BOR == "PD", 
                            "red", 
                            ifelse(Merkel$BOR == "SD", 
                                   "cadetblue", NA)))) # NA = no fill for any unexpected category ("" is not a valid color name)

MCC<- barplot(Merkel$response, 
              col=col, 
              border=col, 
              space=0.5, 
              ylim=c(-50,50),
              main = "Waterfall plot for Target Lesion Tumor Size",
              ylab="Change from baseline (%)",
              cex.axis=1.5, 
              legend.text= c( "CR: Complete Response", "PR: Partial Response", "SD: Stable Disease", "PD: Progressive Disease"),
              args.legend=list(title="Best Overall Response", fill=c("green","steelblue",  "cadetblue", "red"), border=NA, cex=1.0))

# Use the abline() function
## The abline() function is a simple way to add lines in R
### It takes the arguments: abline(a=NULL, b=NULL, h=NULL, v=NULL, ...)
#### a, b : single values denoting the intercept and the slope of the line
#### h : the y-value(s) for horizontal line(s)
#### v : the x-value(s) for vertical line(s)

abline(h=20, col = "black", lwd=0.5) # The "PD" line
abline(h=-30, col = "black", lwd=0.5) # The "PR" line
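
The value returned by `barplot()`, stored in `MCC` above, holds the x-coordinates of the bar midpoints, which can be used to position annotations. As a sketch, the example below uses them to label the two threshold lines near the right edge of the plot; the short `response` vector, the dashed line style, and the label text are illustrative choices rather than part of the original figure.

```r
# Small stand-in for Merkel$response (values are illustrative only)
response <- c(30, 12, 4, -3, -9, -31)

# barplot() returns the x-coordinates of the bar midpoints
mids <- barplot(response, col = "cadetblue", border = "cadetblue",
                space = 0.5, ylim = c(-50, 50),
                ylab = "Change from baseline (%)")

abline(h = 20,  lty = 2)  # PD threshold
abline(h = -30, lty = 2)  # PR threshold

# Label each threshold above its line, centered on the last bar
text(x = max(mids), y = c(20, -30), labels = c("PD (+20%)", "PR (-30%)"),
     pos = 3, cex = 0.8)
```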

Take Home Points

  • Waterfall plots are an effective, high-quality way to display individual patient tumor responses to a study treatment across a whole cohort.

As always, please reach out to us with thoughts and feedback.

Session Info

sessionInfo()
## R version 4.0.0 (2020-04-24)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Mojave 10.14.6
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] knitr_1.30      forcats_0.5.0   stringr_1.4.0   dplyr_1.0.2    
##  [5] purrr_0.3.4     readr_1.4.0     tidyr_1.1.2     tibble_3.0.4   
##  [9] ggplot2_3.3.2   tidyverse_1.3.0
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.5        highr_0.8         cellranger_1.1.0  pillar_1.4.7     
##  [5] compiler_4.0.0    dbplyr_2.0.0      tools_4.0.0       digest_0.6.27    
##  [9] lubridate_1.7.9.2 jsonlite_1.7.1    evaluate_0.14     lifecycle_0.2.0  
## [13] gtable_0.3.0      pkgconfig_2.0.3   rlang_0.4.8       reprex_0.3.0     
## [17] cli_2.2.0         rstudioapi_0.13   DBI_1.1.0         yaml_2.2.1       
## [21] blogdown_0.21     haven_2.3.1       xfun_0.19         withr_2.3.0      
## [25] xml2_1.3.2        httr_1.4.2        fs_1.5.0          hms_0.5.3        
## [29] generics_0.1.0    vctrs_0.3.5       grid_4.0.0        tidyselect_1.1.0 
## [33] glue_1.4.2        R6_2.5.0          fansi_0.4.1       readxl_1.3.1     
## [37] rmarkdown_2.5     bookdown_0.21     modelr_0.1.8      magrittr_2.0.1   
## [41] backports_1.2.0   scales_1.1.1      ellipsis_0.3.1    htmltools_0.5.0  
## [45] rvest_0.3.6       assertthat_0.2.1  colorspace_2.0-0  stringi_1.5.3    
## [49] munsell_0.5.0     broom_0.7.2       crayon_1.3.4
Sophia Shalhout
Cutaneous Oncology Research Fellow

My research interests include clinical and translational research in advanced skin cancers.