Visualizing Tumor Response using Waterfall Charts with R

Key points
- This monograph is a reference for creating waterfall plots in R, which are useful for depicting each patient's tumor response to treatment, for example in an oncology clinical trial.
- We provide an overview of how to create waterfall plots in R for publications and presentations.
- We outline the steps for presenting patients in descending order on the plot, from the worst response to the best, rather than in random order.
- In oncology, a waterfall plot presents each subject's response as a vertical bar, with the horizontal axis set at the baseline measurement: bars above the axis represent an increase in tumor burden and bars below it a decrease.
- Skill Level: Intermediate
- This post assumes that readers have some familiarity with basic R.
Let’s load the packages we will use.
library(tidyverse)
library(dplyr)
library(knitr)
Merkel Cell Carcinoma: Example Oncology Clinical Trial Data Set
- Let’s first create an “example” data set, for demonstration purposes, of subjects with locally advanced or distantly metastatic Merkel Cell Carcinoma (MCC) enrolled in a clinical trial that tests the effect of treatment A on tumor response at two different doses.
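The response values simulated below stand for the percent change in tumor size from baseline. As a minimal sketch of how such a value could be derived from raw measurements, assuming hypothetical columns `baseline_sum` and `best_sum` (neither appears in the example data set, and the numbers are made up for illustration):

```r
# Hypothetical measurements (illustrative values only, not trial data)
lesions <- data.frame(
  id           = 1:4,
  baseline_sum = c(50, 40, 80, 60),  # sum of target lesion diameters at baseline (mm)
  best_sum     = c(65, 40, 52, 30)   # sum at the post-baseline assessment of interest (mm)
)

# Percent change from baseline: positive = growth, negative = shrinkage
lesions$response <- 100 * (lesions$best_sum - lesions$baseline_sum) / lesions$baseline_sum

lesions$response
```

A positive value plots as a bar above the horizontal axis and a negative value as a bar below it.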
# We will first create a data set of patients with locally advanced/unresectable MCC (laMCC) and metastatic MCC (metMCC).
# Waterfall plots are displayed in descending order, from the worst tumor response relative to baseline on the left to the best on the right side of the plot.
## Therefore, we will create data for 56 subjects and sort them in decreasing order, since a positive value here represents an increase in tumor size from baseline and a negative value represents a decrease.
# We will randomly assign the two doses, 80 mg or 150 mg, to the 56 subjects (28 each).
Merkel <- data.frame(
id=c(1:56),
type = sample((rep(c("laMCC", "metMCC"), times =28))),
response = c(30, sort(runif(n=53,min=-10,max=19), decreasing=TRUE),-25,-31),
dose= sample(rep(c(80, 150), 28)))
# Let's assign Best Overall Response (BOR): PD for the first subject, PR for the last, SD for the 54 in between
Merkel$BOR <- c("PD", rep("SD", times = 54), "PR")
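Because `sample()` and `runif()` draw random values, the simulated data set differs on every run. A minimal sketch of making the simulation reproducible with `set.seed()` follows; the seed value 2020 and the helper name `make_merkel` are arbitrary choices for illustration and are not part of the original post:

```r
# Hypothetical helper wrapping the simulation so it can be re-run
make_merkel <- function(seed) {
  set.seed(seed)  # fix the random number generator state
  data.frame(
    id       = 1:56,
    type     = sample(rep(c("laMCC", "metMCC"), times = 28)),
    response = c(30, sort(runif(n = 53, min = -10, max = 19), decreasing = TRUE), -25, -31),
    dose     = sample(rep(c(80, 150), times = 28))
  )
}

first_run  <- make_merkel(2020)
second_run <- make_merkel(2020)
identical(first_run, second_run)  # same seed, same data
```

Any fixed integer works as the seed; what matters is that it is set before the first random draw.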
Let’s view the data set
kable(Merkel)
id | type | response | dose | BOR |
---|---|---|---|---|
1 | laMCC | 30.0000000 | 150 | PD |
2 | metMCC | 18.4702450 | 150 | SD |
3 | laMCC | 18.1390020 | 150 | SD |
4 | metMCC | 17.6328437 | 80 | SD |
5 | metMCC | 17.4456027 | 80 | SD |
6 | laMCC | 17.4441127 | 150 | SD |
7 | laMCC | 17.4011409 | 80 | SD |
8 | laMCC | 17.2975253 | 150 | SD |
9 | laMCC | 17.1518897 | 80 | SD |
10 | laMCC | 16.9891219 | 80 | SD |
11 | laMCC | 16.8570823 | 80 | SD |
12 | metMCC | 15.5276250 | 150 | SD |
13 | metMCC | 15.1340933 | 150 | SD |
14 | metMCC | 13.2778613 | 80 | SD |
15 | laMCC | 13.1581340 | 80 | SD |
16 | metMCC | 12.7728309 | 150 | SD |
17 | laMCC | 12.4202991 | 150 | SD |
18 | laMCC | 12.0675820 | 80 | SD |
19 | laMCC | 11.5727669 | 150 | SD |
20 | laMCC | 11.1376327 | 150 | SD |
21 | laMCC | 10.9558932 | 150 | SD |
22 | laMCC | 10.9473952 | 150 | SD |
23 | laMCC | 10.4559432 | 150 | SD |
24 | metMCC | 9.9577314 | 80 | SD |
25 | metMCC | 9.7534430 | 150 | SD |
26 | laMCC | 9.5086571 | 150 | SD |
27 | laMCC | 8.4196814 | 80 | SD |
28 | laMCC | 5.7579361 | 80 | SD |
29 | laMCC | 5.7349792 | 150 | SD |
30 | laMCC | 4.2068320 | 80 | SD |
31 | laMCC | 4.0614000 | 150 | SD |
32 | metMCC | 4.0419571 | 150 | SD |
33 | metMCC | 2.3991257 | 80 | SD |
34 | laMCC | 0.9919350 | 80 | SD |
35 | laMCC | 0.5858858 | 150 | SD |
36 | laMCC | 0.3775504 | 80 | SD |
37 | metMCC | -0.5185630 | 80 | SD |
38 | metMCC | -0.6167274 | 80 | SD |
39 | metMCC | -0.6737597 | 80 | SD |
40 | laMCC | -1.1156273 | 150 | SD |
41 | metMCC | -2.4151064 | 80 | SD |
42 | metMCC | -3.2322551 | 150 | SD |
43 | metMCC | -3.3595556 | 150 | SD |
44 | metMCC | -3.5507582 | 150 | SD |
45 | metMCC | -3.6740398 | 80 | SD |
46 | metMCC | -3.6754359 | 80 | SD |
47 | metMCC | -3.6849109 | 150 | SD |
48 | metMCC | -3.7168413 | 80 | SD |
49 | metMCC | -4.5279277 | 80 | SD |
50 | laMCC | -6.3070991 | 80 | SD |
51 | metMCC | -6.3390935 | 80 | SD |
52 | laMCC | -7.3252436 | 150 | SD |
53 | metMCC | -9.0093121 | 150 | SD |
54 | metMCC | -9.0156899 | 80 | SD |
55 | metMCC | -25.0000000 | 150 | SD |
56 | metMCC | -31.0000000 | 80 | PR |
Let’s create the waterfall plot.
MCC<- barplot(Merkel$response,
col="blue",
border="blue",
space=0.5, ylim=c(-50,50),
main = "Waterfall plot for Target Lesion Tumor Size",
ylab="Change from baseline (%)",
cex.axis=1.5, cex.lab=1.5)
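The example data were generated already sorted. For data that arrive in arbitrary order, a sketch of the sorting step with `order()` might look like this; the data frame `unsorted` and its values are hypothetical:

```r
# Hypothetical unsorted responses (percent change from baseline)
unsorted <- data.frame(
  id       = 1:6,
  response = c(-12, 25, 3, -40, 8, -5)
)

# Sort from worst response (largest increase) to best (largest decrease)
sorted <- unsorted[order(unsorted$response, decreasing = TRUE), ]

# Draw the bars in the sorted order; barplot() returns the bar midpoints
bar_mid <- barplot(sorted$response,
                   col = "blue", border = "blue",
                   space = 0.5, ylim = c(-50, 50),
                   ylab = "Change from baseline (%)")
```

With dplyr loaded, `arrange(unsorted, desc(response))` should give the same ordering.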
We can now color the bars by dose and add a legend.
col <- ifelse(Merkel$dose == 80,
  "steelblue", # if dose = 80 mg, the bar is steel blue
  "cadetblue") # otherwise (150 mg here), the bar is cadet blue
MCC <- barplot(Merkel$response,
  col=col,
  border=col,
  space=0.5,
  ylim=c(-50,50),
  main = "Waterfall plot for Target Lesion Tumor Size",
  ylab="Change from baseline (%)",
  cex.axis=1.5)
# Draw the legend with legend() so that each label is paired explicitly with its color.
# Passing legend.text to barplot() reverses the label order for vertical, non-beside bars,
# which would mismatch a fill vector supplied through args.legend.
legend("topright", legend=c("80mg", "150mg"), title="Treatment Dose",
  fill=c("steelblue", "cadetblue"), border=NA, cex=0.9)
Or by disease type…
col <- ifelse(Merkel$type == "laMCC",
  "#BC5A42", # locally advanced MCC: #BC5A42 (deep red)
  "#009296") # otherwise (metastatic MCC here): #009296 (greenish-blue)
MCC <- barplot(Merkel$response,
  col=col,
  border=col,
  space=0.5,
  ylim=c(-50,50),
  main = "Waterfall plot for Target Lesion Tumor Size",
  ylab="Change from baseline (%)",
  cex.axis=1.5)
# legend() pairs each label explicitly with its color
legend("topright", legend=c("Locally advanced MCC", "Metastatic MCC"), title="Disease",
  fill=c("#BC5A42", "#009296"), border=NA, cex=0.9)
Or by Best Overall Response, with a legend
col <- ifelse(Merkel$BOR == "CR",
  "green", # a subject with a CR gets a green bar; otherwise...
  ifelse(Merkel$BOR == "PR",
    "steelblue", # a subject with a PR gets a steel blue bar; otherwise...
    ifelse(Merkel$BOR == "PD",
      "red", # a subject with PD gets a red bar; otherwise...
      ifelse(Merkel$BOR == "SD",
        "cadetblue", # a subject with SD gets a cadet blue bar
        NA) # no color for anything else (should not occur: BOR must be CR, PR, PD or SD)
)))
MCC <- barplot(Merkel$response,
  col=col,
  border=col,
  space=0.5,
  ylim=c(-50,50),
  main = "Waterfall plot for Target Lesion Tumor Size",
  ylab="Change from baseline (%)",
  cex.axis=1.5)
# legend() pairs each label explicitly with its color
legend("topright", title="Best Overall Response",
  legend=c("CR: Complete Response", "PR: Partial Response", "SD: Stable Disease", "PD: Progressive Disease"),
  fill=c("green", "steelblue", "cadetblue", "red"), border=NA, cex=0.9)
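As an alternative to nested `ifelse()` calls, the color mapping can be sketched as a lookup in a named vector. The names `bor_colors` and `bor` below are illustrative, and the colors mirror those used above:

```r
# Named vector: one color per Best Overall Response category
bor_colors <- c(CR = "green", PR = "steelblue", SD = "cadetblue", PD = "red")

# Hypothetical BOR values for five subjects
bor <- c("PD", "SD", "SD", "PR", "CR")

# Index the lookup by category; unname() drops the category labels
bar_col <- unname(bor_colors[bor])
bar_col

# An unrecognized category yields NA instead of a color,
# which makes data problems easy to spot
anyNA(bor_colors[c("SD", "NE")])
```

The same named vector can then supply both the bar colors and the legend `fill`, so the two cannot drift apart.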
Let’s add horizontal lines at 20% and -30%
- The majority of clinical trials use the Response Evaluation Criteria in Solid Tumors (RECIST) to assess tumor responses to a therapeutic intervention
- Per RECIST, tumor responses are adjudicated based on the following observations:
  - Complete Response (CR): disappearance of all target lesions (sum of all target lesions = 0)
    - Of note, any pathological lymph nodes (whether target or non-target) must have a reduction in size to < 10 mm
  - Partial Response (PR): >= 30% decrease (vs baseline) in the sum of all target lesion dimensions
  - Progressive Disease (PD): new lesions or >= 20% increase (vs the smallest sum of target lesions, or nadir)
  - Stable Disease (SD): the sum of all target lesions does not qualify for CR, PR, or PD
- Thus, it is often useful to have lines to denote “Progressive Disease” and “Partial Response” on waterfall plots
- Those bars between the PD and PR lines denote “Stable Disease”
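The thresholds above can be sketched as a small classification function. This is a deliberate simplification: it looks only at the percent change in the sum of target lesions, treats that value as change from baseline for both thresholds, and ignores new lesions, the nadir, and the lymph node criterion. The function name `classify_response` is hypothetical:

```r
# Simplified mapping from percent change to a response category
classify_response <- function(pct_change) {
  ifelse(pct_change <= -100, "CR",   # all target lesions gone
  ifelse(pct_change <= -30,  "PR",   # at least a 30% decrease
  ifelse(pct_change >= 20,   "PD",   # at least a 20% increase
                             "SD"))) # everything in between
}

classify_response(c(30, 18.5, -9, -25, -31, -100))
```

In a real trial the category comes from the full adjudication, so a function like this is only a consistency check against the recorded BOR.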
col <- ifelse(Merkel$BOR == "CR",
  "green",
  ifelse(Merkel$BOR == "PR",
    "steelblue",
    ifelse(Merkel$BOR == "PD",
      "red",
      ifelse(Merkel$BOR == "SD",
        "cadetblue", NA))))
MCC <- barplot(Merkel$response,
  col=col,
  border=col,
  space=0.5,
  ylim=c(-50,50),
  main = "Waterfall plot for Target Lesion Tumor Size",
  ylab="Change from baseline (%)",
  cex.axis=1.5)
# legend() pairs each label explicitly with its color
legend("topright", title="Best Overall Response",
  legend=c("CR: Complete Response", "PR: Partial Response", "SD: Stable Disease", "PD: Progressive Disease"),
  fill=c("green", "steelblue", "cadetblue", "red"), border=NA, cex=1.0)
# Use the abline() function
## The abline() function is a simple way to add lines in R
### It takes the arguments: abline(a=NULL, b=NULL, h=NULL, v=NULL, ...)
#### a, b : single values denoting the intercept and the slope of the line
#### h : the y-value(s) for horizontal line(s)
#### v : the x-value(s) for vertical line(s)
abline(h=20, col = "black", lwd=0.5) # The "PD" line
abline(h=-30, col = "black", lwd=0.5) # The "PR" line
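Because `h` accepts a vector, both reference lines can also be drawn in one call, and `text()` can label them. A sketch on a few hypothetical response values (the dashed line type, label wording and label positions are arbitrary choices):

```r
# Hypothetical responses, already sorted from worst to best
resp <- c(30, 15, 4, -8, -25, -31)

bar_mid <- barplot(resp, col = "cadetblue", border = "cadetblue",
                   space = 0.5, ylim = c(-50, 50),
                   ylab = "Change from baseline (%)")

# Draw the PD (+20%) and PR (-30%) lines together, dashed
thresholds <- c(PD = 20, PR = -30)
abline(h = thresholds, col = "black", lwd = 0.5, lty = 2)

# Label each line just above it, over the last bar
text(x = max(bar_mid), y = thresholds, labels = names(thresholds),
     pos = 3, cex = 0.8)
```

Labeling the lines directly saves the reader from having to infer which threshold each one marks.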
Take Home Points
- High-quality data visualization of individual patient tumor responses to a study treatment can be displayed effectively for the cohort using waterfall plots.
As always, please reach out to us with thoughts and feedback
Session Info
sessionInfo()
## R version 4.0.0 (2020-04-24)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Mojave 10.14.6
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] knitr_1.30 forcats_0.5.0 stringr_1.4.0 dplyr_1.0.2
## [5] purrr_0.3.4 readr_1.4.0 tidyr_1.1.2 tibble_3.0.4
## [9] ggplot2_3.3.2 tidyverse_1.3.0
##
## loaded via a namespace (and not attached):
## [1] Rcpp_1.0.5 highr_0.8 cellranger_1.1.0 pillar_1.4.7
## [5] compiler_4.0.0 dbplyr_2.0.0 tools_4.0.0 digest_0.6.27
## [9] lubridate_1.7.9.2 jsonlite_1.7.1 evaluate_0.14 lifecycle_0.2.0
## [13] gtable_0.3.0 pkgconfig_2.0.3 rlang_0.4.8 reprex_0.3.0
## [17] cli_2.2.0 rstudioapi_0.13 DBI_1.1.0 yaml_2.2.1
## [21] blogdown_0.21 haven_2.3.1 xfun_0.19 withr_2.3.0
## [25] xml2_1.3.2 httr_1.4.2 fs_1.5.0 hms_0.5.3
## [29] generics_0.1.0 vctrs_0.3.5 grid_4.0.0 tidyselect_1.1.0
## [33] glue_1.4.2 R6_2.5.0 fansi_0.4.1 readxl_1.3.1
## [37] rmarkdown_2.5 bookdown_0.21 modelr_0.1.8 magrittr_2.0.1
## [41] backports_1.2.0 scales_1.1.1 ellipsis_0.3.1 htmltools_0.5.0
## [45] rvest_0.3.6 assertthat_0.2.1 colorspace_2.0-0 stringi_1.5.3
## [49] munsell_0.5.0 broom_0.7.2 crayon_1.3.4