# Visualizing Tumor Response using Waterfall Charts with R

# Key points

- This monograph provides a reference resource for creating **waterfall plots** in `R`, which may be useful for depicting the tumor response to treatment of each patient enrolled in, for example, an oncology clinical trial.
- We provide an overview of how to create **waterfall plots**, which may be useful for publications and presentations, using `R`.
- We outline the steps for presenting the patients non-randomly, in descending order from the worst response to the best response, so that the **waterfall plot** is displayed appropriately.
- In oncology, a waterfall plot presents the response of each subject with the x-axis set at the baseline measurement, and one vertical bar per subject drawn either above or below the horizontal axis, representing an increase or a decrease in tumor burden, respectively.
- Skill Level: Intermediate
  - This post assumes that readers have some familiarity with basic `R`.

### Let’s load the packages we will use.

```
library(tidyverse) # attaches dplyr, ggplot2, and the other core tidyverse packages
library(knitr) # provides kable() for printing tables
```

### Merkel Cell Carcinoma: Example Oncology Clinical Trial Data Set

- Let’s first create an “example” data set for demonstrative purposes for subjects with locally advanced or distantly metastatic Merkel Cell Carcinoma (MCC) enrolled on a clinical trial that aims to test the effects of treatment A on tumor response at two different doses.

```
# We will first create a data set of patients with locally advanced/unresectable MCC (laMCC) or metastatic MCC (metMCC).
# Waterfall plots are displayed in descending order, from the worst tumor response relative to baseline on the left to the best response on the right side of the plot.
# Therefore, we will create data for 56 subjects and sort them in decreasing order, since a positive value here represents an increase in tumor size from baseline and a negative value represents a decrease in size.
# We will randomly assign the two doses, 80 mg or 150 mg, to the 56 subjects.
# Note: sample() and runif() draw random values, so each run produces a different data set unless set.seed() is called first.
Merkel <- data.frame(
  id = 1:56,
  type = sample(rep(c("laMCC", "metMCC"), times = 28)),
  response = c(30, sort(runif(n = 53, min = -10, max = 19), decreasing = TRUE), -25, -31),
  dose = sample(rep(c(80, 150), times = 28))
)
# Let's assign Best Overall Response (BOR)
Merkel$BOR <- c("PD", rep("SD", times = 54), "PR")
```
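The example data above are generated already sorted, but real trial data usually are not. As a minimal sketch, assuming a hypothetical data frame `trial` with invented values, the rows can be put in descending order with `order()` before plotting:

```r
# Hypothetical, unsorted trial data (values invented for illustration)
trial <- data.frame(
  id       = 1:6,
  response = c(-12.5, 30, 4.2, -31, 18.9, -0.4)
)

# Order rows from the largest increase to the largest decrease
trial_sorted <- trial[order(trial$response, decreasing = TRUE), ]

# Bars now run from the worst response (left) to the best response (right)
barplot(trial_sorted$response,
        names.arg = trial_sorted$id,
        ylim = c(-50, 50),
        ylab = "Change from baseline (%)")
```

Sorting the whole data frame, rather than only the `response` column, keeps each subject's other variables (dose, disease type, BOR) aligned with its bar.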

#### Let’s view the data set

`kable((Merkel))`

| id | type | response | dose | BOR |
|---|---|---|---|---|
| 1 | laMCC | 30.0000000 | 150 | PD |
| 2 | laMCC | 17.5671225 | 80 | SD |
| 3 | metMCC | 16.1331415 | 150 | SD |
| 4 | metMCC | 15.9186232 | 150 | SD |
| 5 | metMCC | 13.9119951 | 150 | SD |
| 6 | metMCC | 13.7867230 | 150 | SD |
| 7 | metMCC | 13.5341242 | 150 | SD |
| 8 | laMCC | 13.5172076 | 150 | SD |
| 9 | laMCC | 13.0014255 | 150 | SD |
| 10 | laMCC | 12.4950237 | 80 | SD |
| 11 | metMCC | 12.3037785 | 150 | SD |
| 12 | metMCC | 11.8891326 | 150 | SD |
| 13 | laMCC | 11.3015469 | 150 | SD |
| 14 | metMCC | 10.6927948 | 80 | SD |
| 15 | laMCC | 10.6829551 | 80 | SD |
| 16 | metMCC | 10.6190570 | 80 | SD |
| 17 | metMCC | 9.5838351 | 150 | SD |
| 18 | metMCC | 9.0666457 | 150 | SD |
| 19 | metMCC | 8.6616828 | 80 | SD |
| 20 | laMCC | 8.5517749 | 80 | SD |
| 21 | laMCC | 7.6797107 | 150 | SD |
| 22 | metMCC | 7.6509149 | 80 | SD |
| 23 | metMCC | 7.6301331 | 80 | SD |
| 24 | laMCC | 6.7955936 | 80 | SD |
| 25 | laMCC | 6.7245764 | 80 | SD |
| 26 | laMCC | 6.0454854 | 150 | SD |
| 27 | metMCC | 5.8234628 | 150 | SD |
| 28 | metMCC | 4.0487224 | 80 | SD |
| 29 | laMCC | 3.6643637 | 150 | SD |
| 30 | laMCC | 3.6567846 | 150 | SD |
| 31 | metMCC | 3.4124079 | 80 | SD |
| 32 | laMCC | 2.9359065 | 80 | SD |
| 33 | metMCC | 2.6350576 | 80 | SD |
| 34 | metMCC | 2.5579252 | 80 | SD |
| 35 | laMCC | 2.2091889 | 80 | SD |
| 36 | laMCC | 2.1006811 | 150 | SD |
| 37 | laMCC | 2.0541793 | 80 | SD |
| 38 | metMCC | 1.1781217 | 80 | SD |
| 39 | laMCC | 1.0067931 | 150 | SD |
| 40 | laMCC | 0.7738708 | 150 | SD |
| 41 | metMCC | -0.1876224 | 80 | SD |
| 42 | metMCC | -0.5236366 | 80 | SD |
| 43 | laMCC | -1.1229957 | 80 | SD |
| 44 | metMCC | -3.2156236 | 150 | SD |
| 45 | metMCC | -4.0398077 | 150 | SD |
| 46 | metMCC | -4.1842316 | 150 | SD |
| 47 | laMCC | -5.6627451 | 80 | SD |
| 48 | laMCC | -5.7195241 | 150 | SD |
| 49 | laMCC | -5.9504499 | 80 | SD |
| 50 | laMCC | -6.1187903 | 150 | SD |
| 51 | laMCC | -6.9270395 | 80 | SD |
| 52 | laMCC | -8.1002454 | 80 | SD |
| 53 | metMCC | -8.8201170 | 80 | SD |
| 54 | metMCC | -9.6850823 | 80 | SD |
| 55 | metMCC | -25.0000000 | 150 | SD |
| 56 | laMCC | -31.0000000 | 150 | PR |

### Let’s create the waterfall plot.

```
MCC<- barplot(Merkel$response,
col="blue",
border="blue",
space=0.5, ylim=c(-50,50),
main = "Waterfall plot for Target Lesion Tumor Size",
ylab="Change from baseline (%)",
cex.axis=1.5, cex.lab=1.5)
```

#### We can now add color by **Dose** and a legend

```
col <- ifelse(Merkel$dose == 80,
"steelblue", # if dose = 80 mg, then the color will be steel blue
"cadetblue") # if dose != 80 mg (i.e. 150 mg here), then the color will be cadet blue
MCC<- barplot(Merkel$response,
col=col,
border=col,
space=0.5,
ylim=c(-50,50),
main = "Waterfall plot for Target Lesion Tumor Size",
ylab="Change from baseline (%)",
cex.axis=1.5,
legend.text= c( "80mg", "150mg"),
args.legend=list(title="Treatment Dose", fill=c("steelblue", "cadetblue"), border=NA, cex=0.9))
```

#### Or by disease type…

```
col <- ifelse(Merkel$type == "laMCC",
"#BC5A42", # if type of disease = locally advanced MCC, then the color will be #BC5A42 (deep red)
"#009296") # if type of disease != locally advanced MCC (i.e. metMCC), then the color will be #009296 (greenish-blue)
MCC<- barplot(Merkel$response,
col=col,
border=col,
space=0.5,
ylim=c(-50,50),
main = "Waterfall plot for Target Lesion Tumor Size",
ylab="Change from baseline (%)",
cex.axis=1.5,
legend.text= c( "locally advanced MCC", "Metastatic MCC"),
args.legend=list(title="Disease", fill=c("#BC5A42", "#009296"), border=NA, cex=0.9))
```

#### Or by Best overall response and provide a legend

```
col <- ifelse(Merkel$BOR == "CR",
"green", # if a subject had a CR, the bar will be green; if they did not have a CR...
ifelse(Merkel$BOR == "PR",
"steelblue", # then, if a subject had a PR, the bar will be steel blue; if they did not have a PR or CR...
ifelse(Merkel$BOR == "PD",
"red", # then, if a subject had a PD, the bar will be red; if they did not have a PR, CR, or PD...
ifelse(Merkel$BOR == "SD",
"cadetblue", # then they must have had an SD, so the bar will be cadet blue; otherwise...
NA) # the bar is left unfilled (this should not occur, because every subject must have a CR, PR, PD, or SD)
)))
MCC<- barplot(Merkel$response,
col=col,
border=col,
space=0.5,
ylim=c(-50,50),
main = "Waterfall plot for Target Lesion Tumor Size", ylab="Change from baseline (%)",
cex.axis=1.5,
legend.text= c( "CR: Complete Response", "PR: Partial Response", "SD: Stable Disease", "PD: Progressive Disease"),
args.legend=list(title="Best Overall Response", fill=c("green","steelblue", "cadetblue", "red"), border=NA, cex=0.9))
```
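Nested `ifelse()` calls become hard to read as the number of categories grows. One alternative, shown here only as a sketch with a hypothetical `bor` vector and a hypothetical `bor_colors` lookup, is to index a named vector of colors by category:

```r
# Hypothetical BOR values for five subjects
bor <- c("PD", "SD", "SD", "PR", "CR")

# Named vector: names are the BOR categories, values are the bar colors
bor_colors <- c(CR = "green", PR = "steelblue", SD = "cadetblue", PD = "red")

# Index by name to get one color per subject; an unknown category would yield NA
col <- unname(bor_colors[bor])
col
```

The same lookup can then feed the legend, for example by passing `names(bor_colors)` as `legend.text` and `bor_colors` as the `fill` entry of `args.legend`, so that bar colors and legend entries cannot drift apart.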

#### Let’s add horizontal lines at 20% and -30%

- The majority of clinical trials use the Response Evaluation Criteria in Solid Tumors (RECIST) to assess tumor responses to a therapeutic intervention.

- Per **RECIST**, tumor responses are adjudicated based on the following observations:
  - **Complete Response** (CR): disappearance of all target lesions (sum of all target lesions = 0)
    - Of note, any pathological lymph nodes (whether target or non-target) must have a reduction in size to < 10 mm
  - **Partial Response** (PR): >= 30% decrease (vs baseline) in the sum of all target lesion dimensions
  - **Progressive Disease** (PD): new lesions or >= 20% increase (vs the smallest sum of target lesions, or nadir)
  - **Stable Disease** (SD): the sum of all target lesions does not qualify for CR/PR/PD

- Thus, it is often useful to have lines that denote “Progressive Disease” and “Partial Response” on **waterfall plots**.
  - Those bars between the PD and PR lines denote “Stable Disease”
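The percent-change thresholds listed above can be sketched as a small helper. This is a deliberate simplification and not a full RECIST assessment: the hypothetical `classify_response()` function looks only at percent change in the sum of target lesions, ignores new lesions, non-target lesions, and the lymph node rule, and treats every change as relative to baseline even though PD is assessed against the nadir.

```r
# Simplified sketch: classify on percent change in the sum of target lesions only.
# New lesions, non-target lesions, lymph node size, and the nadir are ignored.
classify_response <- function(pct_change) {
  ifelse(pct_change <= -100, "CR",    # all target lesions have disappeared
  ifelse(pct_change <= -30,  "PR",    # at least a 30% decrease
  ifelse(pct_change >= 20,   "PD",    # at least a 20% increase
                             "SD")))  # everything in between
}

classify_response(c(30, 17.6, -25, -31, -100))
# → "PD" "SD" "SD" "PR" "CR"
```

Under these assumptions the helper agrees with how BOR was assigned in the example data set: the subject at +30% is PD, the subject at -31% is PR, and everyone in between, including the subject at -25%, is SD.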

```
col <- ifelse(Merkel$BOR == "CR",
"green",
ifelse(Merkel$BOR == "PR",
"steelblue",
ifelse(Merkel$BOR == "PD",
"red",
ifelse(Merkel$BOR == "SD",
"cadetblue", NA)))) # NA leaves the bar unfilled if BOR is not CR, PR, PD, or SD
MCC<- barplot(Merkel$response,
col=col,
border=col,
space=0.5,
ylim=c(-50,50),
main = "Waterfall plot for Target Lesion Tumor Size",
ylab="Change from baseline (%)",
cex.axis=1.5,
legend.text= c( "CR: Complete Response", "PR: Partial Response", "SD: Stable Disease", "PD: Progressive Disease"),
args.legend=list(title="Best Overall Response", fill=c("green","steelblue", "cadetblue", "red"), border=NA, cex=1.0))
# Use the abline() function, a simple way to add straight lines to an existing plot in R
# Its main arguments are: abline(a = NULL, b = NULL, h = NULL, v = NULL, ...)
#   a, b : single values denoting the intercept and the slope of the line
#   h    : the y-value(s) for horizontal line(s)
#   v    : the x-value(s) for vertical line(s)
abline(h=20, col = "black", lwd=0.5) # The "PD" line
abline(h=-30, col = "black", lwd=0.5) # The "PR" line
```
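Each `barplot()` call above is assigned to `MCC` because `barplot()` returns the x coordinates of the bar midpoints, which can be used to annotate individual bars. The following is a minimal sketch, using hypothetical `response` and `id` vectors, that labels the subjects whose bars reach the PD or PR line:

```r
# Hypothetical responses, already sorted from worst to best
response <- c(30, 12.4, 3.1, -6.8, -25, -31)
id <- c(1, 10, 30, 51, 55, 56)

# barplot() returns the x coordinate of each bar's midpoint
mids <- barplot(response, col = "cadetblue", border = "cadetblue",
                space = 0.5, ylim = c(-50, 50),
                ylab = "Change from baseline (%)")
abline(h = c(20, -30), col = "black", lwd = 0.5) # PD and PR reference lines

# Flag the bars that reach either reference line
flag <- response >= 20 | response <= -30

# Write each flagged subject's id just beyond the end of its bar
text(x = mids[flag],
     y = response[flag] + ifelse(response[flag] > 0, 4, -4),
     labels = id[flag], cex = 0.8)
```

The offset of 4 percentage points is an arbitrary choice that keeps the labels clear of the bars at this `ylim`; it would need adjusting for a different axis range.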

# Take Home Points

- High-quality data visualization of individual patient tumor responses to a study treatment can be displayed effectively for the cohort using waterfall plots.

**As always, please reach out to us with thoughts and feedback**

# Session Info

`sessionInfo()`

```
## R version 4.0.0 (2020-04-24)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Mojave 10.14.6
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] knitr_1.28 forcats_0.5.0 stringr_1.4.0 dplyr_1.0.0
## [5] purrr_0.3.4 readr_1.3.1 tidyr_1.1.0 tibble_3.0.1
## [9] ggplot2_3.3.1 tidyverse_1.3.0
##
## loaded via a namespace (and not attached):
## [1] tidyselect_1.1.0 xfun_0.13 haven_2.2.0 lattice_0.20-41
## [5] colorspace_1.4-1 vctrs_0.3.1 generics_0.0.2 htmltools_0.4.0
## [9] yaml_2.2.1 rlang_0.4.6 pillar_1.4.4 glue_1.4.1
## [13] withr_2.2.0 DBI_1.1.0 dbplyr_1.4.3 modelr_0.1.7
## [17] readxl_1.3.1 lifecycle_0.2.0 munsell_0.5.0 blogdown_0.18
## [21] gtable_0.3.0 cellranger_1.1.0 rvest_0.3.5 evaluate_0.14
## [25] fansi_0.4.1 highr_0.8 broom_0.5.6 Rcpp_1.0.4.6
## [29] scales_1.1.1 backports_1.1.7 jsonlite_1.6.1 fs_1.4.1
## [33] hms_0.5.3 digest_0.6.25 stringi_1.4.6 bookdown_0.18
## [37] grid_4.0.0 cli_2.0.2 tools_4.0.0 magrittr_1.5
## [41] crayon_1.3.4 pkgconfig_2.0.3 ellipsis_0.3.1 xml2_1.3.2
## [45] reprex_0.3.0 lubridate_1.7.8 assertthat_0.2.1 rmarkdown_2.1
## [49] httr_1.4.1 rstudioapi_0.11 R6_2.4.1 nlme_3.1-147
## [53] compiler_4.0.0
```