Removing higher-order aggregation text in {reactable}

A quick & dirty workaround

Often when I’m building a reactable I have potentially many aggregation levels but only want to display data in a lower subset of those. Here is a quick way to eliminate text using cell styling.

First let’s load some packages and look at an example data set.

library(tidyverse)
library(reactable)
dat <- cheese::heart_disease
dat
## # A tibble: 303 × 9
##      Age Sex    ChestPain           BP Cholest…¹ Blood…² Maxim…³ Exerc…⁴ Heart…⁵
##    <dbl> <fct>  <fct>            <dbl>     <dbl> <lgl>     <dbl> <fct>   <fct>  
##  1    63 Male   Typical angina     145       233 TRUE        150 No      No     
##  2    67 Male   Asymptomatic       160       286 FALSE       108 Yes     Yes    
##  3    67 Male   Asymptomatic       120       229 FALSE       129 Yes     Yes    
##  4    37 Male   Non-anginal pain   130       250 FALSE       187 No      No     
##  5    41 Female Atypical angina    130       204 FALSE       172 No      No     
##  6    56 Male   Atypical angina    120       236 FALSE       178 No      No     
##  7    62 Female Asymptomatic       140       268 FALSE       160 No      Yes    
##  8    57 Female Asymptomatic       120       354 FALSE       163 Yes     No     
##  9    63 Male   Asymptomatic       130       254 FALSE       147 No      Yes    
## 10    53 Male   Asymptomatic       140       203 TRUE        155 Yes     Yes    
## # … with 293 more rows, and abbreviated variable names ¹​Cholesterol,
## #   ²​BloodSugar, ³​MaximumHR, ⁴​ExerciseInducedAngina, ⁵​HeartDisease

Suppose we are interested in these outcomes:

  1. The rate of heart disease
  2. The rate of patients with a maximum heart rate > 150

And we want to summarize those, along with a few other patient characteristics, within a handful of important clinical factors. We might pivot our data into a long format to start:

pivot_dat <-
  dat |> 
  
  # Convert to character types
  mutate(
    across(
      c(ChestPain, ExerciseInducedAngina, BloodSugar),
      as.character
    ),
    MaximumHR =
      case_when(
        MaximumHR > 150 ~ "Yes",
        TRUE ~ "No"
      )
  ) |>
  
  # Send "important clinical factors" down the rows
  pivot_longer(
    cols = c(ChestPain, ExerciseInducedAngina, BloodSugar),
    names_to = "ClinicalFactor",
    values_to = "Level"
  ) |> 
  
  # Send outcomes down the rows
  pivot_longer(
    cols = c(HeartDisease, MaximumHR),
    names_to = "Outcome",
    values_to = "HasOutcome"
  )
pivot_dat
## # A tibble: 1,818 × 8
##      Age Sex      BP Cholesterol ClinicalFactor        Level     Outcome HasOu…¹
##    <dbl> <fct> <dbl>       <dbl> <chr>                 <chr>     <chr>   <chr>  
##  1    63 Male    145         233 ChestPain             Typical … HeartD… No     
##  2    63 Male    145         233 ChestPain             Typical … Maximu… No     
##  3    63 Male    145         233 ExerciseInducedAngina No        HeartD… No     
##  4    63 Male    145         233 ExerciseInducedAngina No        Maximu… No     
##  5    63 Male    145         233 BloodSugar            TRUE      HeartD… No     
##  6    63 Male    145         233 BloodSugar            TRUE      Maximu… No     
##  7    67 Male    160         286 ChestPain             Asymptom… HeartD… Yes    
##  8    67 Male    160         286 ChestPain             Asymptom… Maximu… No     
##  9    67 Male    160         286 ExerciseInducedAngina Yes       HeartD… Yes    
## 10    67 Male    160         286 ExerciseInducedAngina Yes       Maximu… No     
## # … with 1,808 more rows, and abbreviated variable name ¹​HasOutcome

We can then calculate some summary statistics. Our full patient population is replicated for each ClinicalFactor and Outcome, so we should account for that.

summary_dat <-
  pivot_dat |> 
  
  # Summarize characteristics
  summarize(
    Patients = n(),
    OutcomeRate = mean(HasOutcome == "Yes"),
    Age = mean(Age),
    Female = mean(Sex == "Female"),
    .by = 
      c(
        ClinicalFactor,
        Level,
        Outcome,
        HasOutcome
      )
  ) |> 
  
  # Add counts within groups
  mutate(
    PercentOfPatients = Patients / sum(Patients),
    .by = c(ClinicalFactor, Outcome)
  ) |> 
  
  # Make clean levels/labels
  mutate(
    ClinicalFactor = 
      case_when(
        ClinicalFactor == "BloodSugar" ~ "Blood sugar > 120 mg/dl?",
        ClinicalFactor == "ChestPain" ~ "Chest pain type",
        ClinicalFactor == "ExerciseInducedAngina" ~ "Exercise-induced angina?"
      ),
    Level = 
      case_when(
        Level == "TRUE" ~ "Yes",
        Level == "FALSE" ~ "No",
        TRUE ~ Level
      ),
    Outcome = 
      case_when(
        Outcome == "HeartDisease" ~ "Heart disease",
        Outcome == "MaximumHR" ~ "Maximum HR > 150"
      )
  ) |>
  
  # Rearrange
  relocate(PercentOfPatients, .after = Patients) |>
  relocate(Outcome, .before = everything()) |>
  arrange(
    Outcome,
    ClinicalFactor,
    Level,
    HasOutcome
  )
summary_dat
## # A tibble: 32 × 9
##    Outcome       ClinicalFa…¹ Level HasOu…² Patie…³ Perce…⁴ Outco…⁵   Age Female
##    <chr>         <chr>        <chr> <chr>     <int>   <dbl>   <dbl> <dbl>  <dbl>
##  1 Heart disease Blood sugar… No    No          141  0.465        0  52   0.468 
##  2 Heart disease Blood sugar… No    Yes         117  0.386        1  56.4 0.162 
##  3 Heart disease Blood sugar… Yes   No           23  0.0759       0  56.2 0.261 
##  4 Heart disease Blood sugar… Yes   Yes          22  0.0726       1  57.9 0.273 
##  5 Heart disease Chest pain … Asym… No           39  0.129        0  54.1 0.462 
##  6 Heart disease Chest pain … Asym… Yes         105  0.347        1  56.3 0.210 
##  7 Heart disease Chest pain … Atyp… No           41  0.135        0  50.0 0.390 
##  8 Heart disease Chest pain … Atyp… Yes           9  0.0297       1  57.4 0.222 
##  9 Heart disease Chest pain … Non-… No           68  0.224        0  52.4 0.5   
## 10 Heart disease Chest pain … Non-… Yes          18  0.0594       1  58.5 0.0556
## # … with 22 more rows, and abbreviated variable names ¹​ClinicalFactor,
## #   ²​HasOutcome, ³​Patients, ⁴​PercentOfPatients, ⁵​OutcomeRate

Now we’re ready to build the table (note the use of the zildge::rectbl_agg_wtd() function to compute the weighted average, see my other post for more information about this).

reactable(
  data = summary_dat,
  groupBy = c("Outcome", "ClinicalFactor", "Level"),
  columns = 
    list(
      ClinicalFactor = colDef(name = "Clinical factor"),
      HasOutcome = colDef(name = "Has Outcome?"),
      Patients = colDef(name = "Count", aggregate = "sum", align = "center"),
      PercentOfPatients = colDef(name = "Percent", aggregate = "sum", align = "center", format = colFormat(digits = 2, percent = TRUE)),
      OutcomeRate = colDef(name = "Outcome rate (%)", aggregate = zildge::rectbl_agg_wtd("Patients"), align = "center", format = colFormat(digits = 2, percent = TRUE)),
      Age = colDef(name = "Avg. age (years)", aggregate = zildge::rectbl_agg_wtd("Patients"), align = "center", format = colFormat(digits = 2)),
      Female = colDef(name = "Female (%)", aggregate = zildge::rectbl_agg_wtd("Patients"), align = "center", format = colFormat(digits = 2, percent = TRUE))
    ),
  columnGroups = 
    list(
      colGroup(
        name = "Patients",
        columns = c("Patients", "PercentOfPatients")
      )
    ),
  striped = TRUE,
  highlight = TRUE,
  bordered = TRUE,
  resizable = TRUE,
  theme = reactablefmtr::sandstone()
) |>
  reactablefmtr::add_source("Use arrows to expand table", font_size = 12, font_style = "italic")

Use arrows to expand table


There are a few problems with this table:

  1. The redundancy of the top-level (and second-level) summary statistics is unappealing
  2. The top-level aggregation is just flat-out wrong because we’ve duplicated counts over the data set

Maybe 1 is not a huge deal, but I’ve probably already established somewhere that there are 303 patients in the data set and the overall rate of heart disease is 45.9%. You still may choose to keep it as-is in this case as it’s not totally harmful. However, 2 needs to be addressed as the data is misleading and not really interpretable. So how can we remove the results from (at least) the top-level while maintaining the lower-level aggregation?

We can define an R function that provides JavaScript custom styling to a cell.

remove_text <-
  function(col, threshold) {
    JS(
      paste0(
        "function(rowInfo) {
          if(rowInfo.row['", col, "'] > ", threshold, ") {
            return {fontSize:0}
          }
        }"
      )
    )
  }

All this function does is reduce the font size to zero for any cell where the value of col in the respective row is greater than threshold. Thus, since we are over-counting the number of patients in the top-level, we can just apply this function to rows in which the count exceeds 303, which is the patient count in our data set. We provide this function to the style argument of colDef().

reactable(
  data = summary_dat,
  groupBy = c("Outcome", "ClinicalFactor", "Level"),
  columns = 
    list(
      ClinicalFactor = colDef(name = "Clinical factor"),
      HasOutcome = colDef(name = "Has Outcome?"),
      Patients = colDef(style = remove_text("Patients", nrow(dat)), name = "Count", aggregate = "sum", align = "center"),
      PercentOfPatients = colDef(style = remove_text("Patients", nrow(dat)), name = "Percent", aggregate = "sum", align = "center", format = colFormat(digits = 2, percent = TRUE)),
      OutcomeRate = colDef(style = remove_text("Patients", nrow(dat)), name = "Outcome rate (%)", aggregate = zildge::rectbl_agg_wtd("Patients"), align = "center", format = colFormat(digits = 2, percent = TRUE)),
      Age = colDef(style = remove_text("Patients", nrow(dat)), name = "Avg. age (years)", aggregate = zildge::rectbl_agg_wtd("Patients"), align = "center", format = colFormat(digits = 2)),
      Female = colDef(style = remove_text("Patients", nrow(dat)), name = "Female (%)", aggregate = zildge::rectbl_agg_wtd("Patients"), align = "center", format = colFormat(digits = 2, percent = TRUE))
    ),
  columnGroups = 
    list(
      colGroup(
        name = "Patients",
        columns = c("Patients", "PercentOfPatients")
      )
    ),
  striped = TRUE,
  highlight = TRUE,
  bordered = TRUE,
  resizable = TRUE,
  theme = reactablefmtr::sandstone()
) |>
  reactablefmtr::add_source("Use arrows to expand table", font_size = 12, font_style = "italic")

Use arrows to expand table


If we want to remove text from the second level of aggregation, we can just adjust the threshold.

reactable(
  data = summary_dat,
  groupBy = c("Outcome", "ClinicalFactor", "Level"),
  columns = 
    list(
      ClinicalFactor = colDef(name = "Clinical factor"),
      HasOutcome = colDef(name = "Has Outcome?"),
      Patients = colDef(style = remove_text("Patients", nrow(dat) - 1), name = "Count", aggregate = "sum", align = "center"),
      PercentOfPatients = colDef(style = remove_text("Patients", nrow(dat) - 1), name = "Percent", aggregate = "sum", align = "center", format = colFormat(digits = 2, percent = TRUE)),
      OutcomeRate = colDef(style = remove_text("Patients", nrow(dat) - 1), name = "Outcome rate (%)", aggregate = zildge::rectbl_agg_wtd("Patients"), align = "center", format = colFormat(digits = 2, percent = TRUE)),
      Age = colDef(style = remove_text("Patients", nrow(dat) - 1), name = "Avg. age (years)", aggregate = zildge::rectbl_agg_wtd("Patients"), align = "center", format = colFormat(digits = 2)),
      Female = colDef(style = remove_text("Patients", nrow(dat) - 1), name = "Female (%)", aggregate = zildge::rectbl_agg_wtd("Patients"), align = "center", format = colFormat(digits = 2, percent = TRUE))
    ),
  columnGroups = 
    list(
      colGroup(
        name = "Patients",
        columns = c("Patients", "PercentOfPatients")
      )
    ),
  striped = TRUE,
  highlight = TRUE,
  bordered = TRUE,
  resizable = TRUE,
  theme = reactablefmtr::sandstone()
) |>
  reactablefmtr::add_source("Use arrows to expand table", font_size = 12, font_style = "italic")

Use arrows to expand table


And that’s it.

I’ll acknowledge that I didn’t exhaust all options for this task and that there are definitely better ways to do this with more directed intention by accessing various properties of the table. Obviously this only works if you have a column in the table that only increases during the aggregation and does so to predictable amounts. But it did the trick here.

Alex Zajichek
Alex Zajichek
Statistician & Data Scientist
comments powered by Disqus

Related