Removing higher-order aggregation text in {reactable}

Tables
Javascript
Author

Alex Zajichek

Published

June 29, 2023

Often when I’m building a reactable I have potentially many aggregation levels but only want to display data in a lower subset of those. Here is a quick way to eliminate text using cell styling.

First let’s load some packages and look at an example data set.

Code
library(tidyverse)
library(reactable)
dat <- cheese::heart_disease
dat
# A tibble: 303 × 9
     Age Sex    ChestPain           BP Cholesterol BloodSugar MaximumHR
   <dbl> <fct>  <fct>            <dbl>       <dbl> <lgl>          <dbl>
 1    63 Male   Typical angina     145         233 TRUE             150
 2    67 Male   Asymptomatic       160         286 FALSE            108
 3    67 Male   Asymptomatic       120         229 FALSE            129
 4    37 Male   Non-anginal pain   130         250 FALSE            187
 5    41 Female Atypical angina    130         204 FALSE            172
 6    56 Male   Atypical angina    120         236 FALSE            178
 7    62 Female Asymptomatic       140         268 FALSE            160
 8    57 Female Asymptomatic       120         354 FALSE            163
 9    63 Male   Asymptomatic       130         254 FALSE            147
10    53 Male   Asymptomatic       140         203 TRUE             155
# ℹ 293 more rows
# ℹ 2 more variables: ExerciseInducedAngina <fct>, HeartDisease <fct>

Suppose we are interested in these outcomes:

  1. The rate of heart disease
  2. The rate of patients with a maximum heart rate > 150

And we want to summarize those, along with a few other patient characteristics, within a handful of important clinical factors. We might pivot our data into a long format to start:

Code
pivot_dat <-
  dat |> 
  
  # Convert to character types
  mutate(
    across(
      c(ChestPain, ExerciseInducedAngina, BloodSugar),
      as.character
    ),
    MaximumHR =
      case_when(
        MaximumHR > 150 ~ "Yes",
        TRUE ~ "No"
      )
  ) |>
  
  # Send "important clinical factors" down the rows
  pivot_longer(
    cols = c(ChestPain, ExerciseInducedAngina, BloodSugar),
    names_to = "ClinicalFactor",
    values_to = "Level"
  ) |> 
  
  # Send outcomes down the rows
  pivot_longer(
    cols = c(HeartDisease, MaximumHR),
    names_to = "Outcome",
    values_to = "HasOutcome"
  )
pivot_dat
# A tibble: 1,818 × 8
     Age Sex      BP Cholesterol ClinicalFactor        Level  Outcome HasOutcome
   <dbl> <fct> <dbl>       <dbl> <chr>                 <chr>  <chr>   <chr>     
 1    63 Male    145         233 ChestPain             Typic… HeartD… No        
 2    63 Male    145         233 ChestPain             Typic… Maximu… No        
 3    63 Male    145         233 ExerciseInducedAngina No     HeartD… No        
 4    63 Male    145         233 ExerciseInducedAngina No     Maximu… No        
 5    63 Male    145         233 BloodSugar            TRUE   HeartD… No        
 6    63 Male    145         233 BloodSugar            TRUE   Maximu… No        
 7    67 Male    160         286 ChestPain             Asymp… HeartD… Yes       
 8    67 Male    160         286 ChestPain             Asymp… Maximu… No        
 9    67 Male    160         286 ExerciseInducedAngina Yes    HeartD… Yes       
10    67 Male    160         286 ExerciseInducedAngina Yes    Maximu… No        
# ℹ 1,808 more rows

We can then calculate some summary statistics. Our full patient population is replicated for each ClinicalFactor and Outcome, so we should account for that.

Code
summary_dat <-
  pivot_dat |> 
  
  # Summarize characteristics
  summarize(
    Patients = n(),
    OutcomeRate = mean(HasOutcome == "Yes"),
    Age = mean(Age),
    Female = mean(Sex == "Female"),
    .by = 
      c(
        ClinicalFactor,
        Level,
        Outcome,
        HasOutcome
      )
  ) |> 
  
  # Add counts within groups
  mutate(
    PercentOfPatients = Patients / sum(Patients),
    .by = c(ClinicalFactor, Outcome)
  ) |> 
  
  # Make clean levels/labels
  mutate(
    ClinicalFactor = 
      case_when(
        ClinicalFactor == "BloodSugar" ~ "Blood sugar > 120 mg/dl?",
        ClinicalFactor == "ChestPain" ~ "Chest pain type",
        ClinicalFactor == "ExerciseInducedAngina" ~ "Exercise-induced angina?"
      ),
    Level = 
      case_when(
        Level == "TRUE" ~ "Yes",
        Level == "FALSE" ~ "No",
        TRUE ~ Level
      ),
    Outcome = 
      case_when(
        Outcome == "HeartDisease" ~ "Heart disease",
        Outcome == "MaximumHR" ~ "Maximum HR > 150"
      )
  ) |>
  
  # Rearrange
  relocate(PercentOfPatients, .after = Patients) |>
  relocate(Outcome, .before = everything()) |>
  arrange(
    Outcome,
    ClinicalFactor,
    Level,
    HasOutcome
  )
summary_dat
# A tibble: 32 × 9
   Outcome       ClinicalFactor      Level HasOutcome Patients PercentOfPatients
   <chr>         <chr>               <chr> <chr>         <int>             <dbl>
 1 Heart disease Blood sugar > 120 … No    No              141            0.465 
 2 Heart disease Blood sugar > 120 … No    Yes             117            0.386 
 3 Heart disease Blood sugar > 120 … Yes   No               23            0.0759
 4 Heart disease Blood sugar > 120 … Yes   Yes              22            0.0726
 5 Heart disease Chest pain type     Asym… No               39            0.129 
 6 Heart disease Chest pain type     Asym… Yes             105            0.347 
 7 Heart disease Chest pain type     Atyp… No               41            0.135 
 8 Heart disease Chest pain type     Atyp… Yes               9            0.0297
 9 Heart disease Chest pain type     Non-… No               68            0.224 
10 Heart disease Chest pain type     Non-… Yes              18            0.0594
# ℹ 22 more rows
# ℹ 3 more variables: OutcomeRate <dbl>, Age <dbl>, Female <dbl>

Now we’re ready to build the table (note the use of the zildge::rectbl_agg_wtd() function to compute the weighted average, see my other post for more information about this).

Code
reactable(
  data = summary_dat,
  groupBy = c("Outcome", "ClinicalFactor", "Level"),
  columns = 
    list(
      ClinicalFactor = colDef(name = "Clinical factor"),
      HasOutcome = colDef(name = "Has Outcome?"),
      Patients = colDef(name = "Count", aggregate = "sum", align = "center"),
      PercentOfPatients = colDef(name = "Percent", aggregate = "sum", align = "center", format = colFormat(digits = 2, percent = TRUE)),
      OutcomeRate = colDef(name = "Outcome rate (%)", aggregate = zildge::rectbl_agg_wtd("Patients"), align = "center", format = colFormat(digits = 2, percent = TRUE)),
      Age = colDef(name = "Avg. age (years)", aggregate = zildge::rectbl_agg_wtd("Patients"), align = "center", format = colFormat(digits = 2)),
      Female = colDef(name = "Female (%)", aggregate = zildge::rectbl_agg_wtd("Patients"), align = "center", format = colFormat(digits = 2, percent = TRUE))
    ),
  columnGroups = 
    list(
      colGroup(
        name = "Patients",
        columns = c("Patients", "PercentOfPatients")
      )
    ),
  striped = TRUE,
  highlight = TRUE,
  bordered = TRUE,
  resizable = TRUE,
  theme = reactablefmtr::sandstone()
) |>
  reactablefmtr::add_source("Use arrows to expand table", font_size = 12, font_style = "italic")

Use arrows to expand table


There are a few problems with this table:

  1. The redundancy of the top-level (and second-level) summary statistics is unappealing
  2. The top-level aggregation is just flat-out wrong because we’ve duplicated counts over the data set

Maybe 1 is not a huge deal, but I’ve probably already established somewhere that there are 303 patients in the data set and the overall rate of heart disease is 45.9%. You still may choose to keep it as-is in this case as it’s not totally harmful. However, 2 needs to be addressed as the data is misleading and not really interpretable. So how can we remove the results from (at least) the top-level while maintaining the lower-level aggregation?

We can define an R function that provides JavaScript custom styling to a cell.

Code
remove_text <-
  function(col, threshold) {
    JS(
      paste0(
        "function(rowInfo) {
          if(rowInfo.row['", col, "'] > ", threshold, ") {
            return {fontSize:0}
          }
        }"
      )
    )
  }

All this function does is reduce the font size to zero for any cell where the value of col in the respective row is greater than threshold. Thus, since we are over-counting the number of patients in the top-level, we can just apply this function to rows in which the count exceeds 303, which is the patient count in our data set. We provide this function to the style argument of colDef().

Code
reactable(
  data = summary_dat,
  groupBy = c("Outcome", "ClinicalFactor", "Level"),
  columns = 
    list(
      ClinicalFactor = colDef(name = "Clinical factor"),
      HasOutcome = colDef(name = "Has Outcome?"),
      Patients = colDef(style = remove_text("Patients", nrow(dat)), name = "Count", aggregate = "sum", align = "center"),
      PercentOfPatients = colDef(style = remove_text("Patients", nrow(dat)), name = "Percent", aggregate = "sum", align = "center", format = colFormat(digits = 2, percent = TRUE)),
      OutcomeRate = colDef(style = remove_text("Patients", nrow(dat)), name = "Outcome rate (%)", aggregate = zildge::rectbl_agg_wtd("Patients"), align = "center", format = colFormat(digits = 2, percent = TRUE)),
      Age = colDef(style = remove_text("Patients", nrow(dat)), name = "Avg. age (years)", aggregate = zildge::rectbl_agg_wtd("Patients"), align = "center", format = colFormat(digits = 2)),
      Female = colDef(style = remove_text("Patients", nrow(dat)), name = "Female (%)", aggregate = zildge::rectbl_agg_wtd("Patients"), align = "center", format = colFormat(digits = 2, percent = TRUE))
    ),
  columnGroups = 
    list(
      colGroup(
        name = "Patients",
        columns = c("Patients", "PercentOfPatients")
      )
    ),
  striped = TRUE,
  highlight = TRUE,
  bordered = TRUE,
  resizable = TRUE,
  theme = reactablefmtr::sandstone()
) |>
  reactablefmtr::add_source("Use arrows to expand table", font_size = 12, font_style = "italic")

Use arrows to expand table


If we want to remove text from the second level of aggregation, we can just adjust the threshold.

Code
reactable(
  data = summary_dat,
  groupBy = c("Outcome", "ClinicalFactor", "Level"),
  columns = 
    list(
      ClinicalFactor = colDef(name = "Clinical factor"),
      HasOutcome = colDef(name = "Has Outcome?"),
      Patients = colDef(style = remove_text("Patients", nrow(dat) - 1), name = "Count", aggregate = "sum", align = "center"),
      PercentOfPatients = colDef(style = remove_text("Patients", nrow(dat) - 1), name = "Percent", aggregate = "sum", align = "center", format = colFormat(digits = 2, percent = TRUE)),
      OutcomeRate = colDef(style = remove_text("Patients", nrow(dat) - 1), name = "Outcome rate (%)", aggregate = zildge::rectbl_agg_wtd("Patients"), align = "center", format = colFormat(digits = 2, percent = TRUE)),
      Age = colDef(style = remove_text("Patients", nrow(dat) - 1), name = "Avg. age (years)", aggregate = zildge::rectbl_agg_wtd("Patients"), align = "center", format = colFormat(digits = 2)),
      Female = colDef(style = remove_text("Patients", nrow(dat) - 1), name = "Female (%)", aggregate = zildge::rectbl_agg_wtd("Patients"), align = "center", format = colFormat(digits = 2, percent = TRUE))
    ),
  columnGroups = 
    list(
      colGroup(
        name = "Patients",
        columns = c("Patients", "PercentOfPatients")
      )
    ),
  striped = TRUE,
  highlight = TRUE,
  bordered = TRUE,
  resizable = TRUE,
  theme = reactablefmtr::sandstone()
) |>
  reactablefmtr::add_source("Use arrows to expand table", font_size = 12, font_style = "italic")

Use arrows to expand table


And that’s it.

I’ll acknowledge that I didn’t exhaust all options for this task and that there are definitely better ways to do this with more directed intention by accessing various properties of the table. Obviously this only works if you have a column in the table that only increases during the aggregation and does so to predictable amounts. But it did the trick here.