Leveraging free, code-first tools to iterate toward advanced analytics
Alex Zajichek
Research Data Scientist, Cleveland Clinic
February 27, 2025
A Little Background
Who is QHS?
A department of 120+ biostatisticians, data scientists, programmers, and others who collaborate on and provide quantitative support for research at Cleveland Clinic
From clinical trials and study design to precision medicine, population health, AI in medicine, and more, across many disease areas
My area focuses on clinical prediction modeling and observational statistical analysis, primarily using EHR and/or registry data
# Load packages
library(tidyverse)
library(tidycensus)
library(mapgl)

# Import WI tracts
wi_tracts <-
  arcgislayers::arc_read(
    url = "https://tigerweb.geo.census.gov/arcgis/rest/services/Generalized_ACS2023/Tracts_Blocks/MapServer/4",
    where = "STATE = '55'"
  )

# Extract median income by tract
dat <-
  get_acs(
    geography = "tract",
    variables = "B19013_001", # Median income
    state = "WI",
    year = 2022,
    progress_bar = FALSE
  ) |>
  # Join to get boundaries
  inner_join(
    y = wi_tracts |> select(GEOID, geometry),
    by = "GEOID"
  ) |>
  # Make an information column
  mutate(
    Info = paste0(str_remove(NAME, ";.+$"), "<br>Median Income ($): ", round(estimate))
  ) |>
  # Convert to spatial data frame
  sf::st_as_sf()

# Make the map
maplibre() |>
  # Focus the mapping area
  fit_bounds(dat) |>
  # Fill with the data values
  add_fill_layer(
    id = "mc_acs",
    source = dat,
    fill_outline_color = "black",
    fill_color = interpolate(
      column = "estimate",
      values = range(dat$estimate, na.rm = TRUE),
      stops = c("#f2d37c", "#08519c"),
      na_color = "gray"
    ),
    fill_opacity = 0.50,
    popup = "Info"
  ) |>
  add_legend(
    legend_title = "Median income ($)",
    values = range(dat$estimate, na.rm = TRUE),
    colors = c("#f2d37c", "#08519c")
  )
Quarto for Reproducible Documents
Background
Quarto is an open-source technical publishing system
Vehicle for dissemination
Build custom analytical documents in a programmatic way, promoting automation and reproducibility
Integrates well with R, Python, and many other tools
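One way to picture the "programmatic" part: a single parameterized Quarto document can be rendered once per input value from an R script. The sketch below is illustrative only. The template file `report.qmd`, its `state` parameter, and the list of states are all hypothetical; the rendering step uses `quarto_render()` from the quarto R package and is skipped when the template or package is not available.

```r
# Hypothetical inputs: one report per state (assumed values for illustration)
states <- c("WI", "OH", "MN")

# Plan the render jobs: one output file per state
jobs <- data.frame(
  state = states,
  output_file = paste0("income-report-", states, ".html")
)

# Render one report by passing the state into the document's `params`
# ("report.qmd" is an assumed template that declares a `state` parameter)
render_one <- function(state, output_file, input = "report.qmd") {
  quarto::quarto_render(
    input = input,
    output_file = output_file,
    execute_params = list(state = state)
  )
}

# Only attempt rendering when the template and the quarto package are present
if (file.exists("report.qmd") && requireNamespace("quarto", quietly = TRUE)) {
  invisible(Map(render_one, jobs$state, jobs$output_file))
}
```

Keeping the job list as a plain data frame means the same script can be rerun unchanged whenever the underlying data refreshes, which is where the automation and reproducibility benefits come from.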
---
title: "The Value of Open-Source"
subtitle: "Leveraging free, code-first tools to iterate toward advanced analytics"
institute: "Research Data Scientist, Cleveland Clinic"
author: "Alex Zajichek"
date: "2025-02-27"
date-format: long
format:
  revealjs:
    theme: [serif, custom.scss]
    footer: "<em>AI Innovations at Work Conference 2025</em>"
    slide-number: true
    incremental: true
---
Markdown (body)
### Background
- [Quarto](https://quarto.org/) is an open-source technical publishing system
- Build custom analytical documents in a programmatic way
- Vehicle for dissemination, promoting automation and reproducibility
Shiny for Web Applications
Background
Shiny is an R package for building custom, interactive web applications
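As a rough illustration of what "custom, interactive" means in practice, here is a minimal sketch of a Shiny app. It is not taken from the talk: the choice of R's built-in `faithful` dataset, the input id `bins`, and the output id `hist` are all assumptions made for the example.

```r
library(shiny)

# User interface: a slider input and a plot output
ui <- fluidPage(
  titlePanel("Old Faithful eruptions"),
  sliderInput("bins", "Number of bins:", min = 5, max = 50, value = 20),
  plotOutput("hist")
)

# Server logic: the plot re-renders whenever the slider value changes
server <- function(input, output, session) {
  output$hist <- renderPlot({
    hist(
      faithful$eruptions,
      breaks = input$bins, # hist() treats a single number as a suggestion
      main = "Eruption duration",
      xlab = "Minutes"
    )
  })
}

# Bundle the UI and server into an app object
app <- shinyApp(ui = ui, server = server)

# Launch only in an interactive session
if (interactive()) runApp(app)
```

The UI declares what the user sees, while the server function describes how outputs depend on inputs; Shiny's reactivity handles the updates in between.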