Open Source R Software & Code
Causal subgroup identification is a statistical tool for determining which parts of a population are vulnerable with respect to a particular treatment. Because causal machine learning offers an efficient and accurate means of data-driven subgroup identification, software packages are essential for disseminating and reproducing these algorithms. The CRE package, written in R and available on GitHub, implements the recently developed Causal Rule Ensemble (CRE) algorithm, a flexible and precise method for de novo subgroup discovery. The CRE approach identifies drivers of treatment effect heterogeneity in observational or randomized studies of an intervention or treatment.
2. BCF-IV: Bayesian Machine Learning for Heterogeneous Effects Discovery Under Imperfect Compliance, R function [github] [R package coming soon]
The BCF-IV function discovers and estimates, in an interpretable manner, effect heterogeneity in settings where the assignment mechanism is irregular (e.g., instrumental variable and fuzzy regression discontinuity scenarios). It is built specifically to discover and estimate heterogeneity in the Complier Average Causal Effect (CACE). The BCF-ITT function discovers heterogeneity in the intention-to-treat (ITT) effect and then estimates both the conditional ITT and the conditional CACE for the discovered subgroups.
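To make the relationship between the ITT and the CACE concrete, the following is a minimal Python sketch of the classical Wald (ratio) estimator applied within a single subgroup. It is not the BCF-IV or BCF-ITT implementation, which relies on Bayesian causal forests. Under the standard instrumental variable assumptions, the CACE equals the ITT effect on the outcome divided by the effect of assignment on treatment uptake. The function name `itt_and_cace` and the toy records are hypothetical and used only for illustration.

```python
from statistics import mean


def itt_and_cace(records):
    """Return (ITT, CACE) for one subgroup.

    records: list of (z, w, y) tuples, where
      z = treatment assignment (the instrument), 0 or 1
      w = treatment actually received, 0 or 1
      y = observed outcome
    The CACE is computed with the Wald ratio: ITT on y / ITT on w.
    """
    y_assigned = [y for z, w, y in records if z == 1]
    y_control = [y for z, w, y in records if z == 0]
    w_assigned = [w for z, w, y in records if z == 1]
    w_control = [w for z, w, y in records if z == 0]

    # Intention-to-treat effect on the outcome
    itt = mean(y_assigned) - mean(y_control)
    # Effect of assignment on uptake (share of compliers under monotonicity)
    compliance = mean(w_assigned) - mean(w_control)
    if compliance == 0:
        raise ValueError("assignment does not change uptake; CACE is undefined")
    return itt, itt / compliance


# Hypothetical subgroup: half of the assigned units take the treatment,
# and no control unit does.
subgroup = [
    (1, 1, 3.0), (1, 1, 3.0), (1, 0, 1.0), (1, 0, 1.0),
    (0, 0, 1.0), (0, 0, 1.0), (0, 0, 1.0), (0, 0, 1.0),
]
itt, cace = itt_and_cace(subgroup)
print(itt, cace)
```

In this toy subgroup the ITT is diluted by non-compliance, so the CACE is larger than the ITT. Applying the same calculation separately to each discovered subgroup gives conditional ITT and conditional CACE estimates.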
Open Source Air Quality Disparities Mapper
Ambient exposure to PM2.5 is a major health burden and is linked to increased mortality and morbidity. Exposure to harmful air pollution varies sharply across socioeconomic groups, and the burden is often borne disproportionately by minority and low-income communities. We developed an interactive, web-based air pollution components mapper that is easily accessible to members of the public. The mapper allows users to: (i) assess disparities in exposure to PM2.5 components (e.g., elemental carbon, ammonium, nitrate, organic carbon, sulfate) and (ii) communicate such disparities in a public-facing medium that is widely digestible to those outside the field. The components mapper combines high-resolution predictions of PM2.5 components from an ensemble machine learning model with demographic data from the U.S. Decennial Census.
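One common way to summarize exposure disparities from this kind of combined data is a population-weighted mean concentration for each demographic group. The Python sketch below illustrates that calculation. It is an assumption about how such a summary could be computed, not a description of the mapper's internals. The function name `population_weighted_exposure`, the area records, the group labels, and all numbers are hypothetical.

```python
def population_weighted_exposure(areas, group):
    """Population-weighted mean concentration experienced by one group.

    areas: list of dicts with
      "concentration": predicted component concentration in the area
      "population": dict mapping group label -> resident count
    """
    total_population = sum(area["population"][group] for area in areas)
    if total_population == 0:
        raise ValueError(f"no residents recorded for group {group!r}")
    weighted_sum = sum(
        area["concentration"] * area["population"][group] for area in areas
    )
    return weighted_sum / total_population


# Hypothetical data: two areas with predicted concentrations of one
# PM2.5 component and census counts for two illustrative groups.
areas = [
    {"concentration": 1.0, "population": {"group_a": 100, "group_b": 300}},
    {"concentration": 0.5, "population": {"group_a": 300, "group_b": 100}},
]
exposure_a = population_weighted_exposure(areas, "group_a")
exposure_b = population_weighted_exposure(areas, "group_b")
disparity = exposure_b - exposure_a  # absolute gap between the two groups
print(exposure_a, exposure_b, disparity)
```

Weighting by group population rather than averaging over areas means each summary reflects where members of that group actually live, so a gap between groups can appear even when both live in the same set of areas.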