SOFTWARE | falcobargaglistoffi

Open Source Software & Code for Causal Machine Learning

1. CRE: Interpretable Subgroups Identification Through Ensemble Learning of Causal Rules, R Package [web page] [CRAN] [github] [paper]

Causal subgroup identification is a powerful statistical tool for determining vulnerabilities in a population with respect to a particular treatment. Because causal machine learning offers an efficient and accurate route to data-driven subgroup identification, software packages have become an essential means of disseminating and reproducing these algorithms. The CRE package, written in R and available on GitHub, implements the recently developed Causal Rule Ensemble (CRE) algorithm, a flexible and precise method for de novo subgroup discovery. The CRE approach focuses on identifying the drivers of treatment effect heterogeneity in observational or randomized studies of an intervention or treatment.

2. NetworkCausalTree: An R Package for Heterogeneous Spillover Effects. R Package [github]

The NetworkCausalTree package introduces a machine learning method that uses tree-based algorithms and a Horvitz-Thompson estimator to assess the heterogeneity of treatment and spillover effects under clustered network interference. Causal inference studies typically assume no interference between individuals, but in real-world settings where individuals are connected through social, physical, or virtual ties, the effect of a treatment can spill over to other connected individuals in the network. Ignoring this interference can bias estimates of treatment effects. Understanding the heterogeneity of treatment and spillover effects can help policy-makers scale up interventions, target strategies more effectively, and generalize treatment and spillover effects to other populations.
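To make the role of the Horvitz-Thompson estimator concrete, here is a minimal Python sketch. It is not the NetworkCausalTree API (the package is written in R). The sketch assumes, purely for illustration, that treatment is assigned independently with a known probability `p_treat`, and that a unit's network exposure `g` is 1 when at least one of its neighbors is treated. The function name `ht_mean` and all argument names are hypothetical.

```python
import numpy as np

def ht_mean(y, w, g, degree, p_treat, w_star, g_star):
    """Horvitz-Thompson estimate of the mean outcome under exposure (w_star, g_star).

    Assumes (for illustration only) independent Bernoulli(p_treat) assignment,
    with g = 1 if at least one neighbor is treated.
    """
    y = np.asarray(y, dtype=float)
    w = np.asarray(w)
    g = np.asarray(g)
    degree = np.asarray(degree, dtype=float)

    # Probability of the unit's own treatment status.
    p_w = p_treat if w_star == 1 else 1.0 - p_treat
    # Probability that at least one of `degree` neighbors is treated.
    p_g1 = 1.0 - (1.0 - p_treat) ** degree
    p_g = p_g1 if g_star == 1 else 1.0 - p_g1
    pi = p_w * p_g  # joint exposure probability, one value per unit

    # Only units observed under the target exposure contribute,
    # each weighted by the inverse of its exposure probability.
    observed = (w == w_star) & (g == g_star)
    weights = np.zeros_like(y)
    weights[observed] = 1.0 / pi[observed]
    return float(np.sum(weights * y) / len(y))

# Toy data: four units, each with one neighbor, p_treat = 0.5.
y = [1.0, 2.0, 3.0, 4.0]
w = [1, 0, 1, 0]
g = [1, 1, 0, 0]
degree = [1, 1, 1, 1]

# Treatment effect among units with no treated neighbor.
treatment_effect = ht_mean(y, w, g, degree, 0.5, 1, 0) - ht_mean(y, w, g, degree, 0.5, 0, 0)
# Spillover effect among untreated units.
spillover_effect = ht_mean(y, w, g, degree, 0.5, 0, 1) - ht_mean(y, w, g, degree, 0.5, 0, 0)
```

Applying the same contrasts within the leaves of a tree, rather than over the whole sample, is one way to read "heterogeneity of treatment and spillover effects"; the package's actual splitting criterion and variance estimation are not reproduced here.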

3. BCF-IV: Bayesian Machine Learning for Heterogeneous Effects Discovery Under Imperfect Compliance, R Package [github]

The BCF-IV function discovers and estimates, in an interpretable manner, effect heterogeneity in settings where the assignment mechanism is irregular (e.g., instrumental variable and fuzzy regression discontinuity scenarios). It is built specifically to discover and estimate heterogeneity in the Complier Average Causal Effect (CACE). The BCF-ITT function discovers heterogeneity in the intention-to-treat (ITT) effect and then estimates both the conditional ITT and the conditional CACE for the discovered subgroups.
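The link between the ITT and the CACE can be illustrated with the standard Wald (ratio) estimator: under the usual instrumental variable assumptions, the CACE equals the ITT effect on the outcome divided by the effect of the instrument on treatment uptake. The Python sketch below is an illustration only, not the BCF-IV code (which is in R and uses its own estimation procedure). The function name `wald_cace` and the optional `subgroup` mask are hypothetical.

```python
import numpy as np

def wald_cace(y, d, z, subgroup=None):
    """Return (itt, cace) for binary instrument z, treatment received d, outcome y.

    `subgroup` is an optional boolean mask selecting the units to use,
    e.g. the members of one discovered subgroup.
    """
    y = np.asarray(y, dtype=float)
    d = np.asarray(d, dtype=float)
    z = np.asarray(z)
    if subgroup is not None:
        keep = np.asarray(subgroup, dtype=bool)
        y, d, z = y[keep], d[keep], z[keep]

    assigned = z == 1
    # Intention-to-treat effect on the outcome.
    itt = y[assigned].mean() - y[~assigned].mean()
    # Effect of assignment on treatment uptake (share of compliers
    # under the standard monotonicity assumption).
    compliance = d[assigned].mean() - d[~assigned].mean()
    if compliance == 0:
        raise ValueError("instrument has no effect on treatment uptake")
    return float(itt), float(itt / compliance)

# Toy data: 3 of the 4 assigned units take the treatment, no one else does.
z = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 1, 0, 0, 0, 0, 0]
y = [3.0, 3.0, 3.0, 1.0, 1.0, 1.0, 1.0, 1.0]
itt, cace = wald_cace(y, d, z)
```

With imperfect compliance the CACE is larger in magnitude than the ITT, because the ITT averages the effect over units whose treatment status the assignment did not change.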

Open Source Air Quality Disparities Mapper

4. Air Quality Disparities Mapper: An Open-Source Web Application for Environmental Justice, ArcGIS Javascript API [website] [github] [paper]

Ambient exposure to PM2.5 is a major health burden and is linked to increased mortality and morbidity. Exposure to harmful air pollution varies sharply across socioeconomic groups and is a burden often borne disproportionately by minority and low-income communities. We developed an interactive, web-based air pollution components mapper that is easily accessible to members of the public. The mapper allows users to: (i) assess disparities in exposure to PM2.5 components (e.g., elemental carbon, ammonium, nitrate, organic carbon, sulfate) and (ii) communicate such disparities in a public-facing medium that is widely digestible to those outside the field. The components mapper combines high-resolution predictions of PM2.5 components from an ensemble machine learning model with demographic data from the U.S. Decennial Census.