Causal Machine Learning

Machine learning can furnish tools to investigate inherently causal problems (here is an excellent introduction to what a "causal problem" is) at a finer level of resolution.

In my research, I have developed new causal machine learning techniques to deal with heterogeneous effects. In particular, I investigated novel, transparent, data-driven ways to detect the units (i.e., human beings, enterprises, institutions, and so on) that are most or least affected by public or private interventions.
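To make the idea of a heterogeneous effect concrete, here is a minimal toy sketch. It is not one of the algorithms described on this page: it simulates a randomized experiment under an assumed, purely hypothetical data-generating process (the function names, the binary covariate `x`, and the effect sizes 2.0 and 0.5 are all illustrative) and compares the overall difference in means with the difference in means inside each subgroup.

```python
import random
from statistics import mean


def simulate(n=20000, seed=0):
    """Simulate a randomized experiment with one binary covariate x.

    Hypothetical data-generating process: the treatment effect is 2.0
    for units with x == 1 and 0.5 for units with x == 0.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.randint(0, 1)  # covariate that defines the subgroup
        w = rng.randint(0, 1)  # randomized treatment assignment
        effect = 2.0 if x == 1 else 0.5
        y = 1.0 + effect * w + rng.gauss(0, 1)
        rows.append((x, w, y))
    return rows


def diff_in_means(rows):
    """Mean outcome of treated units minus mean outcome of controls."""
    treated = [y for _, w, y in rows if w == 1]
    control = [y for _, w, y in rows if w == 0]
    return mean(treated) - mean(control)


def subgroup_effects(rows):
    """Difference in means computed separately for x == 0 and x == 1."""
    return {x: diff_in_means([r for r in rows if r[0] == x]) for x in (0, 1)}


rows = simulate()
ate = diff_in_means(rows)      # average effect over all units
cate = subgroup_effects(rows)  # effect within each subgroup
print(f"overall: {ate:.2f}, x=0: {cate[0]:.2f}, x=1: {cate[1]:.2f}")
```

In this toy setting the relevant subgroup is known in advance. The harder problem, and the one data-driven methods target, is discovering which subgroups matter when they are not specified beforehand.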


The techniques that my co-authors and I developed deal with:

  1. heterogeneous effects detection and estimation in quasi-experimental settings;

  2. drawing interpretable inference on heterogeneous effects in observational studies;

  3. heterogeneous effects detection and estimation in the presence of clustered network interference.

1. Experimental data are the gold standard for inferring causal effects but are very hard to find in real-world applications. Hence, social and health scientists often need to be creative and find empirical settings that "look like" experiments but, in fact, are not. In these quasi-experimental scenarios, data-driven detection of heterogeneous effects is often disregarded. Together with Prof. Giorgio Gnecco (IMT School for Advanced Studies) and Prof. Kristof De Witte (KU Leuven), we addressed this shortcoming by introducing new algorithms that perform a data-driven search for, and estimation of, heterogeneous effects in these settings (see this paper and this preprint).

2. Machine learning algorithms for causal inference are growing in importance but often share a widely known shortcoming of machine learning techniques: they lack interpretability. In a recent work with Prof. Francesca Dominici (Harvard University) and Prof. Kwonsang Lee (Sungkyunkwan University), we developed a new algorithm for interpretable detection and estimation of heterogeneous effects. We applied the proposed methodology to identify the individuals who are most vulnerable to the effects of exposure to higher levels of air pollution (see here for Prof. Dominici's groundbreaking work in this field).

3. What happens if the units that I randomly assign to a treatment are interconnected? Then the treatment can spill from one unit to another and produce so-called spillover effects. This is often the case in social science studies, as human beings are connected by social, virtual and financial ties. In these scenarios, the usual causal machine learning techniques may fail to estimate the heterogeneous treatment effects correctly, and they do not account for heterogeneous spillover effects. For these reasons, together with Prof. Laura Forastiere (Yale University) and my colleague Costanza Tortù (IMT School for Advanced Studies), we worked on a new algorithm for interpretable discovery and estimation of heterogeneous treatment and spillover effects in the presence of clustered network interference. We applied this new methodology to evaluate the effects of intensive information training on weather insurance take-up in rural areas of China. This work has a wider set of possible applications in the field of randomized experiments in social networks.
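The difference between a treatment effect and a spillover effect can be illustrated with a small simulation. This is only a sketch under assumed conditions, not the algorithm described above: clusters are pairs of units, each unit's only neighbor is the other unit in its pair, and the effect sizes (1.0 for own treatment, 0.5 for a treated neighbor) are hypothetical values chosen for illustration.

```python
import random
from statistics import mean


def simulate_pairs(n_pairs=20000, seed=1):
    """Simulate clusters of two units with independent random assignment.

    Hypothetical outcome model: y = 1 + 1.0 * w + 0.5 * g + noise, where
    w is the unit's own treatment and g is its neighbor's treatment.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_pairs):
        w_a, w_b = rng.randint(0, 1), rng.randint(0, 1)
        for w, g in ((w_a, w_b), (w_b, w_a)):
            y = 1.0 + 1.0 * w + 0.5 * g + rng.gauss(0, 1)
            rows.append((w, g, y))
    return rows


def cell_mean(rows, w, g):
    """Mean outcome of units with own treatment w and neighbor treatment g."""
    return mean(y for wi, gi, y in rows if wi == w and gi == g)


rows = simulate_pairs()
baseline = cell_mean(rows, 0, 0)
# Treatment effect: own treatment switched on, neighbor untreated.
treatment_effect = cell_mean(rows, 1, 0) - baseline
# Spillover effect: unit untreated, neighbor's treatment switched on.
spillover_effect = cell_mean(rows, 0, 1) - baseline
print(f"treatment: {treatment_effect:.2f}, spillover: {spillover_effect:.2f}")
```

Comparing units by their joint exposure (own treatment and neighbor's treatment) is what separates the two effects here. Real networks have larger clusters and effects that vary across units, which is where dedicated methods become necessary.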


Falco J. Bargagli Stoffi



677 Huntington Ave, Boston, MA 02115, United States




© 2019 By Falco J. Bargagli Stoffi