Using Machine Learning to Estimate the Effect of Racial Segregation on COVID-19 Mortality

Working Paper, 2020

Abstract: The novel coronavirus disease 2019 (COVID-19) has revealed large racial and ethnic disparities in mortality and infection across the US. This study examines the role that racial residential segregation has played in raising deaths and infections in the overall population and in shaping racial and ethnic mortality gaps. To account for other factors that may explain COVID-19 mortality and infection, I assemble a data set of 50 county-level factors that measure demographics, density and potential for public interaction, social capital, health risk factors, capacity of the health care system, air pollution, employment in essential businesses, and political views. I use machine learning methods to guide the selection of the most important controls. Results show that more segregated counties had higher mortality and infection rates overall and higher mortality among Black and Hispanic residents relative to white residents.
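
The abstract does not name the machine learning method used to select controls. As one possible illustration, the sketch below uses post-double-selection lasso, which is an assumption here and not necessarily the approach taken in the paper: a lasso of the outcome on the controls and a lasso of the treatment on the controls each pick a subset, and the effect is then estimated by OLS of the outcome on the treatment plus the union of the two subsets. All data are simulated, and the names `segregation`, `mortality`, `controls`, and the penalty `lam=0.2` are hypothetical choices made for the example.

```python
import random

def standardize(col):
    """Center a column and scale it to unit (population) variance."""
    n = len(col)
    mean = sum(col) / n
    sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
    return [(v - mean) / sd for v in col]

def soft_threshold(z, lam):
    """Lasso shrinkage operator: move z toward zero by lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_select(cols, target, lam, n_iter=50):
    """Coordinate-descent lasso on standardized columns.

    Minimizes (1/2n)*||target - Xb||^2 + lam*||b||_1 and returns the
    indices of the columns whose coefficients are nonzero.
    """
    n = len(target)
    mean = sum(target) / n
    resid = [v - mean for v in target]
    beta = [0.0] * len(cols)
    for _ in range(n_iter):
        for j, xj in enumerate(cols):
            # Correlation of column j with the partial residual.
            rho = sum(x * r for x, r in zip(xj, resid)) / n + beta[j]
            new = soft_threshold(rho, lam)
            if new != beta[j]:
                step = new - beta[j]
                resid = [r - step * x for r, x in zip(resid, xj)]
                beta[j] = new
    return {j for j, b in enumerate(beta) if b != 0.0}

def ols(columns, y):
    """OLS with intercept via Gauss-Jordan on the normal equations."""
    design = [[1.0] * len(y)] + columns
    k = len(design)
    a = [[sum(p * q for p, q in zip(design[i], design[j])) for j in range(k)]
         + [sum(p * v for p, v in zip(design[i], y))] for i in range(k)]
    for i in range(k):
        pivot = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(k):
            if r != i:
                f = a[r][i] / a[i][i]
                a[r] = [u - f * v for u, v in zip(a[r], a[i])]
    return [a[i][k] / a[i][i] for i in range(k)]

def double_selection_effect(controls, treatment, outcome, lam):
    """Return (treatment coefficient, selected control indices)."""
    keep = (lasso_select(controls, outcome, lam)
            | lasso_select(controls, treatment, lam))
    columns = [treatment] + [controls[j] for j in sorted(keep)]
    coefs = ols(columns, outcome)  # coefs[0] is the intercept
    return coefs[1], keep

# Simulated "counties": 50 controls, of which only three matter.
rng = random.Random(0)
n, p = 400, 50
controls = [standardize([rng.gauss(0, 1) for _ in range(n)]) for _ in range(p)]
segregation = [0.8 * controls[0][i] + 0.5 * controls[1][i] + rng.gauss(0, 1)
               for i in range(n)]
# The true effect of segregation on the simulated outcome is 0.5.
mortality = [0.5 * segregation[i] + controls[0][i] + 0.7 * controls[2][i]
             + rng.gauss(0, 1) for i in range(n)]

effect, selected = double_selection_effect(controls, segregation, mortality, lam=0.2)
print("estimated effect:", round(effect, 3))
print("selected controls:", sorted(selected))
```

Taking the union of the two selected sets is the point of the design: a control that predicts segregation strongly but mortality only weakly would be dropped by a single outcome lasso, and omitting it could bias the estimated effect. In applied work the penalty would normally be chosen by cross-validation or a plug-in rule, not fixed by hand as it is here.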

Draft available upon request.