Using Statistics and Data Science for Public Health and Social Good


Traditionally, many public health studies have started as research questions, which are then answered by identifying and analyzing an appropriate dataset. In some cases, however, the existence of data precedes (and therefore drives) the scientific question, and with the emergence of big data and the growth of data science as a field, this is becoming increasingly common. I will give an overview of three different types of data I have encountered in projects as a biostatistician, where aspects of each data source inspired the respective project. These aspects range from the size and quality of the data to the populations in which the data were collected. With these data, we have been able to learn about HIV risk among transgender women in sub-Saharan Africa, predict type 2 diabetes risk among underserved populations seeking care at federally qualified community health centers in the US, and develop methods to correct for measurement error in self-reported dietary outcomes collected in nutrition studies. In presenting findings from these projects, I hope to highlight the broad range of scientific questions and the high level of public health impact that can come from working as a biostatistician or data scientist in health research.

Johnson Center, 327, Meeting Room C, Fairfax Campus