Reproducible Data Science: Building a Code Pipeline from End to End

Drawing on my experience as a 2018 Data Science for Social Good (DSSG) Fellow, I will give an overview of our team’s workflow and pipeline construction, highlighting the importance of reproducibility and of code that is easy to implement. The talk is divided into the sections of our pipeline: 1) Data processing and cleaning, 2) Data staging, 3) Machine learning modeling infrastructure, and 4) Usability. Each stage is discussed in the context of our DSSG project, in which we built a precision medicine tool to predict an individual’s risk of developing Type 2 Diabetes within the next 3 years.

Characterizing the Burden of HIV and Specific Vulnerabilities among Transgender Women compared to Men who have Sex with Men across Eight Sub-Saharan African Countries

Transgender women are at high risk of acquiring HIV; however, limited data have previously been collected to quantify their risk across Sub-Saharan Africa. We identified potential risk factors for HIV infection among trans women, and characterized differences in HIV risk between trans women and gay men and other men who have sex with men (MSM) in eight Sub-Saharan African countries (Burkina Faso, Côte d’Ivoire, The Gambia, Lesotho, Malawi, Senegal, Swaziland and Togo).
