Leading With Data Science

For two weeks in June and July 2021, Polygon Research delivered a data science class to more than 200 participants in Fannie Mae's Future Housing Leaders summer internship program. Students who completed the course requirements earned a certificate.
Understanding Housing Industry Data
Participants gained hands-on experience with current year-to-date secondary market data, as well as more than 2 billion rows of historical loan performance and mortgage transaction data, both single-family and multi-family, through live Snowflake data shares and interactive visualizations in HMDAVision. Students also had access to a class wiki with rich housing and mortgage industry backgrounders and strategy briefs.
Understanding Data Science
Students were introduced to different types of inference; how data science models relate to rules, causes, and effects; how the data science process can begin with either a focused question or with data; concepts relating to balancing use cases, models, and data; and how data wrangling and feature engineering fit into the data science pipeline. The module included a live demo of AWS SageMaker using NY Fed mortgage debt and delinquency data.
Understanding Machine Learning
Participants learned about generalization as the goal of machine learning; gained perspective on the difference between supervised and unsupervised ML; took a deep dive into classification and linear regression models; gained hands-on experience both with coding ML models in Python using statsmodels and scikit-learn and with the automated machine learning platforms BigSquid and Tangent Works; answered questions about prepayment, forbearance, and refinancing; and learned how to communicate the results of ML experiments.