Leading with Data Science

A series of four data science workshops built around housing and real estate finance data. Students who completed all the requirements received a certificate.
Participants learned about generalization as the goal of machine learning; gained perspective on the difference between supervised and unsupervised ML; took a deep dive into classification and linear regression models; gained hands-on experience both with coding ML models in Python using statsmodels and scikit-learn and with the automated machine learning platforms BigSquid and Tangent Works; answered questions about prepayment, forbearance, and refinancing; and learned how to communicate the results of ML experiments.
Students were introduced to different types of inference; how data science models relate to rules, causes, and effects; how the data science process can begin with either a focused question or with data; concepts relating to balancing use cases, models, and data; and how data wrangling and feature engineering fit into the data science pipeline, illustrated with a live demo of AWS SageMaker using NY Fed mortgage debt and delinquency data.
Participants gained hands-on experience with current year-to-date secondary market data, as well as more than 2 billion rows of historical loan performance and mortgage transaction data (both single-family and multi-family), through live Snowflake data shares and interactive visualizations in HMDAVision. Students also had access to a class wiki with rich housing and mortgage industry backgrounders and strategy briefs.
Understanding of Data Science
Understanding of Machine Learning
Understanding of Housing Industry Data
Course Overview