DS-UA 112 Introduction to Data Science

Term: Spring 2021
Instructor: Dr. Pascal Wallisch
Level: Undergraduate

Topics

Theory and history of data, science and data science; Probability theory: Basic probability, Probability distributions; Linear algebra: Vectors, Norms, Matrix factorization, SVD;

Characterizing datasets: Central tendency, Dispersion, Correlation;

Prediction: Simple linear regression, Control, Multiple regression, Regularized regression;

Samples, Populations and Inferences, Null hypothesis testing and statistical significance; Inferential statistics: t-test and ANOVA, Non-parametric tests;

Resampling methods, Beyond p-values: Effect sizes, power and confidence; Bayes' theorem: Bayesian statistics, Bayesian applications;

Multivariate statistics: Dimension reduction / PCA; Clustering methods, Logistic regression, Classification methods;

Causal inference, Data science: Truth and consequences;

Description

This is a survey course designed to bring about several specific outcomes. First, it is meant to introduce you to foundational concepts in the field of data science. Second, we aim to impart the 21st-century version of a liberal arts education. The 3 classical Rs of Reading, wRiting and aRithmetic are now joined by 2 new ones: data liteRacy and pRogramming. In this class, we assume that you are already somewhat familiar with the first 3 Rs and focus on the latter 2. Third, we aim to plant a variety of seeds about topics that you will encounter again in more advanced classes. Fourth, we hope to kindle a passion for data and data analysis that will last a lifetime. Finally, we also intend to impart several general-purpose skills, namely applied data analytics and coding in Python, as well as the QDAFI method, which will allow you to read a scientific paper efficiently and gainfully. Overall, the class is dedicated to the philosophy of computational empowerment. We live in transformational times. We believe that this mindset and these concepts are essential to a flourishing existence in the 21st century and beyond.