CSCI-UA 479 Data Management and Analysis

Term: Spring 2022
Instructor: Dr. Matthew Zeidenberg
Level: Undergraduate

Topics

Python; data formats (JSON, CSV, TSV, etc.); retrieving data with Python (web, file system, etc.); working with Python data management packages (pandas, NumPy);

data analysis: descriptive statistics, linear regression, decision trees and random forests; data visualization and plotting with Matplotlib;

relational databases using SQL: SQLite, relational database design, ER diagrams, normalization, SQL syntax, basic create/read/update/delete, JOINs, aggregate queries, window functions;

NoSQL: MongoDB (a JSON document database), basic create/read/update/delete, querying.

Description

Extracting, transforming, and analyzing data in myriad formats. Using traditional relational databases as well as non-relational databases to store, manipulate, and query data. Students write custom programs, create queries, and use data analysis tools and libraries on a wide array of data sets. Additional topics: data modeling, cloud databases, and API programming.