16:954:534 Statistical Learning for Data Science (3)
Advanced statistical learning methods essential for applications in data science. Course covers optimization, supervised and unsupervised learning, trees and random forests, deep learning, graphical models, and other topics.
16:954:567 Statistical Models and Computing (3)
Advanced statistical models and computing methods essential for applications in data science. Course covers inference for the multivariate normal distribution and multivariate regression, nonparametric regression, the bootstrap and EM, Bayesian analysis, and MCMC methods.
16:954:577 Advanced Analytics using Statistical Software (3)
Modeling and analysis of data, usually very large datasets, for decision making. Review and comparison of software packages used for Analytics Modeling. Multiple and logistic regression, multi-stage models, decision trees, network models and clustering algorithms. Investigate data sets, identify and fit appropriate data analytics models, interpret statistical models in context, distinguish between data analytics problems involving forecasting and classification, and assess analytics models for usefulness, predictive value, and financial gain.
16:954:581 Probability and Statistical Theory for Data Science (3)
The study of probabilistic and inferential tools important for applications in data science. Topics covered: Probability distributions; decision theory, Bayesian inference, classification, prediction; law of large numbers, central limit theorem; point and interval estimation; multiple testing, false-discovery rates.
16:954:596 Regression and Time Series Analysis for Data Science (3)
This course introduces regression methods, state space modeling, linear time series models, and volatility models, which are important tools for data analysis and foundations for developing more specialized methods.
16:954:597 Data Wrangling and Husbandry (3)
This course provides an introduction to the principles and tools used to retrieve, “tidy,” clean, and visualize data in preparation for statistical analysis. Principles of reproducibility and reusability are emphasized. It teaches techniques to wrangle and explore data. The emphasis is on preparing data to ease the analysis rather than on sophisticated analyses. Topics include methods to convert data from diverse sources into a form suitable for data visualization and analysis; methods to scrape data from websites; data visualization; elementary database operations such as SQL’s join; construction of web-based analysis apps; and principles of reproducibility and reusability, including literate programming, unit tests, and source code management.
16:958:588 Financial Data Mining (3)
Databases and data warehousing, exploratory data analysis and visualization, an overview of data mining algorithms, modeling for data mining, descriptive modeling, predictive modeling, pattern and rule discovery, text mining, Bayesian data mining, observational studies. Emphasis on the use of data mining techniques in finance and risk management. Prerequisites: 16:958:563, and 16:198:443 or equivalent C++ course or permission of instructor.
16:958:589 Advanced Programming for Financial Statistics and Risk Management (3)
This course covers the basic concepts of object-oriented programming and the syntax of the Python language. The course objectives include learning how to move from the stages of designing a program (algorithm) to its actual implementation. This class lays the foundation for applying Python to interactive financial analytics and financial application building.
16:198:512 Introduction to Data Structures and Algorithms (3)
An introduction to data structures and algorithms for students in other degree programs.
16:198:521 Linear Programming (3)
Linear inequalities, extreme points and rays, fundamental theorems. Optimality and duality. Geometric view. Primal and dual simplex methods. Degeneracy. Primal-dual method. Sensitivity. Basis factorization, implementation issues. Column generation. Structured models. Network simplex method and unimodularity. Polynomial-time algorithms for linear programming.
Grigoriadis, Kalantari. Prerequisites: Linear algebra and admission requirements.
16:198:539 Theory of Computation (3)
Mathematical theory of computing machines. Computable functions, recursive and recursively enumerable sets, recursion and fixed-point theorems, abstract complexity and complexity-theoretic analogues of aspects of recursive-function theory, algorithmic (Kolmogoroff) complexity theory.
Allender. Prerequisite: 16:198:509 or equivalent.
16:198:541 Database Systems (3)
Relational data model. Relational query languages and their expressiveness. Dependency theory and relational normalization. Physical database design. Deductive databases and object-oriented databases. Optimization of relational queries.
Borgida. Prerequisites: 01:198:336 or equivalent; 16:198:513. Recommended: 16:198:509 or equivalent.
16:960:688 Bayesian Analysis (3)
16:332:509 (S) Convex Optimization for Engineering Applications (3)
Theory, algorithms, and tools to formulate and solve convex optimization problems that seek to minimize a cost function subject to constraints; engineering applications.
16:332:562 (S) Visualization and Advanced Computer Graphics (3)
Advanced visualization techniques, including volume representation, volume rendering, ray tracing, composition, surface representation, and advanced data structures. User interface design, parallel and object-oriented graphic techniques, and advanced modeling techniques.