15.077J -- Statistical Learning and Data Mining
Course Description: Introduction to the theory and application of statistics and data mining, concentrating on techniques used in management science, finance, engineering systems, and bioinformatics. The first half builds the statistical foundation for the second half, which concentrates on data mining, supervised learning, and multivariate analysis. First half topics selected from sampling; theory of estimation; testing; nonparametric statistics; analysis of variance; categorical data analysis; regression analysis; MCMC; EM; Gibbs sampling; hidden Markov models; and Bayesian methods. Second half topics selected from logistic regression; principal components and dimension reduction; discrimination and classification analysis and trees; partial least squares; nearest neighbor and regularized methods; support vector machines; boosting and bagging; clustering; independent component analysis; and nonparametric regression. R, S+, Matlab, SAS, or a similar statistics package is used for data analysis and data mining.
This class is at the graduate level.
An example of a syllabus: 15077_Spring_2008.pdf
This course is also known as: ESD.753J
Instructor: R. E. Welsch
Prerequisites: 6.431, 15.085J, or 18.440; 18.06 or 18.700