MS in Applied Statistics

Course Descriptions

Links to syllabi for courses offered 2016-17 are given here.  Links to syllabi for recent years are given here.

APSTA-GE.2003: Intermediate Quantitative Methods: The General Linear Model

This course follows a first-year statistics class (a prerequisite for matriculation in the program) by examining more deeply multiple regression/correlation as a general and flexible system for analyzing data in the behavioral, social, and health sciences. In addition to covering more advanced topics related to traditional multiple regression/correlation, the course examines ANOVA, ANCOVA, and Path Analysis as special cases of this general linear model. A statistical software package is used to give students hands-on experience with topics covered.

STAT-GB 2301: Regression and Multivariate Data Analysis (through Stern)
This is a data-driven, applied statistics course focusing on the analysis of data using regression models. It emphasizes applications to the analysis of business and other data and makes extensive use of computer statistical packages. Topics include simple and multiple linear regression, residual analysis and other regression diagnostics, multicollinearity and model selection, autoregression, heteroscedasticity, regression models using categorical predictors, and logistic regression. All topics are illustrated on real data sets obtained from financial markets, market research studies, and other scientific inquiries. The goal of the class is that students begin to develop the skills to be able to collect, organize, analyze, and interpret regression data.

APSTA-GE.2004: Advanced Modeling I: Topics in Multivariate Analysis
This course extends the material covered in Intermediate Statistics by examining some of the more advanced topics in multivariate data analysis for the behavioral, social, and health sciences that are prerequisite to continuing the study of quantitative methods at NYU. The topics to be covered are logistic regression, multivariate analysis of variance, repeated measures analysis of variance, and an introduction to hierarchical linear modeling. A software package is used to give students hands-on experience with topics covered. In so doing, the course provides foundational skills and knowledge critical to those graduate students whose research relies on the analysis of quantitative data.

APSTA-GE.2011: Advanced Topics in Quantitative Methods: Classification and Clustering
Classification and clustering are important statistical techniques commonly applied in many social and behavioral science research problems. Both seek to understand social phenomena through the identification of naturally occurring homogeneous groupings within a population. Classification techniques are used to sort new observations into pre-existing or known groupings, while clustering techniques sort the population under study into groupings based on their observed characteristics. Both help to reveal hidden structure that may be used in further analyses. This course will compare and contrast these techniques, including many of their variations, with an emphasis on applications.

APSTA-GE.2012: Advanced Topics in Quantitative Methods: Causal Inference: Statistical Methods for Program Evaluation and Policy Research
The course provides students with a basic knowledge of how both to perform and to critique analyses that use some of the more advanced statistical methods useful in answering policy questions. While randomized experiments will be discussed, the primary focus will be the challenge of answering causal questions using data that do not meet such standards. Several approaches for observational data, including propensity score methods, instrumental variables, difference in differences, fixed effects models, and regression discontinuity designs, will be discussed.

APSTA-GE.2013: Advanced Topics in Quantitative Methods: Missing Data
This course provides students with a basic knowledge of the potential implications of missing data on their data analyses as well as potential solutions. We will begin by discussing different types of mechanisms that can generate missing data. This will lay the groundwork for discussions of what types of missing data scenarios can be accommodated by each missing data method discussed subsequently. Simple missing data fixes will be described next as well as the problems they can create in terms of bias and loss of efficiency. We next explore some slightly more complicated fixes and the assumptions required for valid inference for each. The course will end with a focus on multiple imputation including discussions of the general framework, different models and algorithms and the basic theory.

APSTA-GE.2015: Advanced Topics in Quantitative Methods: Applied Spatial Statistics
Spatial data arise when information is collected on units that reside in different locations. Common examples include geology, criminology and epidemiology, where the goal may be to identify patterning or clusters (‘hot spots’) in the outcomes across the terrain being examined. In the social sciences, a similar set of questions and techniques are required, for example in studies of homelessness, poverty, environmental justice, and education. However, spatial data present a novel set of exploratory and modeling challenges, given the unique way in which outcomes are related (correlated) with each other through proximity. This course is an overview of the methods needed to analyze data for which it is suspected that the spatial component plays an important role.

APSTA-GE.2016: Advanced Topics in Quantitative Methods: Factor Scoring and Practical Issues in Scaling
This course reviews and expands on the topics of measurement and reliability. The general topic is factor scoring, with a focus on how to implement, evaluate, and interpret different methods of scoring in the multidimensional factor model. Students will also learn how to conduct exploratory factor analysis using rigorous methods for assessing goodness of fit and dimensionality. The course is especially well suited to students who have collected test/questionnaire data and require a method for getting scores from those data, scores which can then be used in further analyses.

APSTA-GE.2017: Educational Data Science Practicum
This intensive laboratory course will focus on doing data analysis projects with real data selected by the students. The core skills are oriented around first framing good research questions, then having these guide interacting with data of all types and varying quality (e.g., web-scraped, or clickstream-based rather than large national surveys) via visualization, principled modeling and evaluation of models using statistical learning techniques such as regression, classification and clustering, and presentation of results, using “reproducible research” tools (e.g., knitr, Sweave) in the R programming language.

APSTA-GE.2040: Multilevel Models: Growth Curves
This is a course on models for multilevel growth curve data. These data arise in longitudinal designs, which are quite common to education and applied social, behavioral and policy science. Traditional methods, such as OLS regression, are not appropriate in this setting, as they fail to model the complex correlational structure that is induced by these designs. Proper inference requires that we include aspects of the design in the model itself. Moreover, these more sophisticated techniques allow the researcher to learn new and important characteristics of the social and behavioral processes under study. In this module, we will develop and fit a set of models for longitudinal data. The course assignments will use state-of-the-art statistical software to explore, fit and interpret the models.

APSTA-GE.2041: Practicum in Multi-Level Models
This is a practicum course on models for multilevel growth curve data. This course is a natural sequel to Multilevel Models: Growth Curves. Building on the theory and examples developed in that course, students will participate in a guided, larger research project that employs multilevel growth curve models. Students will meet in groups with the instructor in a lab setting to fit, evaluate and describe these models. The final project for the course will consist of a journal-article-quality write-up of a “results and discussion” section.

APSTA-GE.2042: Multilevel Models: Nested Data Models
This is a course on models for multilevel nested data. These data arise in nested designs, which are quite common to education and applied social, behavioral and policy science. Traditional methods, such as OLS regression, are not appropriate in this setting, as they fail to model the complex correlational structure that is induced by these designs. Proper inference requires that we include aspects of the design in the model itself. Moreover, these more sophisticated techniques allow the researcher to learn new and important characteristics of the social and behavioral processes under study. The course assignments will use state-of-the-art statistical software to explore, fit and interpret the models.

APSTA-GE.2044: Generalized Linear and Multilevel Growth Curve Models
This is a second-year course in advanced statistical techniques that covers useful quantitative tools in health, education, social science and policy research. Assuming a strong foundation in regression and the general linear model, this course focuses on data analysis that utilizes models for categorical, discrete or limited outcomes, as well as introducing growth curve modeling of these same outcome types. Examples are drawn from broad areas of applied research settings. In this course students will also learn the principles of likelihood-based inference, which will assist them in some of the more advanced statistics courses.

APSTA-GE.2094: Factor Analysis and Structural Equation Modeling
This course provides students with the software skills and theoretical knowledge required to apply structural equation modeling (SEM). First, we review multiple regression and basic concepts from matrix algebra. Next, path analysis and factor analysis are developed, leading to more advanced topics, including how to translate theory into models, strategies for dealing with poorly fitting specifications, categorical data, and issues in multigroup analysis. The latter component focuses on data applications of SEM, including a final project. Example material will be drawn from education and psychology.

APSTA-GE.2110: Applied Statistics: Using Large Databases in Education Research
This course is designed to serve as a bridge between more theoretical coursework in applied statistics (introductory courses in statistics and linear models) and practical work with real, large-scale data. Although the focus is mainly on datasets relevant to education and educational policy research, the skills taught in the course are broadly transferable across subject areas in the social, behavioral, and health sciences. At the conclusion of this course students will be prepared to produce descriptive statistics about a population using data collected under a complex survey design and to estimate simple cross-sectional and longitudinal regression models of the sort frequently employed in real applied data analysis.

APSTA-GE.2134: Experimental and Quasi-Experimental Design
This course covers issues related to the design of research studies. Topics include observational, quasi-experimental, and experimental studies. The course focuses on practical aspects of design, linking these to appropriate analyses.

APSTA-GE.2139: Survey Research I
This course provides a broad overview of the many aspects of survey research methodology including sampling, instrument design, the psychology of survey response, field testing, survey operations, nonresponse bias analysis and correction, and primary and secondary analysis of survey data. The course is designed primarily for those who intend to use surveys in their own research – whether designing original surveys or performing secondary analysis on survey data collected by others. Whenever possible, we will use examples and data from real surveys employed by academic researchers, professional survey firms, and Federal statistical agencies.

APSTA-GE.2351: Practicum in Applied Probability
This is a course in the foundations of statistical inference techniques. Assuming some prior exposure to foundational and intermediate statistical methods, this course will first cover topics such as Kolmogorov’s axioms of probability, basics of set theory, discrete combinatorial probability, Bayes’ theorem, probability distributions and their properties, and assumptions of dependence and independence. These topics are followed by the foundational topics of statistics: sampling distributions, the law of large numbers and the central limit theorem. This course will mix theoretical approaches with simulation-based illustrations of these main topics. The student will be expected to understand the mathematical theory and apply the topics covered to problem solving via analytical and simulation-based methods in a statistical programming language such as R.

APSTA-GE.2352: Practicum in Statistical Computing
This course will introduce the student to modern statistical programming and simulation using the language R. The core skills are oriented around first understanding variables, data structures, program flow (e.g., conditional execution, looping) and functional programming, then applying these skills to answer interesting statistical questions involving the comparison of groups, which is core to statistical practice. Most statistical analysis will be motivated via simulations, rather than mathematical theory. The course content (programming and data analysis) requires significant outside reading and programming.

SOC-GA 2306: Event History Analysis (through FAS/Sociology)
This course surveys methods for analyzing event history data, with a focus on continuous-time models and estimation techniques. Topics include the exploratory analysis of event history data, nonparametric methods, right censoring, maximum likelihood estimation, alternative specifications for a time dependent baseline hazard rate, observed and unobserved heterogeneity, time-varying covariates, proportional and nonproportional models, multiple transition and competing risk models, left truncation and left censoring, and analogs of recursive and nonrecursive models. Major emphasis is placed on the logic, practical use, and estimation of models.

APSTA-GE.2401: Statistical Consulting Research Seminar
This course is designed to assist graduate students in the quantitative methods specific to the design and analysis of their theses. In this seminar format, under the guidance of one or more statistical faculty members, students will have the opportunity to present and defend their scholarly work-in-progress. They will also be required to critique and provide constructive suggestions for their fellow students. The focus of critiques will be on the research methodology and other statistical issues. Students will additionally benefit from being able to observe how the participating faculty diagnose and solve statistical issues that arise in others’ presented work and to benefit from this advice in their own work. In essence this course provides training in statistical consulting along with detailed feedback on one’s dissertation research.

Social Research Foundation

RESCH-GE.2132: Principles of Empirical Research
This course introduces advanced study of social science research methods. The course examines: 1) the relationship between theory and research; 2) methodological issues, such as objectivity, the logic of argument, reliability and validity; and 3) exemplars of various methodological techniques, including survey, interview, experimental, archival and ethnographic research. The course is designed to train social scientists both to recognize and to conduct rigorous, theoretically informed research.

Additional Courses

APSTA-GE.2310: Internship Course
In the internship, students will gain experience with “real world” data while working with an approved faculty member, local firm, or organization. Students will receive practical training focused on the kinds of issues that researchers face in collecting and analyzing data. This course will not only enhance the tools and techniques students develop, but may also lead to employment opportunities after graduation.