MS in Applied Statistics for Social Science Research

Course Descriptions

View syllabi for current course offerings and syllabi for recent years.

APSTA-GE.2002: Statistics for the Behavioral and Social Sciences II
This course introduces students to an array of inferential techniques (t-tests, one- and two-way ANOVA, simple and multiple regression, nonparametric methods) using the latest version of SPSS as a platform to gain hands-on experience with real data. The course is not appropriate for students seeking to learn the mathematical underpinnings of these techniques.

APSTA-GE.2003: Intermediate Quantitative Methods: The General Linear Model
This course follows a first-year statistics class (a prerequisite for matriculation in the program) by examining more deeply multiple regression/correlation as a general and flexible system for analyzing data in the behavioral, social, and health sciences. In addition to covering more advanced topics related to traditional multiple regression/correlation, the course examines ANOVA, ANCOVA, and path analysis as special cases of this general linear model. A statistical software package is used to give students hands-on experience with topics covered.

APSTA-GE.2004: Advanced Modeling I: Topics in Multivariate Analysis
This course extends the material covered in Intermediate Statistics by examining some of the more advanced topics in multivariate data analysis for the behavioral, social, and health sciences that are prerequisite to continuing the study of quantitative methods at NYU. The topics to be covered are logistic regression, multivariate analysis of variance, repeated measures analysis of variance, and an introduction to hierarchical linear modeling. A software package is used to give students hands-on experience with topics covered. In so doing, the course provides foundational skills and knowledge critical to those graduate students whose research relies on the analysis of quantitative data.

APSTA-GE.2011: Supervised and Unsupervised Machine Learning
Classification and clustering are important statistical techniques commonly applied in many social and behavioral science research problems. Both seek to understand social phenomena through the identification of naturally occurring homogeneous groupings within a population. Classification techniques are used to sort new observations into pre-existing or known groupings, while clustering techniques sort the population under study into groupings based on their observed characteristics. Both help to reveal hidden structure that may be used in further analyses. This course will compare and contrast these techniques, including many of their variations, with an emphasis on applications.

APSTA-GE.2012: Causal Inference: Statistical Methods for Program Evaluation and Policy Research
The course provides students with a basic knowledge of how to both perform and critique analyses that use some of the more advanced statistical methods useful in answering policy questions. While randomized experiments will be discussed, the primary focus will be the challenge of answering causal questions using data that do not meet such standards. Several approaches for observational data will be discussed, including propensity score methods, instrumental variables, difference-in-differences, fixed effects models, and regression discontinuity designs.

APSTA-GE.2013: Missing Data
This course provides students with a basic knowledge of the potential implications of missing data on their data analyses as well as potential solutions. We will begin by discussing different types of mechanisms that can generate missing data. This will lay the groundwork for discussions of what types of missing data scenarios can be accommodated by each missing data method discussed subsequently. Simple missing data fixes will be described next as well as the problems they can create in terms of bias and loss of efficiency. We next explore some slightly more complicated fixes and the assumptions required for valid inference for each. The course will end with a focus on multiple imputation including discussions of the general framework, different models and algorithms and the basic theory.

APSTA-GE.2014: Statistical Analysis of Networks
This course is an introduction to the analysis and modeling of network data. Network analysis is a key tool in understanding relational data: data describing the relationships between pairs and groups of individuals, as well as the global structure of relationships. We will focus on applying and building tools for research in the social sciences, but the methodology can be extended to other areas. By the end of the course, you should have a working knowledge of basic network analysis tools and be able to use them to analyze your own data.

APSTA-GE.2015: Applied Spatial Statistics
Spatial data arise when information is collected on units that reside in different locations. Common examples include geology, criminology and epidemiology, where the goal may be to identify patterning or clusters (‘hot spots’) in the outcomes across the terrain being examined. In the social sciences, a similar set of questions and techniques are required, for example in studies of homelessness, poverty, environmental justice, and education. However, spatial data present a novel set of exploratory and modeling challenges, given the unique way in which outcomes are related (correlated) with each other through proximity. This course is an overview of the methods needed to analyze data for which it is suspected that the spatial component plays an important role.

APSTA-GE.2017: Educational Data Science Practicum
This intensive laboratory course will focus on doing data analysis projects with real data selected by the students. The core skills are oriented around first framing good research questions, then having these guide interacting with data of all types and varying quality (e.g., web-scraped, or clickstream-based rather than large national surveys) via visualization, principled modeling and evaluation of models using statistical learning techniques such as regression, classification and clustering, and presentation of results, using “reproducible research” tools (e.g., knitr, Sweave) in the R programming language.

APSTA-GE.2040: Multilevel Models: Growth Curves
This is a course on models for multilevel growth curve data. These data arise in longitudinal designs, which are quite common to education and applied social, behavioral and policy science. Traditional methods, such as OLS regression, are not appropriate in this setting, as they fail to model the complex correlational structure that is induced by these designs. Proper inference requires that we include aspects of the design in the model itself. Moreover, these more sophisticated techniques allow the researcher to learn new and important characteristics of the social and behavioral processes under study. In this module, we will develop and fit a set of models for longitudinal data. The course assignments will use state-of-the-art statistical software to explore, fit and interpret the models.

APSTA-GE.2041: Practicum in Multi-Level Models
This is a practicum course on models for multilevel growth curve data. This course is a natural sequel to Multilevel Models: Growth Curves. Building on the theory and examples developed in that course, students will participate in a guided, larger research project that employs multi-level growth curve models. Students will meet in groups with the instructor in a lab setting to fit, evaluate and describe these models. The final project for the course will consist of a journal-article-quality write-up of a “results and discussion” section.

APSTA-GE.2042: Multilevel Models: Nested Data Models
This is a course on models for multilevel nested data. These data arise in nested designs, which are quite common to education and applied social, behavioral and policy science. Traditional methods, such as OLS regression, are not appropriate in this setting, as they fail to model the complex correlational structure that is induced by these designs. Proper inference requires that we include aspects of the design in the model itself. Moreover, these more sophisticated techniques allow the researcher to learn new and important characteristics of the social and behavioral processes under study. The course assignments will use state-of-the-art statistical software to explore, fit and interpret the models.

APSTA-GE.2044: Generalized Linear and Multilevel Growth Curve Models
This is a second-year course in advanced statistical techniques that covers useful quantitative tools in health, education, social science and policy research. Assuming a strong foundation in regression and the general linear model, this course focuses on data analysis that utilizes models for categorical, discrete or limited outcomes, as well as introducing growth curve modeling of these same outcome types. Examples are drawn from broad areas of applied research settings. In this course students will also learn the principles of likelihood-based inference, which will assist them in some of the more advanced statistics courses.

APSTA-GE.2085: Basic Statistics I
This introductory two-semester course is designed to prepare undergraduate- and master's-level students to use statistics for data analysis. The course makes use of SPSS for Windows, a statistical computer software package for the social sciences. The first semester serves as a foundation for the second, covering methods for displaying and describing data. Topics include frequency distributions and their graphical representations, percentiles, measures of central tendency and dispersion, correlation, and simple regression.

APSTA-GE.2086: Basic Statistics II
The second semester builds on the foundation of the first and covers particular methods of statistical inference that rely on the normal, t, F, and chi-square distributions to test hypotheses about means, variances, correlations, and proportions.

APSTA-GE.2093: Psychometric Theory
This course reviews and expands on the topics of measurement and reliability for psychological and educational test data. It begins with classical test theory, moves on to unidimensional and multidimensional factor models for continuous data, and then covers item response theory for dichotomous and ordered categorical data. The course is well-suited for students who have collected test/questionnaire data and want to analyze measurement properties of the test (e.g., reliability, dimensionality) and obtain summary scores for each respondent that can be used for reporting or as variables in further analyses.

APSTA-GE.2094: Factor Analysis and Structural Equation Modeling
This course provides students with the software skills and theoretical knowledge required to apply structural equation modeling (SEM). First, we review multiple regression and basic concepts from matrix algebra. Next, path analysis and factor analysis are developed, leading to more advanced topics, including how to translate theory into models, strategies for dealing with poorly fitting specifications, categorical data, and issues in multigroup analysis. The latter component focuses on data applications of SEM, including a final project. Example material will be drawn from education and psychology.

APSTA-GE.2110: Applied Statistics: Using Large Databases in Education Research
This course is designed to serve as a bridge between more theoretical coursework in applied statistics (introductory course in statistics and linear models) and practical work with real, large-scale data. Although the focus is mainly on datasets relevant to education and educational policy research, the skills taught in the course are broadly transferrable across subject areas in social, behavioral, and health sciences. At the conclusion of this course students will be prepared to produce descriptive statistics about a population using data collected under complex survey design and to estimate simple cross-sectional and longitudinal regression models of the sort frequently employed in real applied data analysis.

APSTA-GE.2122: Applied Statistical Modeling and Inference
This is a course in the intermediate and advanced foundations of statistical inference in the context of applied research. Assuming some prior exposure to probability and statistics, this course will first cover topics such as the principles of estimation and hypothesis testing, and the general and generalized linear models, including scientific computation. This course thoroughly explores the frequentist approach to inference. The student will be expected to understand the mathematical theory, implement related statistical algorithms in a statistical programming language such as R, and interpret models and parameters in the context of applied statistical analysis of real data.

APSTA-GE.2134: Experimental and Quasi-Experimental Design
This course covers issues related to the design of research studies. Topics include observational, quasi-experimental, and experimental studies. The course focuses on practical aspects of design, linking these to appropriate analyses.

APSTA-GE.2139: Survey Research I
This course provides a broad overview of the many aspects of survey research methodology including sampling, instrument design, the psychology of survey response, field testing, survey operations, nonresponse bias analysis and correction, and primary and secondary analysis of survey data. The course is designed primarily for those who intend to use surveys in their own research – whether designing original surveys or performing secondary analysis on survey data collected by others. Whenever possible, we will use examples and data from real surveys employed by academic researchers, professional survey firms, and Federal statistical agencies.

APSTA-GE.2331: Data Science for Social Impact
This course focuses on the competencies required, and the issues that arise, when analysts use data and quantitative evidence to influence policy and practice. Students will learn how to gather and analyze data to address questions about program efficacy and efficient targeting of resources. Topics will include how to choose organizational partners, implement change, build trust with organizations and civic agencies, satisfy the needs of stakeholders and manage legal, ethical, and logistical constraints. Students will discuss real case studies and appropriate ways to address them.

APSTA-GE.2351: Practicum in Applied Probability
This is a course in the foundations of statistical inference techniques. Assuming some prior exposure to foundational and intermediate statistical methods, this course will first cover topics such as Kolmogorov’s axioms of probability, basics of set theory, discrete combinatorial probability, Bayes’ theorem, probability distributions and their properties, and assumptions of dependence and independence. These topics are followed by the foundational topics of statistics: sampling distributions, the law of large numbers and the central limit theorem. This course will mix theoretical approaches with simulation-based illustrations of these main topics. The student will be expected to understand the mathematical theory and apply the topics covered to problem solving via analytical and simulation-based methods in a statistical programming language such as R.

APSTA-GE.2352: Practicum in Statistical Computing
This course will introduce the student to modern statistical programming and simulation using the language R. The core skills are oriented around first understanding variables, data structures, program flow (e.g., conditional execution, looping) and functional programming, then applying these skills to answer interesting statistical questions involving the comparison of groups, which is core to statistical practice. Most statistical analysis will be motivated via simulations, rather than mathematical theory. The course content (programming and data analysis) requires significant outside reading and programming.

APSTA-GE.2310: Internship Course
In the internship, students will gain experience working with “real world” data, working with an approved faculty member, local firm or organization. Students will receive practical training focused on the kinds of issues that researchers face in collecting and analyzing data. This course will not only enhance the tools and techniques students develop, but will also possibly lead to employment opportunities after graduation.

APSTA-GE.2401: Statistical Consulting Research Seminar
This course is designed to assist graduate students in the quantitative methods specific to the design and analysis of their theses. In this seminar format, under the guidance of one or more statistical faculty members, students will have the opportunity to present and defend their scholarly work-in-progress. They will also be required to critique and provide constructive suggestions for their fellow students. The focus of critiques will be on the research methodology and other statistical issues. Students will additionally benefit from being able to observe how the participating faculty diagnose and solve statistical issues that arise in others’ presented work and to benefit from this advice in their own work. In essence this course provides training in statistical consulting along with detailed feedback on one’s dissertation research.

RESCH-GE.2132: Principles of Empirical Research
This course introduces advanced study of social science research methods. The course examines: 1) the relationship between theory and research; 2) methodological issues, such as objectivity, the logic of argument, reliability and validity; and 3) exemplars of various methodological techniques, including survey, interview, experimental, archival and ethnographic research. The course is designed to train social scientists to recognize and conduct rigorous, theoretically informed research.

SOC-GA 2306: Event History Analysis (through FAS/Sociology)
This course surveys methods for analyzing event history data, with a focus on continuous-time models and estimation techniques. Topics include the exploratory analysis of event history data, nonparametric methods, right censoring, maximum likelihood estimation, alternative specifications for a time dependent baseline hazard rate, observed and unobserved heterogeneity, time-varying covariates, proportional and nonproportional models, multiple transition and competing risk models, left truncation and left censoring, and analogs of recursive and nonrecursive models. Major emphasis is placed on the logic, practical use, and estimation of models.

STAT-GB 2301: Regression and Multivariate Data Analysis (through Stern)
This is a data-driven, applied statistics course focusing on the analysis of data using regression models. It emphasizes applications to the analysis of business and other data and makes extensive use of computer statistical packages. Topics include simple and multiple linear regression, residual analysis and other regression diagnostics, multicollinearity and model selection, autoregression, heteroscedasticity, regression models using categorical predictors, and logistic regression. All topics are illustrated on real data sets obtained from financial markets, market research studies, and other scientific inquiries. The goal of the class is that students begin to develop the skills to be able to collect, organize, analyze, and interpret regression data.