Field Initiatives and Partnership Schools

Teachers for a New Era Final Report

The Effects of Multiple-Perspective Assessment Upon Student Teachers and Their Pupils

TNE Mini-Grant
Contract 3571-00/EOC 6820

Final Report - July 31, 2007


This TNE mini-grant was used to address the demand for accountability of traditional teacher-education programs (Darling-Hammond and Bransford, 2005) and the related pressure for value-added evidence of their efficacy, in terms of both a direct benefit to teacher development and an indirect impact on pupil achievement. One of the main principles of the Teachers for a New Era (TNE) Learning Network calls for “grounding all elements of the teacher-education program on strong evidence, including measurement of pupil learning gains.” This inquiry builds upon ongoing research at NYU’s Steinhardt School of Culture, Education, and Human Development on ways to enhance the efficacy of student teaching, and on methods to assess its impact on teacher-education students and their pupils.

In 2003, the Steinhardt School’s Center for Research on Teaching and Learning (CRTL) began a research and development effort focused on the assessment of teacher quality, and the linkages between measures of teacher performance and pupil learning. This research is rooted in self-inquiry that is designed to build an evidence base to inform continuous program improvement and to strengthen program accountability. The work supported by this mini-grant focused on expanding and linking two key components of this research: the assessment of the developing proficiency of teacher education students, and longitudinal follow-up of their success as beginning teachers.

Assessment of Teaching Proficiency
In Spring 2004, CRTL developed and piloted an early version of the Domain Referenced Student Teacher Observation Scale—Revised (DRSTOS-R), an observation protocol used to assess the developing proficiency of student teachers that was adapted from the work of Charlotte Danielson as presented in her book, Enhancing Professional Practice: A Framework for Teaching (Danielson, 1996). Danielson identified four domains of teachers’ professional practice: planning and preparation, classroom environment, instruction, and professional responsibilities. The DRSTOS-R is designed to assess the ability of teacher-education students to understand and integrate their classroom experiences and apply that learning to their own practice as student teachers. Between Fall 2004 and Spring 2006, CRTL revised and refined the original 29-item protocol into the current 20-item version, trained 16 student-teacher supervisors to use the scale, assessed its validity and inter-rater reliability, and collected DRSTOS-R supervisor ratings for 258 Steinhardt student teachers. DRSTOS-R data were featured prominently in Steinhardt’s Teacher Education Accreditation Council Inquiry Brief, which resulted in the full accreditation of all of its teacher education programs in January 2007.

Longitudinal Follow-Up of Steinhardt Graduates
As an additional strand of inquiry into the efficacy of Steinhardt’s teacher education programs, CRTL began a multi-phase, longitudinal study of the graduates of these programs in Spring 2005. In the first phase, CRTL sent an electronic file of the social security numbers of 2,151 graduates from the classes of 2001 through Fall 2005 to the State Education Departments of New York, Connecticut, and Florida to be matched to the teacher databases for these states. A series of matches identified almost 1,200 graduates who were teaching in the public schools of the three states during the 2003-04 and/or 2004-05 school years. Using data from these matches, CRTL has been studying the characteristics of the graduates who entered the teaching profession, the demographic and performance characteristics of the schools in which they taught, and patterns of retention and attrition. In the second phase of the research, which was undertaken in 2006-07, the follow-up file was updated for graduates from the classes of 2005 and 2006 through a match to the New York City Department of Education human resources data system, and surveys were sent to a sample of 391 of these graduates who were teaching in New York City public elementary and middle schools. The survey used an adaptation of the DRSTOS-R that was designed to obtain the self-perceptions of practicing teachers about their teaching practice. The third phase involves obtaining the standardized test scores in English language arts and mathematics for the pupils of these graduates, and using value-added modeling (VAM) methods to assess the impact of the graduates on the growth in academic achievement of their pupils. 
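
To make the idea behind the third phase concrete, the sketch below computes a simple gain-score comparison, the most basic relative of value-added modeling. It is an illustration only: the report does not specify the VAM model that will be used, and the function name, teacher labels, and scores are all hypothetical. Operational value-added models usually adjust for prior achievement and for pupil and school characteristics through regression, which this sketch omits.

```python
from collections import defaultdict
from statistics import mean


def teacher_gain_effects(records):
    """Return each teacher's mean pupil gain minus the overall mean gain.

    records: iterable of (teacher_id, prior_year_score, current_year_score).
    A positive value means the teacher's pupils gained more than average.
    """
    gains = defaultdict(list)
    for teacher, prior, current in records:
        gains[teacher].append(current - prior)
    # Overall mean is taken over pupils, not over teachers.
    overall = mean(g for teacher_gains in gains.values() for g in teacher_gains)
    return {teacher: mean(teacher_gains) - overall
            for teacher, teacher_gains in gains.items()}


# Hypothetical pupil scores for two consecutive years (not study data).
records = [
    ("teacher_a", 650, 670),
    ("teacher_a", 640, 655),
    ("teacher_b", 660, 665),
    ("teacher_b", 655, 665),
]
effects = teacher_gain_effects(records)
```

Two consecutive years of scores per pupil are the minimum this calculation needs, which is why the sample described later is limited to grades for which pupils have test scores in both years.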


This mini-grant was used to build upon CRTL’s ongoing research on teacher-education program accountability and teacher quality in several ways. First, mini-grant funds were used to expand the use of DRSTOS-R by Steinhardt student-teacher supervisors and, for the first time, by cooperating teachers who serve as mentors to the student teachers. In addition, two other forms were developed to obtain the self-perceptions of students and practicing teachers. In the process of developing these new forms of the protocol, and training faculty, school staff, and students to use them, CRTL studied how using a common set of standards to describe effective teaching affects the efficacy of the student teaching practicum. In this way, we were able to examine our thesis that a multi-perspective paradigm with a common language, common standards, and a shared idea of entry-level proficiency would lead to a more coherent and effective student teaching experience. Second, funds were used to support data collection for CRTL’s ongoing follow-up study, including the merger of pre-service DRSTOS-R data with data from the longitudinal study of Steinhardt graduates and their pupils’ standardized achievement test scores. In this way, CRTL is examining the empirical, predictive validity of the DRSTOS-R against a criterion of growth in pupil learning. Finally, and perhaps most important, the activities supported by this grant are part of an ongoing effort to create a culture of accountability among teacher-education faculty, staff, and students. This work is part of a larger effort to create an evidence base for setting goals and standards, assessing the success of our programs, and guiding program planning. Dissemination of the results of this research and conversations with faculty are aimed at fostering an attitude of self-inquiry.


In this section, we describe the activities that were financed through the mini-grant during the 2006-07 academic year. Consistent with the conceptual framework described above, these activities supported and were embedded in the major ongoing research and development projects in Steinhardt’s teacher-education programs.

DRSTOS-R Supervisor Training
As of July 2007, thirty student-teacher supervisors, each responsible for 3 to 12 student teachers and representing both content-specific secondary education and childhood education, have been trained and are using the protocol in the field. This academic year, we continued to collect supervisors’ summative scores of their student teachers’ growth in their fieldwork. During the past academic year, TNE funds allowed us to train eight supervisors at NYU, and two additional supervisors were trained at a partnership high school alongside three cooperating teachers. In July, we conducted another intensive training for five more participating supervisors. Accordingly, mini-grant funds enabled us to almost double the number of supervisors trained in DRSTOS-R and to expand training to cooperating teachers for the first time.

Conversations at these professional meetings centered on establishing inter-rater reliability using video-taped sample lessons. Discussion of the domains and ratings assigned by the different supervisors also facilitated further validation of the instrument itself. In the winter, the main topic was the contribution of the DRSTOS-R to establishing common definitions of important educational concepts, such as “assessment.” The summer training included an extensive conversation about whether a one-size-fits-all protocol is possible, or whether the protocol needs to be modified for content areas and/or grade levels.

DRSTOS-R Multi-Perspective Case Study
Our research with the DRSTOS-R has reinforced the importance of considering context in the assessment process. In order to examine the use of the DRSTOS-R within a specific context, NYU researchers developed a small case study. The goal was to learn about similarities and differences in the ways in which cooperating teachers and supervisors assessed student teachers, and the extent to which using a common framework would enhance the consistency of their respective assessments. Below is a summary of the study.

Methods and Procedures
Supervisors and cooperating teachers from a New York City high school that is engaged in a partnership initiative with NYU were invited to participate in the study.

  1. Three cooperating teachers and two supervisors volunteered and were trained to administer the DRSTOS-R together in a single session at the high school.
  2. During the training, inter-rater agreement was assessed by having each participant use the protocol to rate video-taped lessons purchased through ETS.
  3. End-of-semester DRSTOS-R scores were collected from all trained cooperating teachers and supervisors.
  4. Surveys from both DRSTOS-R-trained and untrained cooperating teachers from the high school were collected at the end of the semester. The surveys focused on the specific standards that the cooperating teachers used to assess their student teachers.
  5. Follow-up interviews were conducted with cooperating teachers to learn about their experiences using the DRSTOS-R, and their perceptions of its usefulness for assessing student teachers and providing a common framework for mentoring.
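
The inter-rater agreement check in step 2 can be illustrated with a short sketch. The report does not state which agreement statistic was used, so the example below assumes two simple ones: exact agreement (two raters give the same score) and adjacent agreement (scores differ by at most one point). The rater labels, the rating values, and the function name are hypothetical.

```python
from itertools import combinations


def agreement_rates(ratings_by_rater):
    """Return (exact, adjacent) agreement rates over all pairs of raters.

    ratings_by_rater maps a rater label to that rater's scores for the
    same sequence of items, for example the elements rated on one
    video-taped lesson.
    """
    exact = adjacent = total = 0
    for scores_a, scores_b in combinations(ratings_by_rater.values(), 2):
        for a, b in zip(scores_a, scores_b):
            total += 1
            exact += (a == b)
            adjacent += (abs(a - b) <= 1)
    return exact / total, adjacent / total


# Hypothetical ratings of four items by three raters (not study data).
ratings = {
    "supervisor_1": [3, 2, 4, 3],
    "coop_teacher_1": [3, 3, 4, 2],
    "coop_teacher_2": [3, 2, 3, 2],
}
exact, adjacent = agreement_rates(ratings)
```

With these made-up ratings, half of the pairwise comparisons agree exactly and all of them agree within one point. Chance-corrected statistics such as Cohen's kappa are a common alternative when exact agreement alone could be inflated by a narrow rating range.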

Summary of Findings
To begin, self-selected high school teachers (serving as cooperating, or host, teachers to student teachers) and secondary student-teacher supervisors were trained in the DRSTOS-R protocol. The participants fell into three groupings. One cooperating teacher and one supervisor formed a matched pair, meaning that the supervisor would be assessing the student teacher in that teacher’s classroom. There was one “non-match”: a cooperating teacher and a supervisor who would not have any interaction with a given student teacher. Finally, there was a “half-match,” in which a cooperating teacher was trained but her supervisor was not.

The intent of this training was much like that of the others: to assess the protocol’s validity and establish inter-rater reliability, in an effort to have a protocol that reliably depicts student teacher development in the four domains. Bringing together cooperating teachers and university supervisors, some of whom had student teachers in common to whom they would apply the DRSTOS-R protocol, prompted a dialogue about the variability in participants’ visions of entry-level proficient teaching. In general terms, the differences were often attributable to perspective or role: (cooperating) teacher or supervisor.

At the end of the semester, each person present at the training was asked to complete a summative DRSTOS-R for each student teacher for whom they were responsible. In addition, six cooperating teachers (the three who were trained and three untrained teachers who volunteered to participate) were given a survey that asked for an overall rating and a paragraph explaining that rating.

Review of the data, especially through an informal conversation facilitated by a member of the research team, revealed compelling trends. Cooperating teachers see themselves and their classrooms as models for student teachers, framing the student teachers more as guests than as participants. While they do think pre- and post-lesson conversations are important, the lesson itself is often not completely the student teacher’s own. If a teacher or school has curricular goals planned in advance, then a student teacher may be able to contribute individual lessons that fit within predetermined unit plans. In terms of controlling the amount of time the student teacher has as a lead-teacher rather than helper or observer, cooperating teachers rely on their own assessment of the student teachers—daily interaction, lesson plans—as means by which to decide how much input and opportunity to teach the student teachers should have.

Another major theme of the conversation concerned the breadth of the DRSTOS-R. A significant example was the set of elements related to Classroom Environment, which cooperating teachers found worthy of further conversation. The teachers consider classroom setup and the establishment of classroom routines a major component of their job, and they see laying the foundations for these, and for most kinds of classroom management, as an essential part of the beginning of each school year. Given that, a full assessment of these elements is not possible, because the student teacher does not get the full teaching experience.

A further perspective provided by the cooperating teachers is the involvement of their pupils in informally assessing the student teacher, and informing the classroom teachers’ decisions about how to use the student teacher. The cooperating teachers perceived the supervisor as an “as-needed” liaison to the university, rather than an integral part of the triad. Further, cooperating teachers often and informally elicit pupils’ responses to student teachers’ lessons, or gauge the success of the lesson based on the level of student work.

Linkages to Longitudinal Follow-Up Study
Mini-grant funds, in combination with funds from a Steinhardt Challenge Grant, supported a major expansion of Steinhardt’s ongoing longitudinal follow-up study of its teacher-education graduates. The purpose of the expansion is to study the success of Steinhardt graduates in obtaining teaching positions, remaining in these positions and in the field, and educating their pupils, as measured through value-added modeling (VAM) analyses of pupil growth in standardized test scores. In addition, this study will include an assessment of the predictive validity of the DRSTOS-R using the follow-up success measures as validity criteria. The work of this study accomplished during 2006-07 is described below.

  1. The researchers sent an Excel file of 3,036 teacher-education graduates from the classes of 2001 through 2006 to NYCDOE for matching to the human resources file.
  2. A total of 1,507 (49.6%) graduates matched the file, indicating that they had worked as a teacher in a NYC public school at some point in time.
  3. As of spring 2007, 1,005 (66.7% of the 1,507) were still teaching full-time in NYC, 24 (1.6%) were working in NYC in other positions, such as assistant principals, guidance counselors, and literacy coaches, and 60 (4.0%) were on leaves of absence. A total of 415 (27.5%) were no longer employed by the NYC public schools. (See Table 1.)
  4. Of the 1,005 graduates who were teaching, 631 (63.1%) served in elementary or middle schools, 342 (34.2%) in high schools, and 27 (2.7%) in special education schools. These percentages are based on the 1,000 graduates for whom school-level data were available. (See Table 2.)
  5. In order to select teachers for the value-added modeling (VAM) analysis, the researchers examined the subjects and grade levels taught by the 631 graduates who were teaching in elementary and middle schools. Because VAM will be applied to growth in English language arts and math test scores, only graduates who were teaching these subjects in grades 4-8, the grades for which pupils would have test scores for two consecutive years, were selected for the next phase of the study. A sample of 392 graduates whose teaching assignments fit the criteria was selected as the VAM sample; more than 70% of them were in a Common Branches elementary school assignment. (See Table 3.)
  6. The VAM sample graduates are highly diverse with respect to the schools in which they are teaching and their teaching experience. They teach in a total of 249 schools spread across all 32 of New York City’s geographical school districts. While most schools in the sample have only one graduate teaching, some have high concentrations, with 13 schools employing five or more. There is considerable variability in the sample’s total years of teaching experience in New York City (M = 3.3 years, SD = 2.4), ranging from a minimum of 1 month to a maximum of 15.4 years.
  7. A total of 64 graduates had DRSTOS-R protocol scores from their final semester of student teaching.
  8. The next step is to obtain the ELA and math achievement test scores for the pupils of sample graduates for the 2005-06 and 2006-07 school years. These data are currently being compiled by the NYC Department of Education and will be used in VAM analyses, one of which will assess the linkages between the DRSTOS-R and pupil growth. This work will be performed during Fall 2007 and Winter 2008.
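
The match and retention percentages reported in steps 1 through 3 can be rechecked with a few lines of arithmetic. The sketch below uses only the counts given in the list; the variable and function names are illustrative.

```python
def pct(part, whole):
    """Percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


graduates_sent = 3036   # graduates in the file sent to NYCDOE (step 1)
matched = 1507          # graduates found in the human resources file (step 2)

# Spring 2007 status counts as itemized in step 3.
status = {
    "teaching full-time": 1005,
    "other NYC position": 24,
    "on leave of absence": 60,
    "no longer employed": 415,
}

match_rate = pct(matched, graduates_sent)
status_pct = {label: pct(count, matched) for label, count in status.items()}

# Matched graduates not covered by the four itemized categories.
unaccounted = matched - sum(status.values())
```

The computed rates reproduce the reported 49.6%, 66.7%, 1.6%, 4.0%, and 27.5%. The four itemized categories sum to 1,504, leaving three matched graduates that the list does not itemize; Table 1 includes a "Substitute teacher" row, which may account for the difference.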

Table 1: Current Status of Steinhardt Teacher Education Graduates Who Have Ever Taught in the NYC Public Schools

Spring 2007 status for graduates from 2001-2006

Current Status at NYC Public Schools    Number    Percent
Full time teacher                        1,005       66.7
Substitute teacher                           3        0.2
Working in another position for NYC         24        1.6
On leave of absence                         60        4.0
Not employed by NYC                        415       27.5
Total                                    1,507      100.0

Table 2: Levels of NYC Schools in Which Steinhardt Graduates Taught in Spring 2007

NYC School Levels

School Level                   Number    Percent
Elementary or middle school       631       63.1
High School                       342       34.2
Special Education                  27        2.7
Total with data                 1,000      100.0
Data missing                        5
Grand total                     1,005

Table 3: Teaching Assignments of Steinhardt Graduates Selected for the Value-Added Modeling Phase of the Study

Teaching Assignments
ENGLISH (Junior High School)
MATHEMATICS (Junior High School)

AACTE Annual Meeting in New York City, February 2007
In February, CRTL shared its ongoing DRSTOS-R work at the Annual Meeting of the American Association of Colleges for Teacher Education (AACTE). We chose to present in an interactive workshop because we wanted the session to be as much about institutional sharing as about dissemination.

Among the 25 attendees, multiple perspectives from the field of teacher education were represented, including supervisors of student teachers, directors of small teacher-education programs, faculty on program design committees, and fieldwork development professionals.

By far the most important theme to emerge from the activity, a modified version of the training we normally do with a special focus on small-group conversation about coming to consensus on domain choices and scores, was that student teacher assessment is a critical topic for the field. Some institutions have begun to tackle its challenges, and responded to the conversation by sharing their policies; others left the session with new insights into issues facing their programs. For our part, CRTL came to understand that this topic is part of a larger professional conversation, and that varied institutions conceptualize and carry out fieldwork in myriad ways. The structure of a teacher-education program has a significant influence on how it assesses its student teachers.

In addition to the workshop on DRSTOS-R, CRTL’s Director gave a PowerPoint presentation at the TNE Learning Network panel session on the assessment of pupil learning growth. The PowerPoint was posted to the TNE website.

NYU Coursework
CRTL followed up the success and richness of the AACTE presentation by piloting the assessment protocol in a graduate-level NYU class, and found similar responses. In both groups, domains, rather than scores, were harder to agree on. Raters’ personal perceptions and definitions of different words and actions can vary widely, and thus influence both the domain to which they assign a certain action and the score they attach. This conversation, in both settings, emphasized the importance of the summative nature of this assessment, rather than the “point-in-time” purpose of some observation protocols. Context is essential to the understanding of the subjective nature of teaching and teaching assessment, which points to the importance of shared understandings and ongoing relationships, such as supervisors looping with the student teacher, in the student teachers’ education.

AERA Annual Meetings
Together with faculty from the Departments of Teaching and Learning and Applied Psychology, CRTL’s Director delivered a paper presentation on the longitudinal follow-up studies and VAM analyses at the Annual Meeting of AERA in Chicago in April 2007, and has submitted a proposal to report new findings from the study at the Annual Meeting in New York in April 2008.

TNE Regional Network
The work on DRSTOS-R was presented at a TNE Regional Network meeting at Montclair State University in May 2007. Member institutions agreed to continue sharing progress on teacher-education student assessment at future meetings.

Work on the DRSTOS-R and the VAM longitudinal study has been incorporated into the agenda of the Partnership for Teacher Education (PTE), a collaborative venture between the New York City Department of Education, NYU, and CUNY to improve education and induction in New York City. The school-university partnerships that have been fostered through PTE have provided a culture of collaboration that has facilitated the research and provided practical applications for the lessons learned. In addition, PTE will provide some financial support to sustain the work that has been seeded by the TNE mini-grant.


This grant has supported two strands of work. The first is research and development on a student teacher assessment that has deep and wide-ranging implications for the ways that we train teachers, and the second is research on the impact of our programs on our students and their pupils, which will have a strong effect on program accountability. While these two strands are certainly related, they are also instructive individually. Often the fieldwork component of a teacher-education program is a capstone or cumulative experience. Given its importance, its assessment needs considerable attention so that it does not become a lost portion of a teacher’s development. The case study we conducted gives us a lens into select classrooms within a progressive urban high school, but not into all high schools, nor even into all progressive or urban schools. The disconnect between the subjective nature of the student teacher’s experience and the objective nature of even the most well-meaning observational assessment highlights the importance of these differences in context.

A general conclusion from this case study is that the limitations of the student teaching assessment endeavor are themselves important. The field has long acknowledged the variability of the student teaching experience; what is essential in this next era of research is to understand the importance of that variability, that is, how the student teaching experience impacts a given student teacher in a particular context. In turn, that knowledge can deeply inform the development of assessments. By seeking to understand the cooperating teachers’ conceptualization of student teaching, and their role within that concept, we can make the ways in which we assess student teaching more cognizant of multiple perspectives and the uniqueness of experience.

The lesson learned from the follow-up study is a hopeful one. While it is difficult to maintain contact with students once they leave the institution, technology provides new ways to keep track of students, and a culture of mutual accountability has resulted in greater willingness for institutions to cooperate and share information. We still must make sure that we are measuring the right constructs with instruments that are reliable and valid for the inferences we wish to make. However, we don’t feel quite so lonely in this daunting endeavor—partnership is crucial to success.


The effects of this project can be witnessed not only in Steinhardt, but also in its relationships with partner institutions. This kind of inquiry, with strands connecting practicing and pre-service teachers in the public schools, teacher-education curricular decisions, and faculty research, results in growth in each area. First, it creates a culture of accountability, where academic goals are tied to practice, and where the needs of the NYC public schools have a prominent place in the teacher-education program. Ongoing data collection from each component of the research will necessitate continued communication and meaningful use of data. For example, as NYU grows its partnership with schools, the role of NYU graduates in mentoring current NYU pre-service teacher candidates can grow. As a result, NYU departments receive frequent and relevant feedback. Further, as more matches are made with past students, the institution will be better able to monitor and revise its programs to meet the needs of the pupils we serve. As our ability to communicate with practicing teachers increases, so does the body of comparative data available to us. What is most representative of the TNE research is the quality of questions and conversations it has engendered, and the direction in which it points our further work.


Danielson, C. (1996). Enhancing Professional Practice: A Framework for Teaching. Alexandria, VA: Association for Supervision and Curriculum Development.

Darling-Hammond, L., and Bransford, J. (2005). Preparing Teachers for a Changing World: What Teachers Should Learn and Be Able to Do. San Francisco, CA: Jossey-Bass.