

Implementing and Analyzing Performance Assessments in Teacher Education

reviewed by Brent Duckor

Title: Implementing and Analyzing Performance Assessments in Teacher Education
Author(s): Joyce E. Many & Ruchi Bhatnagar (Eds.)
Publisher: Information Age Publishing, Charlotte
ISBN: 1641131195, Pages: 270, Year: 2017

This is an ambitious volume that examines the extent to which teacher licensure tools such as edTPA can advance our understanding of high-quality teaching among novices and those who teach, coach, and guide them into the profession.

Without foregrounding the importance of expert-novice distinctions in the meaningful assessment of powerful teaching practices or calling attention to the challenges inherent in standardized assessments that purport to evaluate teacher candidate learning trajectories and progressions, the book nonetheless invites its readers on a journey. The story that unfolds is largely about the development, implementation, and analysis of “lessons learned” about one of the major reform movements in portfolio-based assessment. Tracing the roots of teacher performance assessments (TPAs), the first chapter outlines the history of TPAs and the characteristics of five TPAs currently in use: two national (edTPA and PPAT) and three state (PACT in California, NH-TCAP in New Hampshire, and KPTP in Kansas).

While instructive, this history unfortunately omits important research developments related to the validation of scores and ongoing studies of production data with, for example, the PACT, one of the longest running TPAs in California (Castellano, Duckor, Wihardini, Téllez, & Wilson, 2016; Duckor, Castellano, Téllez, Wihardini, & Wilson, 2014). These omissions are important because the nature and use of TPAs are contested, not just by subjects who take the test but also by experts in the teacher research and psychometric communities. The Chapter One framing of TPA history requires more unpacking for those early on the front lines.

The remaining chapters report research from different stakeholders’ perspectives on edTPA (Chapters Two, Three, Four, Five, Six, Seven, and Ten) as it was piloted by programs across the United States. In terms of research questions, methodology, and results, this volume runs the gamut of quantitative and qualitative study, though it mostly leans toward interviews, document reviews, case studies, and the occasional survey. Chapter Eight, for example, reports on document analysis and analysis of case studies of three first-year teachers who took New Hampshire’s Teacher Candidate Assessment of Performance (NH-TCAP). Chapter Nine describes the history and use of a performance assessment for teacher candidates in Kansas (the KPTP) used since 2009. No chapter substantively addresses TPAs used in California (a launching site for TPAC, which became edTPA), which prepares a sizeable percentage of U.S. K-12 teachers; in 2015-16, for instance, California credential programs prepared 13,300 teachers. In that sense, the impact story of edTPA is largely confined to a convenience sample, which may highlight some early adopter state program experiences at the expense of other, more mature ones.

In the background context of this volume, we are barely reminded that not all agree with standardized TPAs. These texts are largely silent on “enemies” and “critical friends” of the TPA movement (Cochran-Smith, Piazza, & Power, 2013; Dover, 2018; Dover & Schultz, 2016; Petchauer, Bowe, & Wilson, 2018). Yet one senses the authors are struggling in silence with others. Not everyone in teacher education agrees with the move to tie teacher education outcomes to explicitly defined, state-led professional standards and measures, and some worry about mission creep: Has TPA become the boundary marker by which we judge not only pre-service teachers’ skills, proficiency, or competence, but also their mentors, program faculty, coordinators, and communities of practice? The nascent program experiences and institutional bumps in the road chronicled here exemplify these struggles, and are thematic markers to read carefully across these texts.


To illustrate the challenges: In Chapter Two, “From Isolation to a Community of Practice: Redefining the Relationship of Faculty and Adjunct University Supervisors During the Implementation of edTPA,” the authors conclude that the university supervisors’ “participation in a community of practice that centers upon a common goal, in this case the goal of enhancing the development and assessment of teacher candidates, has the potential to dissolve the hegemonic stratification that traditionally infuses university settings” (p. 59). And yet, in Chapter Three, “Faculty Investment in Student Success: A Four-Year Investigation of edTPA Implementation,” the authors use the term covert leadership (Mintzberg, 1998) to emphasize the importance of adopting an administrative approach that motivates, supports, and coordinates faculty activities rather than handling the implementation process through institutional directives.

There are delicate dialectics at play in this volume across each study setting: between acculturation and coercion, hegemony and engagement, transparency and mandated routine. The reader is often unclear whether the studies, presented as a whole, are meant to shed light on the dilemmas of edTPA “early adoption” or whether each study, by itself, merely points to the problem of high-stakes assessment on its subjects generally, setting and maintaining “standards without standardization” in a profession, as McNeil (1986, 2000) might have it.

For this reader, the crux of the book’s import turns on Chapter Nine, “State Education Agency Use of Teacher Candidate Performance Assessments: A Case Study of the Implementation of a Statewide Portfolio-Based Assessment System in Kansas,” which describes the creation and use of the Kansas Performance Teaching Portfolio (KPTP), including its history as an assessment for practicing teachers and the assessment’s evolution into an evaluation used for initial teacher preparation program completion. The authors explore a data analysis tool developed to aid KPTP implementation and improvement, and reflections on lessons learned are “offered to other agencies” embarking on the development of statewide performance assessments, including ways to “leverage partnerships after development and implementation as thoughts turn to systemic improvement and effective data use” (p. xi). This is a climax in the plot where we should all pay attention. It appears TPA scores have consequences, far beyond the teacher candidate. By now, don’t we have a right to assume that validation is a prerequisite for fair use of any standardized assessment, and to ask the authors where the evidence is to support these TPA uses?

Without citing the 2014 Testing Standards (AERA, APA, NCME), which govern fair use of high-stakes score data, or exploring what it means to validate TPA scores for particular consequential uses of data (whether to evaluate programs, redirect resources, even remediate candidates based on sub-scores), the authors in this volume allude to big ideas in standardized assessment but largely ignore them. The TPA community occasionally advertises its tools as educative or formative or even useful. But one wonders: For whom? For whose good? Based on which line of evidence? The notions of validity and reliability receive passing mention in some chapters, but the evidence is thin for their substantive meaning, and the problems raised by TPA score interpretation and data use are not well documented in this volume.

As we wrote of the California PACT in its heyday, when exploring any TPA’s potential for supporting, scaffolding, and guiding programmatic change (beyond pilot study), we must work harder to validate results. Researchers, psychometricians, and teacher educators must come together to help all TPA stakeholders make sense of “their” data and its warranted, appropriate uses. Validation studies are a critical part of the puzzle of unpacking the meaning and dependability of TPA data (Duckor, Castellano, Téllez, Wihardini, & Wilson, 2014), particularly as intended and unintended uses multiply across settings and users in the United States. For example, Kane (1994) noted that each intended interpretation assigned to test scores needs to be supported with evidence. He writes that the interpretations for licensure and certification tests involve “sequences of inferences, or an argument, leading from the test score to decisions about licensure or certification” (p. 133).

The argument-based framework for validation of test results has gained momentum in the last several decades (Cronbach, 1988; Kane, 2013; Haertel & Herman, 2005; Messick, 1989, 1994), as has the concern for the consequences and use of high-stakes test score results for individuals and institutions (Haertel, 2013; Moss, 2013; Shepard, 1993). The renewed focus on the meaning of validity of TPA data contained here is both epistemological and pragmatic.


Yes, most of these studies focus on transitional adoption of the TPA instrument, but this raises the deeper question: What are the limits of data use at scale? Can we misread, beyond studies of whether we like or approve of the instrument in its pilot phases, the deeper meaning of the TPA scores for education reform? Has edTPA in particular incorporated lessons learned from previous TPAs about those limits and the caveats of, for example, subscore interpretation, or has it merely moved the validation goalpost forward, again?

It may seem unfair to ask a volume on Implementing and Analyzing Performance Assessments in Teacher Education to carry more water than it can bear. And yet our profession is at a turning point. Claims matter. Data counts. And consequential uses of TPA data, given limited resources and institutional overload in higher education, as this book amply demonstrates, are moving us in directions that signal and impact what it means to be a teaching professional in the United States. TPA stories are a piece of the teacher education “plot,” foregrounding and shadowing struggles with the standardization movement and corporate forces on the horizon. This volume only hints at those issues and the vexing developments facing our profession.




American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.

Castellano, K., Duckor, B., Wihardini, D., Téllez, K., & Wilson, M. (2016). Assessing academic language in an elementary mathematics teacher licensure exam. Teacher Education Quarterly, 43(1), 1–25.


Cochran-Smith, M., Piazza, P., & Power, C. (2013). The politics of accountability: Assessing teacher education in the United States. The Educational Forum, 77(1), 6–27.


Cronbach, L. J. (1988). Five perspectives on validity argument. In H. Wainer & H. I. Braun (Eds.), Test validity (pp. 3–17). Hillsdale, NJ: Lawrence Erlbaum.


Dover, A. G. (2018). Your compliance will not protect you: Agency and accountability in urban teacher preparation. Urban Education. doi: 10.1177/0042085918795020


Dover, A. G., & Schultz, B. D. (2016). Troubling the edTPA: Illusions of objectivity and rigor. The Educational Forum, 81(1), 95–106.


Duckor, B., Castellano, K. E., Téllez, K., Wihardini, D., & Wilson, M. (2014). Examining the internal structure evidence for the performance assessment for California teachers: A validation study of the elementary literacy teaching event for Tier I teacher licensure. Journal of Teacher Education, 65(5), 402–420.


Haertel, E. H. (2013). Getting the help we need. Journal of Educational Measurement, 50(1), 84–90.


Haertel, E. H., & Herman, J. L. (2005). A historical perspective on validity argument for accountability testing. In J. L. Herman & E. H. Haertel (Eds.), Uses and misuses of data for educational accountability and improvement (pp. 1–34). Malden, MA: Blackwell Synergy.


Kane, M. (1994). Validating interpretive arguments for licensure and certification examinations. Evaluation & the Health Professions, 17(2), 133–159.


Kane, M. (2013). Validation as a pragmatic, scientific activity. Journal of Educational Measurement, 50(1), 115–122.

McNeil, L. M. (1986). Contradictions of control: School structure and school knowledge. New York, NY: Routledge.

McNeil, L. (2000). Contradictions of school reform: Educational costs of standardized testing. New York, NY: Routledge.


Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). New York, NY: Macmillan.


Messick, S. (1994). The interplay of evidence and consequences in the validation of performance assessments. Educational Researcher, 23(2), 13–23.


Mintzberg, H. (1998). Covert leadership: Notes on managing professionals. Harvard Business Review, 76(6), 140–147.


Moss, P. A. (2013). Validity in action: Lessons from studies of data use. Journal of Educational Measurement, 50(1), 91–98.


Petchauer, E., Bowe, A. G., & Wilson, J. (2018). Winter is coming: Forecasting the impact of edTPA on Black teachers and teachers of color. The Urban Review, 50(2), 323–343.


Shepard, L. A. (1993). Evaluating test validity. In L. Darling-Hammond (Ed.), Review of research in education (pp. 405–450). Washington, DC: American Educational Research Association.


Cite This Article as: Teachers College Record, 2018, p. -
http://www.tcrecord.org ID Number: 22521, Date Accessed: 10/19/2018 3:04:43 PM
