White House Conference on Preparing Tomorrow's Teachers
My assignment for this conference was to examine and report on research related to the preparation and professional development of teachers. That is a big topic and there are many ways to organize the scholarship and frame the discussion. I decided to focus on research most relevant to policy. I'm using the word policy to mean a governmental plan stipulating goals and acceptable procedures for pursuing those goals.
The most recent and influential statement of government policy on the preparation and professional development of teachers is the reauthorization of the Elementary and Secondary Education Act (ESEA), signed into law by the President on January 8th of this year.
Title I of ESEA addresses the goal of enhancing academic achievement for disadvantaged children. With respect to teachers, it requires states, beginning this coming school year, to prepare and widely disseminate a report that includes information on the quality of teachers and the percentage of classes being taught by highly qualified teachers in each public school in the state. The framers of this bill defined a "highly qualified teacher" as someone with a bachelor's degree who is licensed to teach on the basis of full state certification or passing the state licensure exam. The bar is raised beyond simple licensure or certification for new teachers: At the elementary school level, a highly qualified new teacher must have passed a test of subject knowledge and teaching skills in reading, writing, and mathematics. At the middle and secondary school level, a highly qualified new teacher must have passed a rigorous exam or have the equivalent of an undergraduate major in each of the subjects he or she teaches. A goal of the bill is for disadvantaged students to have equal access to high quality teachers.
While Title I of ESEA approaches the goal of placing highly qualified teachers in the classroom by mandating pre-service credentials, Title II addresses the same goal by funding in-service professional development for teachers. Many forms and functions of professional development are allowed under Title II. One focus is on increasing teachers' knowledge of the academic subjects they teach through intensive, classroom-focused training. Another focus is on obtaining alignment between professional development activities and student academic achievement standards, state assessments, and state and local curricula.
What do these requirements within ESEA suggest with regard to the framers' assumptions about teacher preparation and professional development, and to what degree are those assumptions supported by research?
These are assumptions I've extracted from the ESEA provisions:
- Teachers matter (otherwise why focus on teachers at all)
- Teachers vary in their quality (otherwise why distinguish highly qualified teachers from others)
- Quality is affected by:
  - General knowledge and ability (otherwise why require a bachelor's degree)
  - Certification and licensure (otherwise why make that a defining feature of being highly qualified)
  - Experience (otherwise why distinguish beginning from experienced teachers)
  - Subject matter knowledge (otherwise why require that beginning teachers have demonstrated through their college major or an examination that they have knowledge of the subject matter they teach)
  - Intensive and focused in-service training (otherwise why provide funds to support such activities)
  - Alignment between teacher training and standards-based reforms (otherwise why require evidence of such alignment in state applications for funding)
Before I describe what research tells us about these assumptions, we need to take a brief side trip into the world of methodology. It is typical in science that a given problem is addressed with multiple methods. The individual methods often ask and answer slightly different questions. In the early stages of research on a topic, the inconsistencies and ambiguities that result from different methods can be frustrating. Witness, for example, the recent flurry of conflicting studies and conclusions on the value of mammography in the prevention of breast cancer. However, conflicting studies and interpretations often spur the next round of investigations, and over time the evidence converges and generates consensus.
Research on teacher preparation and professional development is a long way from the stage of converging evidence and professional consensus. Several approaches to studying the topic are used, and like the proverbial blind men examining different parts of an elephant, each generates a different perspective. I will provide some background knowledge on the different methodological tools as I address the principal policy issues.
Do teachers matter?
The answer may seem so obvious that the question isn't worth asking. One reason is that all of us can generate anecdotes about teachers who have made a difference in our lives. I remember my 11th grade English teacher whose interest in my writing and the books I was reading inspired me to think about careers involving words. But however powerful such personal narratives may seem, we need to remember that in science the plural of anecdote is not evidence. Most undergraduates believe in extrasensory perception and will tell stories about experiencing it. That doesn't mean that extrasensory perception is a fact.
The Coleman study
Contrary to our intuitions and anecdotes about the importance of teachers, the landmark 1966 study, Equality of Educational Opportunity, by sociologist James Coleman, suggested that differences in teachers did not matter much. This was a huge study involving 60,000 teachers in grade 6 and beyond in over 3,000 schools. The principal finding was that nearly all of the variability in how students achieved was attributable to their socioeconomic background rather than to the schools they attended. On the subject of teacher attributes, Coleman wrote, "A list of variables concerning such matters as teachers' scores on a vocabulary test, their own level of education, their years of experience, showed little relation to achievement of white students, but some for Negroes.... Even so, none of these effects was large."
Coleman's methodology is now understood to have been seriously flawed. All of his analyses were conducted on data that had been aggregated to the school level. For example, the average vocabulary score for all teachers in a school was related to the average test score for all children in a school. Researchers now understand that aggregating data in this way can distort findings. I am reminded of the man who had his head in the oven and his feet in the freezer but whose temperature, on average, was just right. If you average together the effective teachers with the ineffective teachers, and the high performing students with the low performing students, you don't get to see the cold and hot spots where teacher characteristics might make a difference.
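The aggregation problem can be made concrete with a small sketch. The numbers below are invented for illustration and are not Coleman's data: two hypothetical schools each have one teacher with a high vocabulary score and one with a low score, and classroom achievement tracks the teacher's score within each school. Once the data are averaged to the school level, both schools look identical and the relationship disappears.

```python
from math import sqrt
from statistics import mean

# Hypothetical rows: (school, teacher vocabulary score, classroom mean achievement)
classrooms = [
    ("A", 90, 70), ("A", 50, 40),
    ("B", 90, 72), ("B", 50, 38),
]

def pearson(xs, ys):
    """Pearson correlation, or None when either variable has no variation."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)

def aggregate_by_school(rows):
    """Average teacher scores and achievement within each school."""
    schools = {}
    for school, teacher_score, achievement in rows:
        schools.setdefault(school, []).append((teacher_score, achievement))
    return [
        (mean(t for t, _ in pairs), mean(a for _, a in pairs))
        for pairs in schools.values()
    ]

# Classroom-level analysis: teacher score and achievement move together.
classroom_r = pearson([r[1] for r in classrooms], [r[2] for r in classrooms])

# School-level analysis: every school has the same averages, so there is
# no variation left to relate, and the correlation cannot be computed.
school_rows = aggregate_by_school(classrooms)
school_r = pearson([t for t, _ in school_rows], [a for _, a in school_rows])

print("classroom-level r:", classroom_r)
print("school-level r:", school_r)
```

In this contrived case the classroom-level correlation is close to 1, while the school-level analysis has nothing to work with. Real data are rarely this extreme, but the direction of the distortion is the same.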
Recent multi-level studies
More recent studies in the tradition of Coleman's work have analyzed multilevel data that goes down to individual classrooms and students. Statistical techniques are used to apportion differences in children's academic achievement among the different environments that are assumed to affect their learning and development. Such studies typically parse out the influence of the individual abilities and knowledge the child brings to the classroom, the classroom itself, and the characteristics of the school in which that classroom is housed. With enough children and teachers and schools, and with some fancy statistics, it is possible to estimate the relative contribution of each of these factors to the differences that are observed among children in academic achievement. These studies generate much higher estimates of the relative influence of teachers and schooling on academic achievement than reported by Coleman.
The pie chart that follows reflects findings from a recent scholarly review of this literature (Scheerens & Bosker, 1997). Roughly 20% of the differences in student achievement is associated with the schools children attend, another 20% is associated with individual classrooms and teachers, and the remaining 60% is associated with differences among the children in each classroom, including the effects of their prior achievement and their socioeconomic background.
Note two things about these multilevel studies. First, they are only able to indicate the relative contribution of teachers to academic achievement, not the mechanisms by which teachers affect student learning. Thus, we find that teachers are important, but not why. Second, because the data are collected at a single point in time, the influence of teachers may be substantially underestimated. This is because the 60% effect attributable to students in the pie chart includes the effects of instruction in previous grades. Some children in a given class will have had an effective teacher the previous year and some will have had an ineffective teacher. But we can't see these influences if the children are measured only at one point in time. These unmeasured effects of previous teachers get folded into the unexplained differences among children in the same classroom. This increases the estimated influence of children compared to teachers and schools.
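The logic of apportioning differences among schools, classrooms, and students can be sketched with a simple sum-of-squares decomposition. This is a simplification of the multilevel models used in the studies reviewed, and the data are hypothetical: they were constructed so that the shares come out to the 20/20/60 split described above, not taken from Scheerens and Bosker.

```python
from statistics import mean

# Hypothetical, balanced data: 2 schools x 2 classrooms x 6 students.
# Each score is a base of 50 plus a school, classroom, and student effect.
school_effects = {"S1": -5, "S2": 5}
class_effects = {"c1": -5, "c2": 5}
student_effects = [-10, -10, -5, 5, 10, 10]

# Each record: (school, (school, classroom), score)
records = [
    (school, (school, cls), 50 + s_eff + c_eff + st_eff)
    for school, s_eff in school_effects.items()
    for cls, c_eff in class_effects.items()
    for st_eff in student_effects
]

def group_means(rows, key_index):
    """Mean score for each group identified by the field at key_index."""
    groups = {}
    for row in rows:
        groups.setdefault(row[key_index], []).append(row[2])
    return {key: mean(scores) for key, scores in groups.items()}

grand_mean = mean(score for _, _, score in records)
school_mean = group_means(records, 0)
class_mean = group_means(records, 1)

# Split each student's deviation from the grand mean into three parts.
ss_school = sum((school_mean[s] - grand_mean) ** 2 for s, _, _ in records)
ss_class = sum((class_mean[c] - school_mean[s]) ** 2 for s, c, _ in records)
ss_student = sum((score - class_mean[c]) ** 2 for _, c, score in records)
total = ss_school + ss_class + ss_student

shares = {
    "schools": ss_school / total,
    "classrooms": ss_class / total,
    "students": ss_student / total,
}
print(shares)
```

Note that the student share here is simply whatever is left over within classrooms. That is why the effects of earlier teachers, which are not measured in single-point-in-time data, end up counted as differences among children.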
Value-added methods are a new and more powerful way of addressing the question of whether teachers matter. Value-added methods examine students' gains from year to year rather than their scores at a single point in time. Teachers who are adding value to student achievement will be those whose students gain most over the school year. Thus if a math teacher has children who start the year at the 95th percentile and end the year at the 90th percentile, she would not be considered an exemplary teacher even if the performance of her students was the highest in the district. In contrast, a teacher who raised her students' performance from the 45th to the 60th percentile over the course of a year would be deemed very effective even if her children performed below the average in the district. Value-added methods require that children be followed longitudinally, i.e., the same children must be tested each year and identified uniquely in the resulting database.
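As a minimal sketch of the idea, the two teachers in the example above can be compared on average gain rather than on end-of-year level. The student records are hypothetical, and real value-added models involve considerably more statistical adjustment than this simple gain score.

```python
from statistics import mean

# Hypothetical records: (teacher, start-of-year percentile, end-of-year percentile)
records = [
    ("Teacher A", 96, 91), ("Teacher A", 94, 89),
    ("Teacher B", 44, 59), ("Teacher B", 46, 61),
]

def summarize(rows):
    """Return average end-of-year level and average gain for each teacher."""
    by_teacher = {}
    for teacher, start, end in rows:
        by_teacher.setdefault(teacher, []).append((start, end))
    return {
        teacher: {
            "end_level": mean(end for _, end in pairs),
            "gain": mean(end - start for start, end in pairs),
        }
        for teacher, pairs in by_teacher.items()
    }

summary = summarize(records)
for teacher, stats in summary.items():
    print(teacher, stats)

# Ranking by level and ranking by gain point in opposite directions.
best_by_level = max(summary, key=lambda t: summary[t]["end_level"])
best_by_gain = max(summary, key=lambda t: summary[t]["gain"])
```

Teacher A's students finish at the 90th percentile but lost ground over the year, while Teacher B's students finish at the 60th percentile after gaining 15 points. The gain calculation is only possible because each child's score can be linked from one test to the next.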
Sanders and Rivers (1996) used value-added methods to examine the cumulative effects of teacher quality on academic achievement. The effectiveness of all math teachers in grades 3, 4, & 5 in two large metropolitan school districts in Tennessee was estimated by determining the average amount of annual growth of the students in their classrooms. These data were used to identify the most effective (top 20%) and the least effective (bottom 20%) teachers. The progress of children assigned to these low and high performing teachers was tracked over a three-year period. The next figure illustrates the results.
Children assigned to three effective teachers in a row scored at the 83rd percentile in math at the end of 5th grade, while children assigned to three ineffective teachers in a row scored at the 29th percentile.
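The classification step in this kind of study can be sketched as follows. The teacher names and average gains are hypothetical, and the function is an illustration of sorting teachers into top and bottom fifths by average student growth, not the procedure Sanders and Rivers actually used.

```python
def label_teachers(avg_gain, fraction=0.2):
    """Label the top and bottom `fraction` of teachers by average student gain."""
    ranked = sorted(avg_gain, key=avg_gain.get)  # lowest gain first
    k = max(1, round(len(ranked) * fraction))
    labels = {teacher: "middle" for teacher in ranked}
    for teacher in ranked[:k]:
        labels[teacher] = "least effective"
    for teacher in ranked[-k:]:
        labels[teacher] = "most effective"
    return labels

# Hypothetical average annual gains for ten teachers.
avg_gain = {
    "T01": 2, "T02": 9, "T03": 5, "T04": -3, "T05": 7,
    "T06": 4, "T07": 12, "T08": 1, "T09": 6, "T10": 3,
}

labels = label_teachers(avg_gain)
most = sorted(t for t, label in labels.items() if label == "most effective")
least = sorted(t for t, label in labels.items() if label == "least effective")
print("most effective:", most)
print("least effective:", least)
```

With ten teachers, the top 20% and bottom 20% are two teachers each. Tracking children who pass through three teachers from the same group in successive years is what produces the cumulative differences shown in the figure.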
The next figure illustrates results from an equivalent study on math performance in Dallas (Jordan, Mendro, & Weerasinghe, 1997). The results are very similar.
Understand that these studies overestimate the actual effect of teachers on academic achievement because the assignment of students to teachers from year to year is essentially random, at least in elementary school (Rowan, 2002). The typical child is not lucky enough to get 3 highly effective teachers or unlucky enough to get 3 highly ineffective teachers in a row. However, these studies demonstrate persuasively that the potential effect of teacher quality on academic achievement is quite high.
In summary, we now know that Coleman was wrong: Teachers do matter, as our anecdotal experiences suggest and as Congress assumed when it reauthorized ESEA and authorized $3 billion annually for teacher training and professional development. Whew!
Characteristics of effective teachers
Given that teachers are important, the important research task is to identify the characteristics that distinguish quality teachers and to determine how those characteristics can be enhanced. Let's go through the characteristics assumed to be important in ESEA and take a look at the related research.
Certification and licensure
The issue of certification has generated more heat than light. You would think it would be simple to compare student achievement for certified versus uncertified teachers, but it is not. One reason is that states typically require some form of certification or licensure for a teacher in the public schools within some period of time after the teacher begins employment. Thus teachers without certification are typically inexperienced beginners. That means that simple comparisons of certified versus uncertified teachers are biased by differences in experience and age. Second, the issue of certification is often confused with the issue of alternative certification, which is a route to a teaching license that bypasses some of the undergraduate coursework requirements in education. Sometimes arguments for or against alternative certification are made on the basis of comparisons of teachers with certificates, including alternative certificates, with teachers working with provisional or temporary licenses. Third, the issue of certification is often confused with the issue of out-of-field teaching. Generally, out-of-field teachers, e.g., someone with a degree in English who is teaching math, are certified. Arguments for or against certification based on comparing out-of-field and in-field teaching are thus inappropriate. Fourth, the definitions and requirements for licensure and certification differ substantially from state to state, and sometimes within jurisdictions within the same state. These differences make it difficult to know exactly what is being compared when data are aggregated across states and jurisdictions.
With those caveats in mind, my reading of the research is that the evidence for the value of certification in general is equivocal at best. For example, Goldhaber and Brewer (1998) analyzed data from over 18,000 10th graders who participated in the National Education Longitudinal Study of 1988. After adjusting for students' achievement scores in 8th grade, teacher certification in 10th grade was not significantly related to test scores in 10th grade. In another study, notable because it uses experimental logic rather than the correlational approaches that dominate study of this topic, Miller, McKenna, and McKenna (1998) matched 41 alternatively trained teachers with 41 traditionally trained teachers in the same school. There were no significant differences in student achievement across the classrooms of the two groups of teachers.
A study by Darling-Hammond (1999) stands in contrast to the many studies that find no effects or very small effects for teacher certification. She related scores on the National Assessment of Educational Progress at the state level to the percentage of well qualified teachers in each state. "Well qualified" was defined as a teacher who was fully certified and held the equivalent of a major in the field being taught. For generalist elementary teachers, the major had to be in elementary education; for elementary specialists, the major had to be in content areas such as reading, mathematics, or special education. Darling-Hammond reported that teacher qualifications accounted for approximately 40 to 60 percent of the variance across states in average student achievement levels on the NAEP 4th and 8th grade reading and mathematics assessment, after taking into account student poverty and language background.
Although this study is frequently cited, the approach of aggregating data at the level of the state is seriously problematic. It goes backwards in terms of aggregation from the work of Coleman whose findings are considered suspect because the analyses were of data at the school level. Students do not experience a teacher with the average level of certification in a state; they experience a teacher who is or is not certified. The aggregation bias may account for Darling-Hammond's estimates of the effects of certification being light years out of the range of effects that have been reported by all other studies of this topic.
Subject matter knowledge
The effects of teacher training on academic achievement become clearer when the focus becomes subject matter knowledge as opposed to certification per se. The research is generally consistent in indicating that high school math and science teachers with a major in their field of instruction have higher achieving students than teachers who are teaching out-of-field (e.g., Brewer & Goldhaber, 2000; Monk, 1994; Monk & King, 1994; Rowan, Chiang, & Miller, 1997). These effects become stronger in advanced math and science courses in which the teacher's content knowledge is presumably more critical (Monk, 1994; Chiang, 1996).
The best studies, including the ones cited here, control for students' prior achievement and socio-economic status. Studies that simply report the association between teachers' undergraduate majors and student achievement are difficult to interpret. For instance the year 2000 National Assessment of Educational Progress in math reports that eighth-graders whose teachers majored in mathematics or mathematics education scored higher, on average, than 8th graders whose teachers did not major in these fields. However, there are many interpretations of this simple association, including a well-documented rich-get-richer process in which students with higher math abilities are assigned to classes taught by better trained teachers.
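The rich-get-richer problem can be shown with a small hypothetical example. In the invented data below, a teacher's major has no effect at all: every student scores according to prior achievement. But because higher-achieving students are mostly assigned to teachers with a math major, the simple comparison shows a gap. Comparing within prior-achievement groups is a crude stand-in for the statistical controls used in the better studies.

```python
from statistics import mean

# Hypothetical records: (prior achievement band, teacher has math major, score)
students = [
    ("high", True, 80), ("high", True, 80), ("high", True, 80), ("high", False, 80),
    ("low", True, 60), ("low", False, 60), ("low", False, 60), ("low", False, 60),
]

def gap(rows):
    """Mean score with a math-major teacher minus mean score without."""
    with_major = [score for _, major, score in rows if major]
    without_major = [score for _, major, score in rows if not major]
    return mean(with_major) - mean(without_major)

# Simple association, ignoring prior achievement.
raw_gap = gap(students)

# The same comparison made separately within each prior-achievement band.
adjusted_gaps = {
    band: gap([row for row in students if row[0] == band])
    for band in ("high", "low")
}

print("raw gap:", raw_gap)
print("gap within bands:", adjusted_gaps)
```

The raw gap of 10 points vanishes once students with similar prior achievement are compared with each other, which is why the simple NAEP association cannot by itself be read as an effect of the teacher's major.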
Interestingly, the 2000 NAEP finds no relationship between math scores at 4th grade and teachers' major. Likewise, Rowan (2002) using a different dataset found no relationship in elementary school between certification in math and student achievement in math, and no relationship between having a degree in English and student achievement in reading. These findings suggest that subject matter knowledge in these areas as currently transmitted to teachers-in-training by colleges of education is not useful in the elementary school classroom.
General knowledge and ability
The most robust finding in the research literature is the effect of teacher verbal and cognitive ability on student achievement. Every study that has included a valid measure of teacher verbal or cognitive ability has found that it accounts for more variance in student achievement than any other measured characteristic of teachers (e.g., Greenwald, Hedges, & Laine, 1996; Ferguson & Ladd, 1996; Kain & Singleton, 1996; Ehrenberg & Brewer, 1994).
This is troubling when joined with the finding that college students majoring in education have lower SAT and ACT scores than students majoring in the arts and sciences. For example, among college graduates who majored in education, 14% had SAT or ACT scores in the top quartile, compared to 26% of those who majored in the social sciences and 37% of those who majored in mathematics, computer science, or natural science. In addition, those who did not prepare to teach but became teachers were much more likely to have scored in the top quartile (35 percent) than those who prepared to teach and became teachers (14 percent) (NCES, 2001).
Experience
In general, studies of the effects of teacher experience on student achievement suggest a positive effect. For instance, Rowan (2002) found a significant effect of teaching experience on reading and math outcomes in elementary school, with larger effects for later elementary school than early elementary school. Likewise, Greenwald, Hedges, and Laine (1996), in their large meta-analysis of the literature on school resources and student achievement, found significant effects of teacher experience.
Master's degrees
Many districts and states provide incentives for teachers to return to the classroom to obtain advanced degrees in education. The bulk of evidence on this policy is that there are no differential gains across classes taught by teachers with a master's degree or other advanced degree in education compared to classes taught by teachers who lack such degrees.
Intensive and focused in-service training
Although the literature on professional development is voluminous, there are only a few high quality studies relating teacher professional development experiences to student outcomes. Recommendations for "high quality" professional development tend to emphasize the importance of more intense, content-focused experiences (i.e., not one-day generic workshops), as well as more opportunities for peer collaboration and more structured induction experiences for new teachers. These recommendations are reasonable, but are supported by little more than anecdotal evidence, inferences based on theories of learning, and survey data indicating that teachers feel they get more from such experiences than from typical workshops.
One relatively strong study supporting the value of focused professional development is by Cohen and Hill (2000). These investigators compared the effects of teacher participation in professional development specifically targeted to a mathematics education reform initiative in California with the effects of teacher participation in special topics and issues workshops that were not linked to the content of the mathematics initiative (e.g., workshops in techniques for cooperative learning). The more time teachers spent in targeted training on the framework and curriculum of the mathematics reform, the more their classroom practice changed in ways that were consistent with the mathematics reform, and the more they learned about the content and standards for that reform. Teachers who participated in special topics and issues workshops showed no change in their classroom practice or knowledge related to the reform. Teachers who participated in the focused training and whose classroom practice moved towards incorporating the framework of the new math initiative had students who scored higher on a test of the math concepts imparted by the new curriculum.
This study and a couple of others (Wiley and Yoon, 1995; Brown, Smith, and Stein, 1996; and Kennedy, 1998) suggest that when professional development is focused on academic content and curriculum that is aligned with standards-based reform, teaching practice and student achievement are likely to improve.
Summary of the effects of teacher characteristics on student achievement
The figure that follows attempts to summarize the relative strength of each of the dimensions of teacher quality I have reviewed. The heights of the bars in the graph should not be taken as exact or specific to any particular research study. Rather they are intended simply to summarize graphically the conclusions I have drawn in the preceding narrative.
All of the research reviewed to this point is correlational in nature and focuses on differences across teachers. The history of this line of research flows from attempts to demonstrate that teachers and classrooms make a difference, to determining how much of a difference they make, to trying to identify characteristics of teachers that contribute to those differences. Within psychology, this is called differential psychology or the study of individual differences.
There is another tradition within psychology that is relevant to attempts to improve teacher quality. That is the experimental tradition. It looks not for individual differences among teachers but for interventions that raise the effectiveness of all teachers. These are called main effects. Unfortunately experimental methods have not yet found their way to research on teacher training. Even so there are data of a weaker nature that suggest experiences and policies that can produce main effects, i.e., can raise the performance of all teachers and through them the achievement of all students. These data demonstrate the effects of the contexts in which teachers work. There are many dimensions to the context of teaching. Here I focus on the components of standards-based educational reform that are embodied in the ESEA reauthorization and the ongoing practice of many states. These components are: 1) learning standards for each academic subject for each grade, 2) assessments that are aligned to those standards, and 3) provisions for holding educators accountable for student learning. For standards-based reform to work there is reason to think that two additional components are necessary: 1) teachers must be provided with curriculum that is aligned with the standards and assessments; and 2) teachers must have professional development to deliver that curriculum.
We can see the effect of curriculum in the next figure. Three schools in Pittsburgh that were weak implementers of a standards-based math curriculum were compared with three schools with similar demographics that were strong implementers. Note that racial differences were eliminated in the strong implementation schools, and that performance soared. There is no reason to believe that any of the individual differences in teachers previously described, such as cognitive ability or education, differed among the weak implementation schools versus the strong implementation schools. Yet the teachers in the strong implementation schools were dramatically more effective than teachers in the weak implementation schools. Thus a main effect of curriculum implementation swamped the effects of individual differences in background among teachers.
We see this effect on a larger scale in a database developed by the American Institutes for Research under contract to the U.S. Department of Education. The database includes academic achievement data and demographic data on each school in 48 different states that have their own assessment system. The Education Trust has analyzed the data to ask how many high-poverty and high-minority schools have high student performance. They have identified 4,577 high-flying schools nationwide that are in the top third of poverty in their state and also in the top third of academic performance. Whatever these schools are doing to perform so well, and we need to understand that better than we do now, it is very unlikely that they have teachers who are dramatically different from teachers in less effective schools on the individual differences previously surveyed. Again, there is a main effect, something going on in the school as a whole that affects the practice of all teachers in the school, and raises student achievement accordingly.
The next table examines main effects at a higher level, in this case for states. Here we see 4th grade math gains on the National Assessment of Educational Progress for African Americans between 1992 and 1996 for the United States as a whole and for three states (Massachusetts, Texas, and Michigan) that beat the national increase by a substantial margin.
United States: + 8
Texas: + 13
Michigan: + 13
The next figure continues this same theme by demonstrating how North Carolina outpaced the United States as a whole in gains in 4th grade reading between 1992 and 1998.
[Figure: gains in 4th grade reading, 1992 to 1998, United States compared with North Carolina]
Again, something is going on that generates better performance from all teachers regardless of the individual differences in education and cognitive abilities they bring to the classroom.
Putting it all together
Summarizing the material reviewed, we see that teachers matter and differ in effectiveness. The most important influence on individual differences in teacher effectiveness is teachers' general cognitive ability, followed by experience and content knowledge. Master's degrees and accumulation of college credits have little effect, while specific coursework in the material to be taught is useful, particularly in more advanced subjects. Specific, curriculum-focused and reform-centered professional development appears to be important to effective instruction. Context studies tell us that all teachers can do a better job when supported by good curriculum, good schools, and good state policy. With the exception of the role of certification, these research findings align well with the provisions of ESEA.
There is an irony in demonstrating that teachers are important by showing that students' academic achievement is dependent on the teachers they are assigned. In other fields, substantial variation in performance among professionals delivering the same service is seen as a problem to be fixed. For example, we would not tolerate a system in which airline pilots varied appreciably in their ability to accomplish their tasks successfully, for who would want to be a passenger on the plane with the pilot who is at the 10th percentile on safe landings? Yet the American system of public education is built on what Richard Elmore has called the ethic of atomized teaching: autonomous teachers who close the doors to their classrooms and teach what they wish as they wish. The graphs from the value-added studies tell us what happens when a child has the bad luck to be assigned to a teacher whose approach doesn't work. Variation in teacher effectiveness needs to be reduced substantially if our schools are going to perform at high levels.
There are three routes to that goal suggested by the research I have reviewed. First, we can be substantially more selective in the cognitive abilities that are required for entry into the teaching profession. Second, we can provide pre-service and in-service training that is more focused on the content that teachers will be delivering and the curriculum they will be using. Third, we can provide much better contexts for teachers to do their work. One important context is in the form of systems that link and align standards, curricula, assessment, and accountability. These policy directions are not conceptually incompatible, but each requires resources. We need better research to inform policy makers on the costs and benefits of each approach.
We are at the beginning of an exciting new period in teaching, one in which previous assumptions and ways of doing business will be questioned. As we build a solid research base on this topic, one that is more specific and experimental than we have currently, we should be much better able to provide effective instruction for all children. My hope and expectation is that when my sons have children in school they will not have to experience the anxieties nor engage in the machinations my wife and I went through each year as we tried to get our children assigned to what we believed were the best teachers in the next grade. Individual differences in teachers will never go away, but powerful instructional systems and new, effective forms of professional development should reduce those differences to the point that every teacher should be good enough so that no child is left behind.
Brown, C., Smith, M., & Stein, M. (1996, April). Linking Teacher Support to Enhanced Classroom Instruction. Paper presented at the American Educational Research Association, New York, NY.
Chiang, F.S. (1996). Teacher's ability, motivation and teaching effectiveness. Unpublished doctoral dissertation, University of Michigan, Ann Arbor.
Coleman, J. et al. (1966). Equality of Educational Opportunity, Washington D.C.: Government Printing Office.
Darling-Hammond, L. (1999) Teaching and Knowledge: Policy issues posed by alternate certification for teachers. Seattle, Washington: Center for the Study of Teaching and Policy, University of Washington.
Ehrenberg, R. & Brewer, D. (1994) Do School and Teacher Characteristics Matter? Evidence From High School and Beyond. Economics of Education Review, 14: 1-23.
Ferguson, R. & Ladd, H. (1996) How and Why Money Matters: An Analysis of Alabama Schools. In H. Ladd (ed.), Holding Schools Accountable. Washington, DC: Brookings Institution: 265-298.
Goldhaber, D., & Brewer, D. (1998) Why should we reward degrees for teachers. Phi Delta Kappan, October 1998: 134-138.
Greenwald, R., Hedges, L., & Laine, R. (1996) The Effect of School Resources on Student Achievement. Review of Educational Research, 66, 361-396.
Jordan, H.R., Mendro, R., & Weerasinghe, D. (1997) Teacher effects on longitudinal student achievement: A preliminary report on research on teacher effectiveness. Paper presented at the National Evaluation Institute, Indianapolis, IN.
Kain, J. & Singleton, K. (1996) Equality of Education Revisited. New England Economic Review, May-June: 87-111
Kennedy, M. (1998, April) Form and Substance in Inservice Teacher Education. Paper presented at the annual meeting of the American Educational Research Association, San Diego, CA.
Miller, J., McKenna, M., & McKenna, B. (1998) A comparison of alternatively and traditionally prepared teachers. Journal of Teacher Education, 49, 165-176
Monk, D. H. (1994). Subject Area Preparation of Secondary Mathematics and Science Teachers and Student Achievement. Economics of Education Review, 13, 125-145
Monk, D., & King, J. (1994) Multilevel Teacher Resource Effects on Pupil Performance in Secondary Mathematics and Science. In Ronald G. Ehrenberg (ed.), Choices and Consequence. Ithaca NY: ILR Press.
National Center for Education Statistics (2001). The Condition of Education, 2001. Washington, DC: US Department of Education.
Rowan, B. (2002) What Large-Scale, Survey Research Tells Us About Teacher Effects on Student Achievement: Insights From the Prospects Study of Elementary Schools. Ann Arbor: University of Michigan (unpublished).
Rowan, B., Chiang, F.S., & Miller, R. J. (1997). Using Research on Employee's Performance to Study the Effects of Teacher on Students' Achievement. Sociology of Education, 70, 256-284.
Sanders, W., & Rivers, J. (1996, November) Cumulative and Residual Effects of Teachers on Future Student Academic Achievement. Knoxville, TN.: University of Tennessee Value-Added Research and Assessment Center.
Scheerens, J. & Bosker, R. (1997). The Foundations of Educational Effectiveness. New York: Pergamon.
Wiley, D., & Yoon, B. (1995) Teacher Reports of Opportunity to Learn: Analyses of the 1993 California Learning Assessment System. Educational Evaluation and Policy Analysis, 17, 355-370.