Do Teachers Matter?

A psychology study hit the headlines last Friday under the banner ‘Teacher quality makes little difference, study shows’.

AN AUSTRALIAN study has cast doubt over the “teacher effect”, by suggesting differences between teachers play only a minor role in how well a child will learn.

The global study, led by the University of New England, monitored 500 pairs of identical twins during their first three years of school.

They were divided into two groups: twins who shared the same teacher, and siblings who were split across different classrooms.

The same genetics and home life ensured the twins had the same ability to learn. The study was designed to reveal any differences linked to their teacher alone.

Professor Brian Byrne said the finding contradicted the views of educationalists who claimed teacher quality could account for a variance of up to 40 per cent in a child’s learning outcome.

“Our study shows … the ‘teacher effect’ on differences in children’s acquisition of literacy skills in the early years of schooling is no more – and sometimes less – than 8 per cent,” Professor Byrne said.

“This result is certainly in the direction you would expect if teachers do ‘make a difference’ but it is not very large.”

Curious to learn more, I wrote to the lead author, Professor Brian Byrne at UNE. Within a few hours, Brian had emailed me back a copy of the study. He has asked me not to post the text, since it is still forthcoming at the Journal of Educational Psychology. Given that teacher effects are a pretty important policy issue, I thought it might be worth posting a handful of thoughts on it.

A multi-country study that follows twins is very cool. The logistical work required to pull together something like this (including administering bespoke tests) is extremely impressive.
In the study, though not so much in the ensuing media coverage, the authors acknowledge that they cannot separate teacher effects from classroom effects. This is important, since the headline from these results could just as easily have been “class size doesn’t matter”.
The authors spend quite a bit of time discussing the education literature on teacher quality, but less on the economics literature on teacher quality. Economists have indeed separated class size effects from teacher quality effects, and their estimates of the share of variance explained by teachers range from around 2-5% in one study to as much as 16% in another.
Numbers like 2%, 8% and 16% may sound small, but I don’t think we should be surprised to learn that family background is very important. The question is: are teacher quality effects large enough to affect children’s life chances? Two examples: (1) these numbers imply that a 90th percentile teacher is twice as effective as a 10th percentile teacher; and (2) according to my own Australian results on the dispersion of teacher quality, assuming that the impact of having a more effective teacher persists over time, and that Indigenous children typically get teachers at the 25th percentile, these results imply that the black-white test score gap in Australia could be closed in five years by giving all Indigenous pupils teachers at the 75th percentile.
While twins represent a novel way of decomposing teacher effects, they do have their drawbacks. In particular, one might think that having a same-aged sibling creates the possibility of spillovers: if you have a lousy teacher and your twin brother has a great one, wouldn’t you use his notes when doing homework? If true, this will bias downwards the estimated share of variance explained by teacher differences.

I emailed the above comments to Brian last night, who speedily responded (on his birthday!) with the following.

Brian Byrne: Comments on Andrew Leigh’s comments

Andrew has graciously invited me to comment on his notes about our research on “teacher effects” and its media presentation. I agree with him that this important issue deserves an airing because questions of public policy such as teacher preparation and assignment and school “league tables” are best considered in the light of relevant available evidence.

What I do here is cite some of Andrew’s comments and in turn comment on them.

There are a couple of factual matters:

The global study, led by the University of New England, monitored 500 pairs of identical twins during their first three years of school.

For the record, there are indeed about 500 pairs of identical twins in the study, plus the same number of fraternal twins. Our analyses for the paper in question employed both types, with similar results, but the UNE publicity person felt, correctly in my view, that readers and listeners would find it easier to grasp the case of identical twins, where the genetics can’t be a source of differences within twin pairs. In addition, we didn’t include our 250 sets of Scandinavian twins in these analyses because with just a few exceptions these twin pairs were kept together in school and so as a sample were not informative for the issue at stake. We had about 720 pairs in total in the analyses.

They (the twins) were divided into two groups: twins who shared the same teacher, and siblings who were split across different classrooms.

We, the researchers, didn’t divide the children. Parents and/or schools made those decisions; fortuitously, over the three years we followed the children, about half were in the same class as their co-twin and about half were in different classes.

Now, to other matters:

“Our study shows … the ‘teacher effect’ on differences in children’s acquisition of literacy skills in the early years of schooling is no more – and sometimes less – than 8 per cent,”

In case you’re interested, that figure comes from the difference, on average, between the intraclass correlation coefficients for reading and spelling results from same- and different-class twins. The average difference, across both country samples, all tests and zygosity, is about .08, which can be read directly as the proportion of variance (8 per cent) accounted for by classroom-level effects, that is, by the difference in what behaviour-geneticists refer to as unique, or non-shared, environment. For example, and from memory, the intraclass correlation between monozygotic twins who have the same teacher is about .85; for those with different teachers, about .78. The correlations for dizygotic twins are both lower than these, at about .54 and .46 (that’s the evidence for a substantial genetic effect on early reading), but the difference is about the same in absolute terms.
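To make the arithmetic concrete, here is a minimal Python sketch of that subtraction. It uses only the approximate, from-memory correlations quoted above, so it is an illustration rather than a reproduction of the paper's analysis; the names `intraclass_r` and `classroom_share` are made up for this example.

```python
# Illustrative only: approximate "from memory" correlations quoted in the post,
# not the full set of results from the paper.
intraclass_r = {
    "monozygotic": {"same_teacher": 0.85, "different_teacher": 0.78},
    "dizygotic": {"same_teacher": 0.54, "different_teacher": 0.46},
}


def classroom_share(same_r, different_r):
    """Drop in twin similarity when co-twins have different teachers.

    Read as the share of variance attributable to classroom-level effects.
    """
    return same_r - different_r


# One difference per zygosity group, rounded to two decimals for display.
shares = {
    zygosity: round(classroom_share(r["same_teacher"], r["different_teacher"]), 2)
    for zygosity, r in intraclass_r.items()
}
mean_share = round(sum(shares.values()) / len(shares), 3)

print(shares)      # difference in correlations for each twin type
print(mean_share)  # simple average across the two twin types
```

The two differences come out at about .07 and .08, which is consistent with the "about .08" average, bearing in mind that the published figure averages over more tests and samples than these two illustrative pairs.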

I say “and sometimes less” because, as Andrew points out, we cannot separate teacher influence from other processes operating at the classroom level. Those processes might include class size (though as I read the literature, within reasonable limits size doesn’t matter much), but include others as well. Herb Marsh, now at Oxford, and others, have identified “classroom climate,” which is independent of the teacher, as influential. The issue is important because to the degree that classroom-level effects are not teacher effects the implications for practice are less clear-cut (practices like teacher rewards, the effects of cumulative exposure to supposed “good” and “bad” teachers, assigning 75th percentile teachers to disadvantaged groups, and so on). It is a very challenging research question to separate teacher from other classroom effects, but an important one.

The question is: are teacher quality effects large enough to affect children’s life chances?

In our article we give an example, similar to Andrew’s helpful ones, of the effect of having a 75th percentile teacher versus a 25th percentile one: about one-third of a standard deviation in reading in a year. But remember that the interpretation of this kind of scenario as down to teachers depends on classroom effects being completely teacher effects.
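One way to see how a variance share of about 8 per cent could translate into a gap of this size is the sketch below. It rests on assumptions that are not taken from the article: that classroom-level effects are normally distributed, that the whole 8 per cent is a teacher effect, and that test scores are scaled to a standard deviation of 1. The function name `percentile_gap_in_sd` is hypothetical, and this is not necessarily the calculation the authors used.

```python
from math import sqrt
from statistics import NormalDist


def percentile_gap_in_sd(variance_share, upper=0.75, lower=0.25):
    """Gap in student outcomes (in student-level SDs) between two teacher percentiles.

    Assumes teacher effects are normally distributed with variance equal to
    `variance_share` of the total outcome variance (an assumption for illustration).
    """
    sd_of_teacher_effects = sqrt(variance_share)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    z_gap = standard_normal.inv_cdf(upper) - standard_normal.inv_cdf(lower)
    return z_gap * sd_of_teacher_effects


gap = percentile_gap_in_sd(0.08)
print(round(gap, 2))
```

Under these assumptions the gap comes out a little above one-third of a standard deviation, so it is broadly in line with the figure quoted in the article.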

While twins represent a novel way of decomposing teacher effects, they do have their drawbacks. In particular, one might think that having a same-aged sibling creates the possibility of spillovers: if you have a lousy teacher and your twin brother has a great one, wouldn’t you use his notes when doing homework? If true, this will bias downwards the estimated share of variance explained by teacher differences.

The question of generalizing from twin studies to children (or people) in general is an important one, and one that behaviour-geneticists are very aware of. We discuss this in our article. We actually conclude that, in this case, twin cooperation is not likely to be a source of contamination. We do so on the grounds that if some process like this were operating twins would be more alike than otherwise, and as long as the process applied equally to monozygotic and dizygotic pairs, the estimate of what is called “shared environment” would be considerably higher than it actually is (in fact, it’s pretty close to zero in the Australian subsample). Shared environment refers to factors that twins share, like family and school, that influence the trait of interest.

By the way, it is on the grounds of the small, and generally statistically nonsignificant, shared environment effects, where school-level influences would show up, that I’ve been questioning the proposed school league tables. My opinion is that if significant differences do exist between average school scores in early literacy (and our data are silent on this) they are more likely to be about characteristics of the students in the school than about school-level policies and practices.

As with all scientific investigations, this piece of research has limitations, and much more needs to be known about all these issues. Meantime, I’m grateful to Andrew for the chance to discuss them.

(xposted @ Core Economics)

5 Responses to Do Teachers Matter?

Corin says:

July 29, 2009 at 8:05 pm

Andrew, what policy lessons might you draw from this??
Kevin Rennie says:

July 29, 2009 at 8:17 pm

If teachers don’t matter much, it makes me feel better about all the mistakes I made over the years. I used to blame my twin brother for anything that went awry. No kidding.

The larger the class size, the less teacher impact I suspect.

Interesting stuff!

It’s an appropriate week to read Frank McCourt’s ‘Teacher Man’. He shared most teachers’ self doubt about their own effectiveness.
diana kornbrot says:

July 30, 2009 at 6:02 pm

so how different ARE 2 teachers in the same school?
given the known? difference in teacher ability it would be very surprising if the MEAN difference between teacher abilities per twin were more than .5 sd. [this is guesstimate, but true estimation is possible]
% variance is VERY VERY dubious measure when applied in such situations.
of course if the teachers are similar then the home/genetic difference will look large
best

diana
Matt Ryan says:

August 3, 2009 at 10:05 pm

I notice that the cited Hanushek study also came up with about an 8% fixed teacher effect on reading (in the Texas schools study).

I think the twins study (especially the genetic influences) is pretty stunning alongside LSAY and PISA-based studies which struggle to “explain” more than around a quarter of 15 year old student test score variance (as distinct from the young’uns in this study).

But then LSAY/PISA-based studies don’t attempt to crack the ‘inner circle’ of influences on student outcomes of genetics (the focus of Byrne) and individual teachers (the focus of Hanushek, Leigh, Rockoff..). Rather, their focus is on ‘outer circle’ influences surrounding school level and community (SES) characteristics.

I look forward to the published paper.
Kerry Bell says:

August 7, 2009 at 10:52 am

As a high school ‘drop out’ who subsequently undertook adult education and whose career has included a period as the CEO of a number of private education institutes I find the subject of teacher quality fascinating. Yet one nobody on the operational side of education seems willing to seriously take the issue on board!

Certainly the 100 or so teachers under my employ were very willing to accept “poor student quality” as an excuse for poor student outcomes but never at any time seemed willing to accept that their role as teachers should include the ability to, at the very least, improve year on year Grade point averages of weaker students.

Put more clearly my argument was “sure the student may not be top percentile material but that should not be an excuse for them falling backwards under your tutelage”

Not sure what the solution is but as with every profession there certainly seems to be a need to have educators accept responsibility for student outcomes both good and bad. If this can be achieved maybe our next generation will begin to improve in areas such as literacy and math instead of appearing to fall behind their parents in these important skill areas?

Kerry Bell