Vol. 3, No. 2 - August 1998
Measuring Progress Toward Equity in Science and Mathematics Education
By Mary M. Kennedy
Benefits of teacher professional development for students
There
is a much-maligned event in education called the one-shot workshop. This event has been criticized by virtually every teacher who
has ever participated in it and by virtually everyone else even vaguely
interested in improving teaching. Researchers
and policy analysts, critical of the one-shot workshop, have generated a
number of proposals for how continuing education programs for teachers should
be organized, arguing that they be lengthy rather than brief, that teachers
have a role in defining the content rather than having the topics imposed on
them, that the scheduled meetings be interspersed with classroom practice
rather than concentrated into a short period of time, or that they allow
teachers to work in groups, rather than in isolation (Corcoran, 1995;
Goldenberg & Gallimore, 1991; Little, 1993; Loucks-Horsley et al., 1998).
There
is a common-sense appeal to these ideas.
It makes sense that, if you really want to alter teaching practice, you
need more than a 2-hour workshop. But
the ultimate benefits of these recommended changes have seldom been examined.
This
brief examines these contentions by reviewing studies of professional
development that examine benefits to students.
A major finding from this review is that program content,
what is being taught (e.g., management strategies, knowledge of how students
learn specific school subject matter), is an important predictor of benefit to
students. This finding should not
be a surprise, but it is a surprise in light of the literature describing
optimal professional development: This literature does not address content as
much as it addresses program form and structure.
Of the 93 studies found that examined the effectiveness of various
approaches to continuing teacher education in either mathematics or science,
only 12 included evidence of benefits to students.
The paucity of evidence for how these programs ultimately benefitted
students is itself an important finding.
Table 1
Studies Included In This Review
| Citation | Subject Matter Context | Grade Span of Participating Students | Source of Participants | Form and Distribution of Contact Time | Total Contact Hours* | Study Duration In Months* |
|---|---|---|---|---|---|---|
| Category 1: Content Focus is on Teaching Behaviors that Apply Generically to All School Subjects | | | | | | |
| Stallings and Krasavage (1986) | Math | 2-4 | school-wide projects | Distributed workshops | | 16 |
| Stevens and Slavin (1995) | Math | K-6 | school-wide projects | Distributed workshops | | 8 |
| Category 2: Content Focus is on Teaching Behaviors that Apply to a Particular Subject | | | | | | |
| Good, Grouws & Ebmeier (1983) | Math | 4-12 | individual volunteers | 2 @ 1.5 | 3 | 4 |
| Good & Grouws (1979) | Math | 4 | individual volunteers | 2 @ 1.5 | 3 | 4 |
| Mason & Good (1993) | Math | 4-6 | individual volunteers | 3 @ 1.5 | 4.5 | 5 |
| Otto and Schuck (1983) | Science | 8 | individual volunteers | 5 @ variable | 16 | 2.5 |
| Rubin & Norman (1992) | Science | 6-9 | individual volunteers | Univ. course (10 @ 3) | 30 | 3 |
| Lawrenz & McCreath (1988) | Science | 1-8 | individual volunteers | Univ. course (15 @ 3) | 45 | 8 |
| Marek and Methven (1991) | Science | 1-5 | individual volunteers | 4 wk Summer Institute | 100 | 8 |
| Category 3: Content Focus is on Curriculum and Pedagogy Justified by Knowledge of How Students Learn | | | | | | |
| Cobb et al (1991) | Math | 2 | individual volunteers | 1 wk Sum. Inst. + Distributed | 150 | 8 |
| Wood and Sellers (1996) | Math | 2-3 | individual volunteers | 1 wk Sum. Inst. + Distributed | 150 | 16 |
| Category 4: Content Focus is on How Students Learn and How to Assess Student Learning | | | | | | |
| Carpenter et al (1989) | Math | 1 | individual volunteers | 4 wk Summer Institute | 80 | |
Table 1 groups studies according to program content.[2]
The four categories include, respectively:
• Programs that prescribe a set of teaching behaviors that are expected to apply
generically to all school subjects.
These behaviors might result from process-product research or might
include things like cooperative grouping.
In either case, the methods are expected to be equally
effective across school subjects.
Technically, category 1 programs are not aimed specifically at mathematics or science
education but instead offer a set of ideas that are presumed to be applicable
to all school subjects. However, because such programs constitute a large
fraction of professional development, and because such programs typically
include mathematics test scores in their portfolio of outcomes, two studies
which illustrate this line of work are included.
• Programs that prescribe a set of teaching behaviors that seem generic, but are
proffered as applying to one particular school subject, such as mathematics or
science. Though presented in the context of a particular subject, the
behaviors themselves have a generic quality to them, in that they are expected
to be generally applicable across all topics in that subject;
• Programs that provide general guidance on both curriculum and pedagogy for teaching a
particular subject, and that justify their recommended practices with
references to knowledge about how students learn this subject; and
• Programs that provide knowledge about how students learn particular subject matter
but do not provide specific guidance on the practices that should be used to
teach that subject.
The
programs being examined in these studies also differ along many of the
dimensions that reformers care about: duration, intensity, focus on individual
teachers versus school-wide focus and so forth. One difference that also needs
to be attended to, however, is the duration of the study itself.
Some of the studies followed students for an entire school year or
longer, while others followed them for only a semester or less.
Longer study durations can reduce apparent program effects because they
increase the likelihood that other events (staffing changes, fire drills or
other traumas, other policy changes, etc.) will disrupt program influences.
Thus as we examine the findings from these studies, we need to be wary
of findings that are based on short-term studies: Their program effects may
appear larger than those of longer-term studies, not because of differences in
program quality but instead because of differences in the length of the study.
One
difference among these categories is especially important, and that is their
tacit model for how they expect their programs
to eventually influence student achievement.
Underlying these different approaches to continuing professional
education are different assumptions about the path between the program and its
eventual effects on student learning.
Figure 1 illustrates these differing sets of assumptions.
Programs in categories 1 and 2 expect their programs first to change
teacher behaviors, and expect that
these behavioral changes will, in turn, lead to student learning.
Those in categories 3 and 4, on the other hand, expect their programs
to first change teacher knowledge;
they tend to be relatively less prescriptive about teaching practices.
The category 3 program provides teachers with knowledge about how
students learn mathematics, with some curriculum materials, and with some
ideas about new practices that will better promote student learning.
The program in category 4 focuses even more narrowly on teacher
knowledge, specifically knowledge of how students learn particular
mathematical ideas. These program
developers do, of course, expect teaching practice to change, but instead of
prescribing the details of the new practice, they assume that changes in
teacher knowledge will stimulate teachers to devise their own new teaching
practices which will, in turn, lead to student learning.
These four categories of program content, then, reflect a continuum from more prescriptive to more discretionary, and from more focused on behavior to more focused on ideas.
Programs Aimed at Improving Student Learning in Mathematics

Figure 1: Three Paths to Student Learning
For
both of the illustrative studies in category 1, contact time was extensive and
distributed throughout the school year, teachers received in-class
visitations, and the programs worked with whole schools rather than individual
volunteers. Thus, these programs represent the kind of professional
development that has been recommended.
The
category 2 studies focusing on mathematics consist entirely of programs
sponsored by Tom Good and his colleagues (Good & Grouws, 1979; Good,
Grouws, & Ebmeier, 1983; Mason & Good, 1993) and all are variations of
the Missouri Mathematics Model. These programs are typically very brief, consisting of just two 1½-hour
sessions during which the specific recommended teaching behaviors and
their rationales are explained. Teachers
also receive a manual with more detailed discussion of the Missouri
Mathematics Model.
The
programs in categories 3 and 4 differ considerably from those in category 2,
and are similar to one another in their theoretical orientations.
Both are interested in student cognition, both assume some form of
constructivist theory of learning, and both are interested in increasing
teachers’ attention to problem solving and reasoning in place of recall of
computational procedures. Both category 3
studies focus on a single program, the Problem-Centered Mathematics
Program. This program provides
teachers with knowledge about student learning and thinking in mathematics,
gives them mathematics problems that are designed to be challenging for
students at the grade level they teach, and gives them a class discussion
format that encourages thoughtful engagement with these problems. There is just one study in category 4. It examines the Cognitively Guided Instruction program, which
is similar to the Problem-Centered Mathematics Program in its orientation to
mathematics, but focused more on the particular mathematical content that
students learn in the relevant grade levels and on the particular kinds of
difficulties they are likely to have in learning this content. It does not define what teachers should do with this
knowledge.
Table
2 shows the size of program effects on student achievement in mathematics that
were found in each of these categories of studies.
Each number indicates the size of the program effect in standardized
units relative to a comparison group.
With the exception of category 1 programs, which worked with whole
schools, all studies involved teachers who volunteered to participate and who
were randomly assigned to experimental and comparison groups.
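As a rough illustration of what "standardized units relative to a comparison group" means, the sketch below computes a standardized mean difference: the gap between the treatment and comparison group means, divided by a standard deviation. The brief does not say which standard deviation the individual studies used, so the pooled standard deviation here is an assumption, and the function name and the scores are hypothetical, invented only for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def standardized_effect_size(treatment, comparison):
    """Standardized mean difference between two groups of scores.

    Assumption: the difference in means is divided by the pooled
    standard deviation; the studies reviewed may have used another
    denominator (for example, the comparison group's SD alone).
    """
    n_t, n_c = len(treatment), len(comparison)
    pooled_variance = (
        (n_t - 1) * stdev(treatment) ** 2 + (n_c - 1) * stdev(comparison) ** 2
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(comparison)) / sqrt(pooled_variance)


# Hypothetical test scores, not taken from any study in this review.
treatment_scores = [52, 55, 58, 61, 64]
comparison_scores = [50, 53, 56, 59, 62]

effect = standardized_effect_size(treatment_scores, comparison_scores)
print(round(effect, 2))
```

With these invented scores the treatment group averages 2 points higher and the pooled standard deviation is about 4.74, so the effect size is roughly .42. A negative value, such as the -.14 reported in Table 2, means the comparison group outscored the treatment group.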
Table 2
Average Standardized Effect Sizes Achieved in Mathematics Studies
| | Basic Skills | Problem Solving | Attitude toward Math |
|---|---|---|---|
| Category 1 | -.14 | .10 | |
| Category 2 | .17 | | |
| Category 3 | | | |
Researchers
tended to measure three types of effects: basic skills, advanced reasoning,
and attitudes toward the subject. Basic
skills were generally assessed with traditional standardized achievement
tests, and researchers devised their own procedures for assessing advanced
reasoning and attitudes. Table 2
suggests that programs in categories 3 and 4 tend to demonstrate greater gains
in reasoning and problem solving as well as comparable or greater gains in
basic skills. Even in basic
skills, the smallest program effects were in category 1 and the largest appear
in category 4. This pattern of
outcomes suggests that the content of programs
does indeed make a difference, and that programs that focus on subject matter
knowledge and on student learning of particular subject matter are likely to
have larger positive benefits for student learning than are programs that
focus mainly on teaching behaviors. This pattern is particularly striking in
light of the fact that the two programs in category 1 more closely approximate
the ideal in terms of form and structure than do the programs in categories 3
and 4.
Why
do the category 3 & 4 programs have greater effects on students?
Several hypotheses have been suggested.
One early hypothesis was that teachers in categories 1 and 2 could not
improve their mathematics teaching because they did not have adequate subject
matter knowledge. However, the
more successful programs in this review were not providing subject matter
knowledge per se, but rather knowledge about how
students learn subject matter knowledge.
No doubt teachers acquired some subject matter knowledge along the way
in these programs, but this was not the central focus of programs in either
category 3 or category 4.
Another
hypothesis is that by giving teachers a greater understanding of how students
learn, programs in categories 3 and 4 enable teachers to continue to develop
and refine their own practices. That
is, it is the lack of
prescriptiveness that makes this knowledge valuable.
In contrast, the Madeline Hunter program (category 1) and the Missouri
Mathematics Model (category 2) both prescribe virtually invariant daily
routines. Though not so rigid, the Problem-Centered Mathematics Program
(category 3) also has a recommended pattern for classroom activities and a
recommended set of learning activities. The
Cognitively Guided Instruction program provided teachers with the least
amount of specific information as to what they should do in their classrooms
and with the most specific
information about the mathematics content they would be teaching and how
students learn that content.
Programs Aimed at Improving Student Learning in Science
The
four science studies that provided student outcome data fell entirely into
category 2: They claim to offer teachers techniques that are uniquely suited
to science teaching, but the techniques themselves still have a generic
character, in that they do not depend on the particular science content being
taught. For instance, Rubin &
Norman taught teachers to model particular science processes such as
generating hypotheses, identifying and controlling variables, and defining
things operationally. During
their program, they used generic
lesson formats to train teachers in how to model each of these skills.
Modeling the skill of “identifying and controlling variables”
consists of asking aloud such questions as, “What is the manipulated
variable in this experimental situation?”
However,
the category 2 studies in science reflect two different models of teaching.
To reflect this difference in program content, Table 3 provides two
sub-groupings within its category 2 programs.
Table 3
Standardized Effect Sizes Attained in Each Science Study
| | Basic Skills | Problem Solving | Attitude toward Math |
|---|---|---|---|
| Category 2: Focus on Behaviors that Apply to this Particular Subject | | | |
| (a) Modeling as a Teaching Strategy | .71 | | |
| (b) Learning Cycle as a Teaching Strategy | | | |
The
effects shown in Table 3 are larger than their counterparts in Table 2, a
difference that probably reflects greater alignment between instructional
content and assessment content. Because
science curricula in American schools are less standardized than mathematics
curricula, and because science content is not normally included in
standardized achievement tests, science researchers are more likely to devise
their own curriculum materials and their own outcome measures. This was the
case in these studies. Consequently,
there is likely to be a greater articulation between the content taught in
participating “treatment” classrooms and the content assessed by the
science researchers than is the case in mathematics programs.
Like
Table 2, Table 3 appears to suggest that program content matters. It suggests that programs focusing on scientific processes
had greater effects than those focusing on the learning cycle.
However, the two studies that taught teachers to model scientific
processes were extremely brief, extending only 2½ and 3 months,
respectively, whereas the two studies that taught teachers the learning cycle
were full-year studies. Consequently,
differences in effects that appear to reflect program content differences
could be a function of study duration rather than program content.
The Relevance of Program Form and Structure

The
programs examined in this small body of research represent a variety of
program structures, and this variety enables us to examine the merits of
several hypotheses about critical features of continuing professional
education. The patterns suggest
that program content is a central predictor of benefit to students.
They also suggest that many other program dimensions are less reliable
producers of benefits for students.
Briefly, in this small sample of studies, we can conclude that:
• Differences in total contact hours were unrelated to student outcomes.
Programs in category 1 provided far more contact hours than programs in
category 2, and yet had smaller effects on student learning.
Similarly, the category 3 program provided more contact hours than the
category 4 program did, and yet did not yield a noticeable advantage for students.
• Evidence was mixed for the benefits of distributed time. The studies
in mathematics did not support this hypothesis, for the mathematics program
with the most substantial overall influences on student learning consisted of
a summer institute with no distributed seminars during the next academic year.
Conversely, the one program that demonstrated negative
effects on student learning, the Madeline Hunter program studied by Stallings
and Krasavage, provided both seminars and in-class visitations throughout the
school year. Studies in science, on the other hand, offer some support:
One of the studies that focused on the learning cycle provided a concentrated
summer institute, while the other provided a university course with sessions
distributed across a full school semester.
The distributed program appeared to produce greater benefits to
students than did the concentrated summer institute.
• None of the programs that provided in-class visitations produced
noticeably greater benefits to student learning.
• The fact that the category 1 programs, those working with whole
schools, demonstrated the smallest influences on student learning among
these studies suggests that providing services to whole school staffs may not
be the most important feature of continuing professional education for
teachers. However, it is likely that whole school programs involve at
least some teachers who did not volunteer to participate, and this fact may
reduce the apparent program benefits.
Based on the studies reviewed here, a strong case can be made for attending more to
the content of continuing professional education and for attending less to the
structural and organizational features of such programs.
In these studies, programs whose content focused mainly on teachers’
behaviors demonstrated smaller influences on student learning than did
programs whose content focused on how students learn particular subject
matter. These more successful
professional development programs were not simply courses in mathematics or
science, but instead were about what to teach and how
students learn that subject matter. Cohen
and Hill (1998), in their study of California mathematics reform, also find
that the content of professional development is important.
The programs in categories 3 and 4 were very specific in their focus.
They did not address generic learning, but instead addressed the
learning of particular mathematical ideas.
An
equally important finding from this review is the lack
of clear benefit of several popular structural program features.
The programs reviewed here differed in the total number of contact
hours they provided teachers, in whether or how that time was distributed, in
whether that time included in-class visitations, and in whether teachers
participated as members of whole schools or as individuals.
The reason for the lack of clear benefit for these program dimensions
is likely related to the important role of program content:
A program whose content is not valuable will not be improved by
increasing the number of contact hours, distributing contact hours over time,
providing in-class visits, and so forth.
Structural features alone provide no guarantee of improved teacher
learning or of eventual benefit to students.
What is still unclear, however, is whether,
given important content, these structural features of programs might
further enhance the program's benefit to students.
The central message from these studies, though, is to attend to content
first, before attending to structure.
While
the findings presented here suggest that the focus of professional
development, content, should be attended to first before form and structure,
they also suggest that effective professional development in mathematics and
science treats teachers as professionals.
Reform advocates have argued that teachers will profit more from
knowledge and insights which they can develop in their own ways than from
prescriptions that give them little practical leeway, and the pattern of
program effects shown here suggests that these reformers are right.
[1] This Brief is a summary of
material in a research monograph by Mary Kennedy (1998), Form
and substance in inservice teacher education (Research Monograph No. 13).
Madison: University of Wisconsin–Madison, National Institute for Science
Education.
[2] References for the 12 studies in Table 1 are provided at the end of this Brief.
Studies Reviewed