Empirical Evidence on Achievement
Introduction
Some critics of American public education would have us believe that today's
students are far less knowledgeable and skilled than students of twenty, thirty,
or forty years ago. The "evidence" to support this conclusion is
largely anecdotal, based on limited personal experiences, or is a consequence of
selective interpretation of test scores. The reality is that there are no
empirical data to support such a bleak picture of America's students or public
schools.
This does not mean that the public schools are without weaknesses. There
are serious problems which need to be addressed; however, there is no evidence
to suggest that the public schools as a group are as bad as some critics
suggest.
For example, consider student performance on the Iowa Test of Basic Skills
and the Iowa Test of Educational Development, two widely used
commercially-developed, norm-referenced tests. Scores on these tests, which
can be tracked all the way back to the 1930's, were at record highs in the
1990's (Bracey, 1995).
- 68% of teachers would like to see greater use of instructional techniques
which focus on critical thinking and problem-solving, with less emphasis given
to mastery of content.
- 95% of teachers would like kindergarten through 3rd grade classes to
have no more than 20 students.
- 62% of teachers would like each teacher to have a ten-minute break each
morning and afternoon.
Graduation rates for Wisconsin over the past half century also show
remarkable changes. In 1950, only one-third of Wisconsin's adults had a high
school diploma or more. By 1990, nearly 80% of adults in Wisconsin had at least
a high school diploma. Likewise, in 1950, only 12.9% of Wisconsin's adults,
aged 25 or more, had formal education beyond high school. By 1990, this figure
had increased more than three-fold, to 41.5%.
Berliner and Biddle in The Manufactured Crisis (1995) challenge
those who argue that today's students are not as intelligent or able as students
of the past. They offer the following points:
- ". . . since 1932 the mean IQ for white Americans aged two to
seventy-five has risen about .3 points per year" (p. 43). Scores for
other groups are not available.
- "In the United States, today's youth probably average about 15 IQ
points higher than did their grandparents and 7.5 points higher than did their
parents on the Stanford-Binet and Wechsler tests" (p. 43).
- "Or to put this another way, the number of students expected to have
IQ's of 130 or higher--the typical cut-off point for defining giftedness in many
school districts throughout the nation--is now about seven times greater than it
was for the generation now retiring from leadership positions in the country and
often complaining about the poor performance of today's youth. Now that is
something to contemplate" (p. 44).
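The "seven times" figure can be checked with a short calculation. The sketch below assumes that IQ scores are normally distributed with a standard deviation of 15, and that the older generation averaged 100 on the same scale on which today's youth average 115 (the 15-point gain quoted above). Neither assumption is stated by Berliner and Biddle, and the function name is illustrative only.

```python
import math

def share_at_or_above(cutoff, mean, sd=15.0):
    """Fraction of a normal(mean, sd) population scoring at or above cutoff."""
    z = (cutoff - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))

# Assumption: the older generation is centered at 100 and today's youth at 115
# when both groups are scored against the same (older) norms.
older = share_at_or_above(130, mean=100)
today = share_at_or_above(130, mean=115)
ratio = today / older

print(f"older: {older:.2%}  today: {today:.2%}  ratio: {ratio:.1f}")
```

Under these assumptions the ratio comes out close to seven, which is in line with the quoted claim.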
This section of the paper briefly summarizes six kinds of information
regarding student achievement and competency: (1) National Assessment of
Educational Progress (NAEP), (2) SAT scores, (3) ACT scores, (4) Wisconsin's
high school graduation rate, (5) the Wisconsin Student Assessment System, and
(6) International Assessments. There also are a few comments about the Sandia
Report.
1. National Assessment Results
Since 1969, the National Assessment of Educational Progress (NAEP) has
tested national samples of students ages nine, thirteen, and seventeen. In
general, the scores of students in reading and mathematics have been stable over
the past two decades, whereas scores in science are down slightly.
Berliner and Biddle note that ". . . evidence from the NAEP also does
not confirm the myth of a recent decline in American student achievement.
Instead, it indicates a general pattern of stable achievement combined with
modest growth in achievement among students from minority groups and from 'less
advantaged' backgrounds" (pp. 25-26).
There is no evidence to suggest that students of twenty, thirty, or
forty years ago were any more knowledgeable or skilled. The strengths and
weaknesses of today's students are essentially the same as those of their
parents and grandparents.
Scores of Wisconsin's students on National Assessment Tests have been
very positive:
· In 1994, Wisconsin's fourth graders ranked third among the
44 states and other jurisdictions which participated in the National Assessment
of Educational Progress assessment in reading. The average score for
Wisconsin's fourth graders was 225. Scores ranged from Maine's high of 229 to
Guam's low of 183. The national average was 213.
· In 1992, Wisconsin's fourth grade students tied for second,
while eighth graders tied for fourth on the NAEP mathematics assessment.
· In 1992, Wisconsin's fourth graders ranked sixth on the NAEP
reading assessment.
· In 1990, Wisconsin's eighth grade students ranked sixth in
the first state-by-state comparisons of mathematics performance on the
National Assessment of Educational Progress.
2. SAT Scores
The Scholastic Aptitude Test (now called the Scholastic Assessment Test) was
originally normed in 1941 on a population of 10,654 white males who primarily
attended private eastern universities. The test measures student knowledge in
two areas, verbal and mathematical, and is designed to predict academic success
in college. Scores on the SAT are not reported as the number or percent of
correct answers (there are 138 questions), but as a scale score, ranging from
400 to 1,600.
Bracey (1995) points out the following: "Ever
since the 1970's, when the College Board sponsored a study of the decline in SAT
scores, the minuscule annual score changes have been front-page, prime-time
news. For the last three years, scores have been edging upward, with the 1995
gains the largest in a decade. When the scores were in decline, the New York
Times and the Washington Post positioned the results on page 1. In 1993 and
1994, the New York Times buried the news of the upticks in scores deep in
Section A, while the Post relegated the outcome to the Metro section, which
contains news of local interest. This year, the Post continued its policy of
placing the SAT results in the Metro Section, while on the morning of the
release the New York Times ignored the story altogether. The Washington Times
did put the story on page 1, but implied the gains occurred because the new SAT
is easier" (p. 153).
During the period from approximately 1963 to 1975 there was a decline in
aggregate SAT scores in the range of 60 to 90 scale points. Many argued that
this decline was proof of a serious and significant deterioration in America's
schools. In reality, this decrease of 60 to 90 points on a 1,200 point
scale represented a drop of approximately 5% in the number of questions answered
correctly.
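The arithmetic behind this comparison can be sketched as follows. The example expresses a score drop as a share of the 1,200-point range between the scale's minimum (400) and maximum (1,600). Treating that share as a proxy for the share of questions answered correctly assumes a roughly linear relationship between scale points and questions, which is a simplification introduced here.

```python
SCALE_MIN, SCALE_MAX = 400, 1600        # combined SAT scale described in the text
scale_range = SCALE_MAX - SCALE_MIN     # 1,200 points

def share_of_range(points):
    """Express a score drop as a fraction of the 1,200-point scale range."""
    return points / scale_range

low = share_of_range(60)    # lower bound of the reported decline
high = share_of_range(90)   # upper bound of the reported decline

print(f"60 points = {low:.1%} of the range, 90 points = {high:.1%} of the range")
```

The lower bound of the decline corresponds to 5% of the range and the upper bound to 7.5%.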
Furthermore, measurement experts who have investigated the drop in SAT
scores have concluded that the most important reason for the decline was that
a greater number of students, especially those with weaker high school
records, began to take the SAT. In short, beginning in the mid-1960's, takers
of the SAT became a less elite population of high school students. In recent
years, more than one million students have taken the SAT annually. Compare
this figure with the 10,654 who originally took the SAT in 1941.
Critics also fail to acknowledge that in recent years SAT scores have
increased. In 1995, for example, SAT scores had their largest increase in a
decade. This growth was largely ignored by the popular media.
Berliner and Biddle make an additional point: "So although critics have trumpeted
the 'alarming' news that aggregate national SAT scores fell during the late
1960's and the early 1970's, this decline indicates nothing about the
performance of American schools. Rather, it signals that students from a
broader range of backgrounds were then getting interested in college, which
should have been cause for celebration, not alarm" (Berliner and Biddle, p.
21).
Some critics now charge that the recent improvements in SAT scores are due
to the fact that the test is easier. Representatives of SAT, however, maintain
that the test has essentially the same difficulty level as in previous years.
In fact, current scores (and those for 1996-97 when a new scale will be used)
will still be "anchored" to the original 1941 performance levels.
Thus, if one feels compelled to compare the performance of today's students with
the original norming population of nearly sixty years ago, he or she will be
able to do so.
SAT Scores in Wisconsin
Wisconsin's students have consistently outscored students throughout the
nation on the SAT over the past two decades. However, a minority of
Wisconsin's graduating seniors take the SAT. In 1995, about 9% of 12th grade
students (4,998) took the SAT. As these figures are considered, keep in mind
the important conclusion by Powell and Steelman (1996). In their study of
state SAT scores, Powell and Steelman report that more than 80% of the
variation in state SAT averages is attributable to the participation rate. That
is, the fewer students tested in a state, the higher SAT scores tend to be.
SAT scores: Wisconsin and the nation, 1975-1995
| Year | Wisconsin Verbal | Wisconsin Math | Wisconsin Total | Nation Verbal | Nation Math | Nation Total |
| --- | --- | --- | --- | --- | --- | --- |
| 1975 | 492 | 544 | 1036 | 434 | 472 | 906 |
| 1980 | 472 | 533 | 1005 | 424 | 466 | 890 |
| 1985 | 478 | 536 | 1014 | 431 | 475 | 906 |
| 1990 | 466 | 514 | 980 | 422 | 474 | 896 |
| 1995 | 501 | 572 | 1073 | 428 | 482 | 910 |
3. ACT Scores
Wisconsin has placed first or tied for first on the ACT (American College
Test) for the past eleven years. Overall, the ACT is the predominant college
admissions test in 28 states, including Wisconsin. Scores are reported on a
scale, ranging from 1 to 36. Approximately two-thirds (64%, or 37,194) of
Wisconsin's graduating seniors took the ACT in 1995.
ACT scores: Wisconsin and the nation, 1986, 1990 and 1995
| Year | Wisconsin | Nation |
| --- | --- | --- |
| 1986 | 492 | 434 |
| 1990 | 472 | 424 |
| 1995 | 478 | 431 |
4. Wisconsin's High School Graduation Rate
Wisconsin's dropout rate has declined steadily over the past decade. In
1985 the annual dropout rate was 3.65%; in 1995 it declined to its lowest level
ever--2.63%. (Note: A dropout rate of 2.63% means that 2.63% of the
state's students in grades 9-12 dropped out of school during the school year.
This percent represents approximately 6,800 students).
Percent of Wisconsin students who dropped out of school, 1985-1995
| Year | Percent |
| --- | --- |
| 1985 | 3.65% |
| 1986 | 3.49% |
| 1987 | 3.24% |
| 1988 | 3.30% |
| 1989 | 3.11% |
| 1990 | 3.13% |
| 1991 | 3.26% |
| 1992 | 3.00% |
| 1993 | 3.15% |
| 1994 | 2.93% |
| 1995 | 2.63% |
This means that at the current time about 87-88% of all 9th grade students
graduate from high school "on time." Others graduate after their
original class (a few return to school; others pass the GED).
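One plausible way to relate annual dropout rates to an "on time" graduation figure is to compound four consecutive annual rates for a single entering class. The sketch below does this with the 1992 through 1995 rates from the table above. It assumes the state-wide annual rate applies equally to each grade and ignores transfers and late graduates; the method actually used to arrive at the 87-88% figure is not given in the source.

```python
# Annual dropout rates (percent) for 1992 through 1995, taken from the table.
annual_dropout_pct = [3.00, 3.15, 2.93, 2.63]

def cohort_retention(rates_pct):
    """Share of an entering class still enrolled after applying each annual rate in turn."""
    remaining = 1.0
    for rate in rates_pct:
        remaining *= 1.0 - rate / 100.0
    return remaining

retention = cohort_retention(annual_dropout_pct)
print(f"estimated four-year retention: {retention:.1%}")
```

The compounded estimate lands in the high 80s, the same neighborhood as the figure quoted above.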
National graduation rates are considerably lower, as shown in the table
below.
National graduation rates for selected years
| Year | Percent who graduate |
| --- | --- |
| 1929-30 | 29% |
| 1939-40 | 50% |
| 1949-50 | 59% |
| 1959-60 | 70% |
| 1969-70 | 77% |
| 1979-80 | 71% |
| 1989-90 | 72% |
| 1994-95 | 73% |
Note: We were not able to obtain graduation rates for Wisconsin for earlier
years. In addition, graduation rates of forty and fifty years ago, however
calculated, are somewhat suspect simply because the "compulsory attendance
laws" of this period were not enforced and/or did not require school
attendance beyond the eighth grade.
5. The Wisconsin Student Assessment System
Beginning with the 1993-94 school year, the DPI's Wisconsin Student
Assessment System (WSAS) has tested eighth and tenth grade students in language,
reading, mathematics, science, social studies, and writing. These tests are
known as the Knowledge And Concepts Examinations.
The latest assessment of students (October, 1995) included 30
multiple-choice questions in each of the areas listed above; also included
were two writing samples and a survey of students' career interests and
educational plans.
Student performance over the three years of testing has improved
steadily. For example, the average Grand Composite Scores (calculated by
adding the scores of each specific subtest) are 162 for both eighth and tenth
grade students during the 1995-96 school year. The eighth and tenth grade
average Grand Composite Scores in 1993-94 were 155 and 154, respectively.
Average Grand Composite Scores, 1993-94 to 1995-96*
| Year | Eighth | Tenth |
| --- | --- | --- |
| 1993-94 | 155 | 154 |
| 1994-95 | 159 | 158 |
| 1995-96 | 162 | 162 |
Student performance on the state assessment tests is not the same for
all subpopulations. For example, females outperform males, while among the
various ethnic groups, white students have the highest levels of performance.
Average Grand Composite Scores by ethnicity and gender, 1995-96
| Group | Eighth | Tenth |
| --- | --- | --- |
| All students | 159 | 160 |
| Native-American | 134 | 135 |
| Asian-American | 146 | 152 |
| African-American | 119 | 122 |
| Hispanic-American | 134 | 138 |
| White | 163 | 163 |
| Mixed ethnic | 153 | 155 |
| | | |
| Females | 161 | 162 |
| Males | 156 | 158 |
National Percentile Scores
National comparisons also are available for the Knowledge and Concepts
Examinations. This makes it possible to compare the performance of students in
Wisconsin with students throughout the country.
Except for writing, the 1995-96 national percentile scores for Wisconsin
students were above the national averages on all of the Knowledge and Concepts
Examinations. Performance in writing is mixed; tenth graders compare favorably
with the national average, whereas eighth grade students score slightly below
the national average.
Wisconsin's average percentile scores for eighth and tenth grade students, 1995-96 (National Percentile Scores*)
| Subject | Eighth | Tenth |
| --- | --- | --- |
| Reading | 59 | 66 |
| Mathematics | 72 | 73 |
| Language | 56 | 62 |
| Science | 65 | 68 |
| Social Studies | 64 | 64 |
| Battery Total | 70 | 74 |
| Imaginative Writing | 49 | 62 |
| Expressive Writing | 45 | 56 |
*The national average is the 50th percentile.
6. International Assessments
Some critics of American education often argue that the results of domestic
assessments are no longer relevant because the United States is now part of a
highly competitive, global economy. They call attention to the relatively poor
performance of U.S. students on international assessments in mathematics and
science. The same critics usually fail to mention that the performance of
U.S. students in reading has been very favorable.
There are at least three problems associated with international
assessments that need to be understood by anyone who uses or reports the
results: (1) the selection of samples, (2) the practice of rank-ordering
countries, and (3) the use of a single statistic to describe a country's quality
of education.
The Selection of Samples
Rotberg (1990) alerts us to the fact that in many international
assessments the performance of representative, national samples of U.S. students
has been compared with elite populations of students in other countries. She
also points out that in some of the assessments of 12th grade students, it was
found that countries which test a greater percentage of their twelfth grade
students have the weakest overall performance. Conversely, countries which
tested smaller percentages did the best. For example, on an eighth-grade
mathematics assessment, Japan was top-ranked, whereas Hong Kong was in the
middle. By 12th grade, however, Hong Kong was top-ranked, and Japan was second.
This apparent "decline" in the performance of Japanese students was a
consequence of the difference in the size of the populations from which the
samples were drawn. A much smaller and more select group of students was
tested in Hong Kong (only 3% of the students take mathematics in grade 12),
compared with a much larger group of students in Japan.
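The size of this selection effect can be illustrated with a stylized calculation. The sketch below assumes achievement is normally distributed and that the tested group consists strictly of the highest-scoring fraction of students; both are simplifying assumptions made here, not claims from Rotberg. The 3% figure comes from the text, while the 50% figure for a broader group is purely hypothetical.

```python
from statistics import NormalDist

def mean_of_top_fraction(fraction):
    """Mean score, in standard-deviation units, of the top `fraction` of a standard normal population."""
    std_normal = NormalDist()
    cutoff = std_normal.inv_cdf(1.0 - fraction)
    # For a standard normal, E[X | X > c] = pdf(c) / P(X > c).
    return std_normal.pdf(cutoff) / fraction

select_group = mean_of_top_fraction(0.03)   # 3% tested, as reported for Hong Kong
broad_group = mean_of_top_fraction(0.50)    # hypothetical broader tested group

print(f"top 3%: {select_group:.2f} SD above the mean")
print(f"top 50%: {broad_group:.2f} SD above the mean")
```

Even if the two underlying student populations were identical, the more select sample would post a much higher average, which is the pattern described above.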
There have been so many problems associated with testing senior high school
level students that there have been no international assessments in mathematics
and science at the secondary level since 1987 (Bracey, 1996, p. 5).
Other studies comparing the performance of smaller groups of younger
students also have created headlines about the poor relative performance of
U.S. students. In a 1996 article written for Educational Researcher,
Bracey is especially critical of the research by Stevenson, Stigler, and others
who have compared elementary U.S. students with students from China, Japan,
and Taiwan. In general, these studies suggest that the best U.S. elementary
school students would be only average students in Japan or China.
Bracey offers the following comments about these small scale studies:
"The various articles (studies) do not reveal how the schools were
selected or how representative they are. It would be naive in the extreme to
believe that a nation as closed, a nation as obsessed with its public image as
The People's Republic of China . . . would give an American researcher free
access to a random sample of schools" (p. 7).
Furthermore, "over 20% of the Chicago children did not speak English at
home. The Chicago sample was thus not a representative sample of the United
States, nor was it comparable to the Beijing sample on many important
demographic variables. The Chicago sample is heavily weighted with variables
associated with low achievement" (p. 8).
Rank-ordering of Countries
In addition to problems associated with sampling, the results themselves
are frequently misunderstood. Whenever results are reported by the media,
average scores for an entire country are reduced to a single statistic--a rank
among all countries. Average scores for participating countries tend to be
closely bunched, but when countries are ranked from top to bottom, the small
differences in scores tend to become large differences in ranks. For example,
if the scores of U.S. nine- and thirteen-year-olds on the 1992 Second
International Assessment of Educational Progress had been only slightly
different, their ranks would have varied considerably. "If U.S.
13-year-olds had scored 72% correct in science, instead of 67, they would have
finished 5th rather than 13th. Similarly, if the third-ranked 9-year-olds had
scored 60 instead of 65, they would have finished 12th. Most countries score
close together such that small differences in scores make large differences in
ranks" (Bracey, 1996, p. 6).
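The sensitivity of ranks to small score changes can be shown with a toy example. The scores below are hypothetical, invented for illustration and constructed so that a five-point gain reproduces the kind of jump Bracey describes; they are not the actual assessment results, and the country labels are placeholders.

```python
# Hypothetical percent-correct scores for 15 countries (illustrative only).
scores = {
    "A": 78, "B": 76, "C": 74, "D": 73, "E": 71, "F": 70, "G": 70,
    "H": 69, "I": 69, "J": 68, "K": 68, "L": 68, "M": 67, "N": 64, "O": 57,
}

def rank_of(country, table):
    """Competition rank: one plus the number of countries with a strictly higher score."""
    return 1 + sum(1 for s in table.values() if s > table[country])

before = rank_of("M", scores)                  # country M at 67% correct
after = rank_of("M", {**scores, "M": 72})      # the same country at 72% correct

print(f"rank at 67%: {before}, rank at 72%: {after}")
```

Because most of the hypothetical scores sit within a few points of one another, a five-point change moves country M past eight others.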
The Use of a Single Statistic
Use of a single score (a ranking) to summarize the entire U.S. system of
education is simplistic and ignores the variation which exists among the fifty
states, as well as the differences found among school systems within each state.
This is especially critical in a country such as the United States which is
extremely diverse and has great variation in the quality of its public schools.
For example, in the 1992 international assessment of mathematics, U.S.
13-year-olds ranked 13th among 15 nations. However, if other reporting
categories are used, a far different picture emerges. In this instance,
Asian-American students scored the highest on this assessment, while students
from Iowa and North Dakota tied with Korea for third.
- Asian students, U.S. schools (287)
- Taiwan (285)
- Korea, Iowa, North Dakota (283)
- Advantaged urban students, U.S. (283)
- White students, U.S. schools (277)
- Hungary, Wisconsin (277)
In contrast, the lowest ranked categories were as follows:
- Jordan (246)
- Mississippi (246)
- Hispanic students, U.S. schools (245)
- Disadvantaged urban students, U.S. (239)
- Black students, U.S. (236)
- District of Columbia (234)
7. The Sandia Report
In February, 1990, at the request of the Bush Administration, the Strategic
Studies Center at the Sandia National Laboratory in New Mexico began a
comprehensive review of the effectiveness of K-12 education in the United
States. The request was apparently made in the belief that the Laboratory would
find a system of failing K-12 schools, thus providing a rationale for a national
school voucher system.
The researchers at Sandia gave a positive evaluation of U.S. public
education in April, 1992: "Our most detailed analyses to date have focused
on popular measures used to discuss the status of education in America. We
looked at data over time to put performance of the current system in proper
perspective. To our surprise, on nearly every measure we found steady or
slightly improving trends" (Carson, Huelskamp, and Woodall, p. 259).
Conclusions
The general conclusions of the Sandia Report were as follows:
- Educational data are generally incomplete and sometimes inaccurate.
- The data that are available indicate serious problems in American
education. However, they do not support popular headlines nor indicate
system-wide failure. The educational system has never performed better.
- The evidence of decline used to justify system-wide reform is based on
misinterpretations or misrepresentations of the data.
- Based on these conclusions, we believe that the national debate is not
focused on the most pressing problems.
Reforms
The Sandia Report also identified the serious educational problems facing
the nation, concluding that "these challenges do not call for a system-wide
revolution." Among the suggested reforms were the following:
- Improving the performance of disadvantaged students.
- Meeting the educational and training needs of immigrants.
- Upgrading the quality of educational data available to policymakers.
- Improving the status of K-12 educators.
Barriers to Improvement
Finally, the Sandia Report identified the impediments to educational
improvement:
- The crisis rhetoric which hinders reform by claiming system-wide failure.
- The misuse of simplistic measures with dubious value, e.g., the use of
declining average SAT scores or unfavorable international comparisons.
- The preoccupation with the link to economic competitiveness.
- The excessive focus on projected shortfalls in technical expertise,
which distracts our attention from the basic reforms identified above.