Primer: Education Issues - Empirical Evidence on Achievement
Introduction
Some critics of American public education would have
us believe that today's students are far less knowledgeable and skilled
than students of twenty, thirty, or forty years ago. The "evidence"
to support this conclusion is largely anecdotal, based on limited personal
experiences, or is a consequence of selective interpretation of test scores.
The reality is that there are no empirical data to support such a bleak
picture of America's students or public schools.
This does not mean that the public schools are without
weaknesses. There are serious problems which need to be addressed; however,
there is no evidence to suggest that the public schools as a group are
as bad as some critics suggest.
For example, consider student performance on the Iowa
Test of Basic Skills and the Iowa Test of Educational Development, two
widely used commercially-developed, norm-referenced tests. Scores on these
tests, which can be tracked all the way back to the 1930's, were at record
highs in the 1990's (Bracey, 1995).
- 68% of teachers would like to see greater use of instructional techniques
which focus on critical thinking and problem-solving, with less emphasis
given to mastery of content.
- 95% of teachers would like kindergarten through 3rd grade classes
to have no more than 20 students.
- 62% of teachers would like each teacher to have a ten-minute break
each morning and afternoon.
Graduation rates for Wisconsin over the past half century
also show remarkable changes. In 1950, only one-third of Wisconsin's adults
had a high school diploma or more. By 1990, nearly 80% of adults in Wisconsin
had at least a high school diploma. Likewise, in 1950, only 12.9% of Wisconsin's
adults, aged 25 or more, had formal education beyond high school. By 1990,
this figure had increased more than three-fold, to 41.5%.
Berliner and Biddle in The Manufactured Crisis
(1995) challenge those who argue that today's students are not as intelligent
or able as students of the past. They offer the following points:
- ". . . since 1932 the mean IQ for white Americans aged two to
seventy-five has risen about .3 points per year" (p. 43). Scores
for other groups are not available.
- "In the United States, today's youth probably average about
15 IQ points higher than did their grandparents and 7.5 points higher
than did their parents on the Stanford-Binet and Wechsler tests"
(p. 43).
- "Or to put this another way, the number of students expected
to have IQ's of 130 or higher--the typical cut-off point for defining
giftedness in many school districts throughout the nation--is now about
seven times greater than it was for the generation now retiring from
leadership positions in the country and often complaining about the
poor performance of today's youth. Now that is something to contemplate"
(p. 44).
This section of the paper briefly summarizes six kinds
of information regarding student achievement and competency: (1) National
Assessment of Educational Progress (NAEP), (2) SAT scores, (3) ACT scores,
(4) Wisconsin's high school graduation rate, (5) the Wisconsin Student
Assessment System, and (6) International Assessments. There also are a
few comments about the Sandia Report.
1. National Assessment Results
Since 1969, the National Assessment of Educational Progress
(NAEP) has tested national samples of students ages nine, thirteen, and
seventeen. In general, the scores of students in reading and mathematics
have been stable over the past two decades, whereas scores in science
are down slightly.
Berliner and Biddle note that ". . . evidence from
the NAEP also does not confirm the myth of a recent decline in American
student achievement. Instead, it indicates a general pattern of stable
achievement combined with modest growth in achievement among students
from minority groups and from 'less advantaged' backgrounds."
| Subject | Age group | Baseline year | Baseline score | 1996 score |
| Science | 17 yr-olds | 1970 | 305 | 296 |
| Science | 13 yr-olds | 1970 | 255 | 256 |
| Science | 9 yr-olds | 1970 | 225 | 230 |
| Mathematics | 17 yr-olds | 1973 | 304 | 307 |
| Mathematics | 13 yr-olds | 1973 | 266 | 274 |
| Mathematics | 9 yr-olds | 1973 | 219 | 231 |
| Reading | 17 yr-olds | 1971 | 285 | 287 |
| Reading | 13 yr-olds | 1971 | 255 | 259 |
| Reading | 9 yr-olds | 1971 | 208 | 212 |
| Writing | 17 yr-olds | 1984 | 290 | 283 |
| Writing | 13 yr-olds | 1984 | 267 | 264 |
| Writing | 9 yr-olds | 1984 | 204 | 207 |
Note: all scores are on a 500-point scale.
There is no evidence to suggest that students of twenty,
thirty, or forty years ago were any more knowledgeable or skilled. The
strengths and weaknesses of today's students are essentially the same
as those of their parents and grandparents.
Scores of Wisconsin's students on National Assessment
Tests have been very positive:
- In 1994, Wisconsin's fourth graders ranked third among the 44 states
and other jurisdictions which participated in the National Assessment
of Educational Progress assessment in reading. The average score for
Wisconsin's fourth graders was 225. Scores ranged from Maine's high
of 229 to Guam's low of 183. The national average was 213.
- In 1992, Wisconsin's fourth grade students tied for second, while
eighth graders tied for fourth on the NAEP mathematics assessment.
- In 1992, Wisconsin's fourth graders ranked sixth on the NAEP reading
assessment.
- In 1990, Wisconsin's eighth grade students ranked sixth in the first
state-by-state comparisons of mathematics performance on the National
Assessment of Educational Progress.
2. SAT Scores
The Scholastic Aptitude Test (now called the Scholastic
Assessment Test) was originally normed in 1941 on a population of 10,654
white males who primarily attended private eastern universities. The test
measures student knowledge in two areas, verbal and mathematical, and
is designed to predict academic success in college. Scores on the SAT
are not reported as the number or percent of correct answers (there are
138 questions), but as a scale score, ranging from 400 to 1,600.
During the period from approximately 1963 to 1975 there
was a decline in aggregate SAT scores in the range of 60 to 90 scale points.
Many argued that this decline was proof of a serious and significant deterioration
in America's schools. In reality, this decrease of 60 to 90 points
on a 1,200-point scale represented a drop of approximately 5% in the number
of questions answered correctly.
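The arithmetic behind that comparison can be sketched as follows. This is only an illustration: it assumes the 1,200-point scale is the distance between the 400 minimum and the 1,600 maximum composite score, and the function name is hypothetical. It computes the share of the scale range that was lost; treating that share as the share of questions answered correctly is the approximation made in the text, not something the code establishes.

```python
# Composite SAT scale described in the text: 400 (minimum) to 1,600 (maximum).
SCALE_MIN = 400
SCALE_MAX = 1600


def share_of_scale(points_dropped: float) -> float:
    """Return a score drop as a fraction of the full scale range."""
    return points_dropped / (SCALE_MAX - SCALE_MIN)


# The decline reported for 1963-1975 was 60 to 90 scale points.
for drop in (60, 90):
    print(f"{drop} points = {share_of_scale(drop):.1%} of the scale range")
```

On this reading, a 60-point drop is 5% of the scale range and a 90-point drop is 7.5%.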
Furthermore, measurement experts who have investigated
the drop in SAT scores have concluded that the most important reason for
the decline was that a greater number of students, especially
those with weaker high school records, began to take the SAT. In short,
beginning in the mid-1960's, takers of the SAT became a less elite population
of high school students. In recent years, more than one million
students have taken the SAT annually. Compare this figure with the 10,654 who
originally took the SAT in 1941.
Critics also fail to acknowledge that in recent years
SAT scores have increased. In 1995, for example, SAT scores had their
largest increase in a decade. This growth was largely ignored by the popular
media.
Bracey makes an additional point: "So although
critics have trumpeted the 'alarming' news that aggregate national SAT
scores fell during the late 1960's and the early 1970's, this decline
indicates nothing about the performance of American schools. Rather, it
signals that students from a broader range of backgrounds were then getting
interested in college, which should have been cause for celebration, not
alarm" (Berliner and Biddle, p. 21).
Some critics now charge that the recent improvements
in SAT scores are due to the fact that the test is easier. Representatives
of SAT, however, maintain that the test has essentially the same difficulty
level as in previous years. In fact, current scores (and those for 1996-97
when a new scale will be used) will still be "anchored" to the
original 1941 performance levels. Thus, if one feels compelled to compare
the performance of today's students with the original norming population
of nearly sixty years ago, he or she will be able to do so.
Wisconsin's students have consistently outscored students
throughout the nation on the SAT over the past two decades. However, a
minority of Wisconsin's graduating seniors take the SAT. In 1995, about
9% of 12th grade students (4,998) took the SAT. As these figures are considered,
keep in mind the important conclusion by Powell and Steelman (1996). In
their study of state SAT scores, Powell and Steelman report that more
than 80% of the variation in state SAT averages is attributable to the
participation rate. That is, the fewer students tested in a state, the
higher SAT scores tend to be.
| Year | Wisconsin Verbal | Wisconsin Math | Wisconsin Total | Nation Verbal | Nation Math | Nation Total |
| 1975 | 492 | 544 | 1036 | 434 | 472 | 906 |
| 1980 | 472 | 533 | 1005 | 424 | 466 | 890 |
| 1985 | 478 | 536 | 1014 | 431 | 475 | 906 |
| 1990 | 466 | 514 | 980 | 422 | 474 | 896 |
| 1995 | 501 | 572 | 1073 | 428 | 482 | 910 |
| 1996* | 577 | 586 | 1163 | 505 | 508 | 1013 |
| 1997 | 579 | 590 | 1169 | 505 | 511 | 1016 |
*Note: a new scale was introduced in 1996.
It also is interesting to note that in Wisconsin students
from public schools tend to score higher on the SAT than do students from
religious and independent private schools. In 1997, the composite scores
were as follows:
| School Type | Verbal | Math | Total |
| Public Schools | 587 | 600 | 1187 |
| Private Independent | 543 | 547 | 1090 |
| Religious | 563 | 571 | 1134 |
3. ACT Scores
Wisconsin has placed first or tied for first on the
ACT (American College Test) for the past eleven years. Overall, the ACT
is the predominant college admissions test in 28 states, including Wisconsin.
Scores are reported on a scale, ranging from 1 to 36. Approximately two-thirds
(64%, or 37,194) of Wisconsin's graduating seniors took the ACT in 1995.
| Year | Wisconsin | Nation |
| 1986 | 22.2 | 20.8 |
| 1990 | 21.8 | 20.6 |
| 1995 | 22.0 | 20.8 |
| 1996 | 22.1 | 20.9 |
| 1997 | 22.3 | 21.0 |
4. Wisconsin's High School Graduation Rate
| Year | Annual dropout rate |
| 1985 | 3.65% |
| 1986 | 3.49% |
| 1987 | 3.24% |
| 1988 | 3.30% |
| 1989 | 3.11% |
| 1990 | 3.13% |
| 1991 | 3.26% |
| 1992 | 3.00% |
| 1993 | 3.15% |
| 1994 | 2.93% |
| 1995 | 2.63% |
| 1996 | 2.40% |
This means that at the current time about 90% of all
9th grade students graduate from high school on time. Others
graduate after their original class (a few return to school; others pass
the GED).
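The step from the percentages in the table to the figure of about 90% can be sketched as a short calculation. This is a rough illustration under stated assumptions, not the method used to produce the figure: it assumes the percentages in the table above are annual dropout rates for grades 9 through 12, that the same rate applies in each of the four years, and that transfers and late graduates can be ignored. The function name is hypothetical.

```python
def on_time_completion(annual_dropout_rate: float, years: int = 4) -> float:
    """Estimate the share of entering 9th graders still enrolled after
    `years` years, assuming a constant annual dropout rate (an assumption)."""
    return (1.0 - annual_dropout_rate) ** years


# 1996 figure from the table above, read as an annual dropout rate of 2.40%.
rate_1996 = 0.0240
print(f"Estimated on-time completion: {on_time_completion(rate_1996):.1%}")
```

Under these assumptions, the 1996 rate gives an estimate of roughly 91%, in line with the "about 90%" stated above.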
National graduation rates are considerably lower, as
shown in the table below.
| Year | Percent who graduate |
| 1929-30 | 29% |
| 1939-40 | 50% |
| 1949-50 | 59% |
| 1959-60 | 70% |
| 1969-70 | 77% |
| 1979-80 | 71% |
| 1989-90 | 72% |
| 1994-95 | 73% |
| 1995-96 | 76% |
We were unable to obtain graduation rates for Wisconsin
for earlier years. In addition, graduation rates of forty and fifty years
ago, however calculated, are somewhat suspect simply because the compulsory
attendance laws of this period were not enforced and/or did not
require school attendance beyond the eighth grade. Note: it is estimated
that 10% of students obtain a high school degree via an alternative route
(e.g., GED).
5. Third Grade Reading Test
Since 1989, Wisconsin's third grade students have
been tested in the area of reading comprehension. Scores are reported
in terms of a minimal standard of proficiency. The standard represents
a level which might best be described as barely passing. It
does not mean that students read at grade level.
The test has three purposes: (1) to allow districts
to evaluate their primary reading programs, (2) to allow for comparisons
of reading performance across schools and districts, and (3) to identify
marginal readers who may need remedial reading help.
The performance standard established in 1989 has been
applied to all subsequent years. In other words, the performance standard
has remained at the same level of difficulty. As would be expected, there
have been slight fluctuations in the performance of students over time.
| Year | Percent meeting the standard |
| 1989 | 87.5% |
| 1990 | 84.2% |
| 1991 | 84.1% |
| 1992 | 87.7% |
| 1993 | 85.5% |
| 1994 | 89.4% |
| 1995 | 88.3% |
| 1996 | 89.7% |
| 1997 | 87.1% |
6. The Wisconsin Student Assessment System
Beginning with the 1993-94 school year, the DPI's
Wisconsin Student Assessment System (WSAS) has tested eighth and tenth
grade public school students in language, reading, mathematics, science,
social studies, and writing. In 1996-97, fourth grade students also were
tested. These tests are known as the Knowledge and Concepts Examinations.
As in previous years, Wisconsin's students continue
to score higher than the national averages on these tests. The national
percentile scores for students in grades four, eight, and ten, tested in
October 1996, are as follows:
| Grade | Reading | Math | Science | Social Studies | Language Arts |
| Four | 67 | 63 | 65 | 70 | 62 |
| Eight | 67 | 64 | 61 | 66 | 60 |
| Ten | 64 | 71 | 63 | 66 | 63 |
Note: national percentile scores range from 1 to 99,
with 50 representing the national average
Proficiency Standards:
Beginning with the 1997-98 school year, Wisconsin also
will be reporting student performance in terms of proficiency standards.
In each area tested (mathematics, reading and language/writing, science,
and social studies) performance will be reported in terms of four levels:
Minimal Performance, Basic, Proficient, and Advanced.
The definitions of each proficiency level differ by
subject area; however, the general definitions are as follows:
- Advanced: Distinguished in the content area. Academic achievement
is beyond mastery. Test score provides evidence of in-depth understanding
in the academic content area tested.
- Proficient: Competent in the content area. Academic achievement includes
mastery of the important knowledge and skills. Test score shows evidence
of skills necessary for progress in the academic content area tested.
- Basic: Somewhat competent in the content area. Academic achievement
includes mastery of most of the important knowledge and skills. Test
score shows evidence of at least one major flaw in understanding the
academic content area tested. (Basic does not mean that a child is failing
in the content area).
- Minimal Performance: Limited in the content area. Test score shows
evidence of major misconceptions or gaps in knowledge and skills basic
to progress in the academic content area tested.
A proficiency score answers the question, "How
does the performance of my child on this test compare with pre-established
high expectations for academic success?"
Proficiency standards were established to set high expectations
for all students. Comparative (norm-referenced) scores show that Wisconsin's
students do better than students throughout the country on nearly all
tests. However, a proficiency score judges performance in terms of high
academic standards set by people in Wisconsin. This is why, for example,
a student in Wisconsin may receive a high national percentile score, yet
still be judged Basic in a content area.
Wisconsin's proficiency levels are based on what
students are expected to know and be able to do. The proficiency levels
are not based on the achievement levels of students in the national comparison
group. However, the table below shows the national percentile scores that
correspond to each proficiency level.
| Subject | Grade | Minimal Performance | Basic | Proficient | Advanced |
| Reading | 4 | 26 or less | 27-45 | 46-90 | 91 or greater |
| Reading | 8 | 35 or less | 36-51 | 52-88 | 89 or greater |
| Reading | 10 | 28 or less | 29-55 | 56-86 | 87 or greater |
| Enhanced Language | 4 | 26 or less | 27-66 | 67-96 | 97 or greater |
| Enhanced Language | 8 | 30 or less | 31-85 | 86-97 | 98 or greater |
| Enhanced Language | 10 | 35 or less | 36-79 | 80-94 | 95 or greater |
| Mathematics | 4 | 20 or less | 21-58 | 59-89 | 90 or greater |
| Mathematics | 8 | 42 or less | 43-79 | 80-94 | 95 or greater |
| Mathematics | 10 | 60 or less | 61-83 | 84-96 | 97 or greater |
| Science | 4 | 18 or less | 19-44 | 45-88 | 89 or greater |
| Science | 8 | 29 or less | 30-59 | 60-89 | 90 or greater |
| Science | 10 | 38 or less | 39-68 | 69-94 | 95 or greater |
| Social Studies | 4 | 28 or less | 29-47 | 48-83 | 84 or greater |
| Social Studies | 8 | 23 or less | 24-44 | 45-79 | 80 or greater |
| Social Studies | 10 | 31 or less | 32-50 | 51-80 | 81 or greater |
* Examples Showing How to Read the Table: A fourth grade
student who scores at the 91st national percentile in reading (doing better
than 91% of students in the national comparison group) would be Advanced.
A 4th grade student who scored at the 44th percentile in reading would
be Basic. In mathematics, a 10th grade student scoring at
the national average (the 50th percentile) would receive a Minimal
Performance score. In order to be Advanced in mathematics,
a 10th grade student would have to score at the 97th percentile or higher.
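The way the table is read can also be expressed as a small lookup. The sketch below is illustrative only: the cut scores are copied from the table above, while the names `CUT_SCORES`, `LEVELS`, and `proficiency_level` are hypothetical and are not part of any official Wisconsin reporting tool.

```python
from bisect import bisect_right

# For each (subject, grade): the lowest national percentile that earns
# Basic, Proficient, and Advanced, copied from the table above.
CUT_SCORES = {
    ("Reading", 4): (27, 46, 91),
    ("Reading", 8): (36, 52, 89),
    ("Reading", 10): (29, 56, 87),
    ("Enhanced Language", 4): (27, 67, 97),
    ("Enhanced Language", 8): (31, 86, 98),
    ("Enhanced Language", 10): (36, 80, 95),
    ("Mathematics", 4): (21, 59, 90),
    ("Mathematics", 8): (43, 80, 95),
    ("Mathematics", 10): (61, 84, 97),
    ("Science", 4): (19, 45, 89),
    ("Science", 8): (30, 60, 90),
    ("Science", 10): (39, 69, 95),
    ("Social Studies", 4): (29, 48, 84),
    ("Social Studies", 8): (24, 45, 80),
    ("Social Studies", 10): (32, 51, 81),
}

LEVELS = ("Minimal Performance", "Basic", "Proficient", "Advanced")


def proficiency_level(subject: str, grade: int, percentile: int) -> str:
    """Map a national percentile score (1 to 99) to a proficiency level."""
    if not 1 <= percentile <= 99:
        raise ValueError("national percentile scores range from 1 to 99")
    cuts = CUT_SCORES[(subject, grade)]
    # bisect_right counts how many cut scores the percentile meets or exceeds.
    return LEVELS[bisect_right(cuts, percentile)]


# The four examples given in the text.
print(proficiency_level("Reading", 4, 91))
print(proficiency_level("Reading", 4, 44))
print(proficiency_level("Mathematics", 10, 50))
print(proficiency_level("Mathematics", 10, 97))
```

Run on the four examples from the text, the lookup returns Advanced, Basic, Minimal Performance, and Advanced, in that order.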
7. International Assessments
Critics of American education often argue that the results
of domestic assessments are no longer relevant because the United States
is now part of a highly competitive, global economy. They call attention
to the relatively poor performance of U.S. students on selected international
assessments, while failing to mention favorable results. For example,
in the 1991 international assessment of reading, U.S. 4th grade students
scored second after Finland, and our 9th graders ranked ninth among 31
participating countries.
In recent years, the performance of U.S. students in
mathematics and science has been more positive.
In the 1995 Third International Mathematics and Science
Study (TIMSS) more than one-half million students in five grade levels
from 41 nations were tested.
In mathematics, U.S. eighth graders scored slightly
below the international average of the 41 participating countries. However,
scores of U.S. students were on a par with other industrialized nations,
including Canada, Germany, and Great Britain.
In science, eighth graders were above the international
average. Again, scores were not significantly different from those of
Canada, Great Britain, and Germany.
U.S. fourth graders performed above average in mathematics
(ranking 8th out of 26 participating countries). In science, U.S. fourth
grade students ranked second, trailing only Korea.
The results of the Third International Mathematics and
Science Study (TIMSS) were released in November, 1996. Overall, eighth
graders in the United States scored an average of 500 on the mathematics
test, compared with an international average of 513. However, eighth graders
in a consortium of school districts mainly from the north and northwest
suburbs of Chicago scored an average of 587, far better than American
students as a group. (The consortium, calling itself the "First in
the World Consortium," consisted of districts representing 32 elementary
schools, 17 middle schools, and 6 high schools.) The eighth graders from
the consortium districts were outperformed significantly only by students
in Singapore (UCSMP Newsletter, No. 21, Spring 1996-97).
This high level of performance found among the consortium
schools reminds us of the limits of international comparisons in which
the scores of an entire country often are reduced to a single statistic,
typically a scale score or rank.
In order to understand fully the results of any international
assessment, it is crucial that one recognize that there are at least three
problems associated with international assessments: (1) the selection
of samples, (2) the practice of rank-ordering countries, and (3) the use
of a single statistic to describe a country's quality of education.
The Selection of Samples
Rotberg (1990) alerts us to the fact that in many international
assessments the performance of representative, national samples of U.S.
students has been compared with elite populations of students in other
countries. She also points out that in some of the assessments of 12th
grade students, it was found that countries which test a greater percentage
of their twelfth grade students have the weakest overall performance.
Conversely, countries which tested smaller percentages did the best. For
example, on an eighth-grade mathematics assessment, Japan was top-ranked,
whereas Hong Kong was in the middle. By 12th grade, however, Hong Kong
was top-ranked, and Japan was second. This apparent decline
in the performance of Japanese students was a consequence of the difference
in the size of the populations from which the samples were drawn. A much
smaller and more select group of students was tested in Hong Kong (only
3% of the students take mathematics in grade 12), compared with a much
larger group of students in Japan.
Other studies, comparing the performance of smaller
groups of younger students also have created headlines about the poor
relative performance of U.S. students. In a 1996 article written for Educational
Researcher, Bracey is especially critical of the research by Stevenson,
Stigler and others who have compared elementary U.S. students with students
from China, Japan, and Taiwan. In general, these studies suggest that
the best U.S. elementary school students would be only average students
in Japan or China.
"The various articles (studies) do not reveal how
the schools were selected or how representative they are. It would be
naive in the extreme to believe that a nation as closed, a nation as obsessed
with its public image as The People's Republic of China . . . would
give an American researcher free access to a random sample of schools"
(p. 7).
"Furthermore, over 20% of the Chicago children
did not speak English at home. The Chicago sample was thus not a representative
sample of the United States, nor was it comparable to the Beijing sample
on many important demographic variables. The Chicago sample is heavily
weighted with variables associated with low achievement" (p. 8).
Rank-ordering of Countries
In addition to problems associated with sampling, the
results themselves are frequently misunderstood. Whenever results are
reported by the media, average scores for an entire country are reduced
to a single statistic--a rank among all countries. Average scores for
participating countries tend to be closely bunched, but when countries
are ranked from top to bottom, the small differences in scores tend to
become large differences in ranks. For example, if the scores of U.S.
nine- and thirteen-year-olds on the 1992 Second International Assessment
of Educational Progress had been only slightly different, their ranks
would have varied considerably. If U.S. 13-year-olds had scored
72% correct in science instead of 67%, they would have finished 5th rather
than 13th. Similarly, if the third-ranked 9-year-olds had scored 60% instead
of 65%, they would have finished 12th. Because most countries score so close
together, small differences in scores produce large differences in ranks.
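The sensitivity of ranks to small score changes can be illustrated with a short sketch. The country scores below are hypothetical: they are made-up values chosen only to be closely bunched, and they are not the actual results of the 1992 assessment. The function name `rank_among` is likewise an assumption made for illustration.

```python
# HYPOTHETICAL mean percent-correct scores for 14 other countries.
# These are illustrative values, not real assessment results.
OTHER_COUNTRIES = [78, 76, 74, 73, 71, 70, 70, 69, 69, 68, 68, 68, 64, 62]


def rank_among(score: float, others: list[float]) -> int:
    """Return the rank of `score` (1 = best) when placed among `others`.

    A country's rank is one more than the number of countries that
    scored strictly higher.
    """
    return 1 + sum(other > score for other in others)


# Compare the rank at 67% correct with the rank at 72% correct.
for us_score in (67, 72):
    print(f"score {us_score}% -> rank {rank_among(us_score, OTHER_COUNTRIES)}")
```

With these made-up numbers, a five-point change in percent correct moves the rank from 13th to 5th, because eight countries sit inside that five-point band.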
The Use of a Single Statistic
Use of a single score (a ranking) to summarize the
entire U.S. system of education is simplistic and ignores the variation
which exists among the fifty states, as well as the differences found
among school systems within each state.
This is especially critical in a country such as the
United States which is extremely diverse and has great variation in the
quality of its public schools. For example, in the 1992 international
assessment of mathematics, U.S. 13-year-olds ranked 13th among 15 nations.
However, if other reporting categories are used, a far different picture
emerges. In this instance, Asian-American students scored the highest
on this assessment, while students from Iowa and North Dakota tied with
Korea for third.
- Asian students, U.S. Schools (287)
- Taiwan (285)
- Korea, Iowa, North Dakota (283)
- Advantaged urban students, U.S. (283)
- White students, U.S. schools (277)
- Hungary, Wisconsin (277)
In contrast, the lowest ranked categories were as follows:
- Jordan (246)
- Mississippi (246)
- Hispanic students, U.S. schools (245)
- Disadvantaged urban students, U.S. (239)
- Black students, U.S. (236)
- District of Columbia (234)