Primer: Education Issues - Empirical Evidence on Achievement

Introduction

Some critics of American public education would have us believe that today's students are far less knowledgeable and skilled than students of twenty, thirty, or forty years ago. The "evidence" to support this conclusion is largely anecdotal, based on limited personal experiences, or is a consequence of selective interpretation of test scores. The reality is that there are no empirical data to support such a bleak picture of America's students or public schools.

This does not mean that the public schools are without weaknesses. There are serious problems which need to be addressed; however, there is no evidence to suggest that the public schools as a group are as bad as some critics suggest.

For example, consider student performance on the Iowa Test of Basic Skills and the Iowa Test of Educational Development, two widely used commercially-developed, norm-referenced tests. Scores on these tests, which can be tracked all the way back to the 1930's, were at record highs in the 1990's (Bracey, 1995).

  • 68% of teachers would like to see greater use of instructional techniques which focus on critical thinking and problem-solving, with less emphasis given to mastery of content.
  • 95% of teachers would like kindergarten through 3rd grade classes to have no more than 20 students.
  • 62% of teachers would like each teacher to have a ten-minute break each morning and afternoon.

Graduation rates for Wisconsin over the past half century also show remarkable changes. In 1950, only one-third of Wisconsin's adults had a high school diploma or more. By 1990, nearly 80% of adults in Wisconsin had at least a high school diploma. Likewise, in 1950, only 12.9% of Wisconsin's adults, aged 25 or more, had formal education beyond high school. By 1990, this figure had increased more than three-fold, to 41.5%.

Berliner and Biddle in The Manufactured Crisis (1995) challenge those who argue that today's students are not as intelligent or able as students of the past. They offer the following points:

  • ". . . since 1932 the mean IQ for white Americans aged two to seventy-five has risen about .3 points per year" (p. 43). Scores for other groups are not available.
  • "In the United States, today's youth probably average about 15 IQ points higher than did their grandparents and 7.5 points higher than did their parents on the Stanford-Binet and Wechsler tests" (p. 43).
  • "Or to put this another way, the number of students expected to have IQ's of 130 or higher--the typical cut-off point for defining giftedness in many school districts throughout the nation--is now about seven times greater than it was for the generation now retiring from leadership positions in the country and often complaining about the poor performance of today's youth. Now that is something to contemplate" (p. 44).

This section of the paper briefly summarizes seven kinds of information regarding student achievement and competency: (1) National Assessment of Educational Progress (NAEP), (2) SAT scores, (3) ACT scores, (4) Wisconsin's high school graduation rate, (5) Wisconsin's Third Grade Reading Comprehension Test, (6) the Wisconsin Student Assessment System, and (7) international assessments. There also are a few comments about the Sandia Report.

1. National Assessment Results

Since 1969, the National Assessment of Educational Progress (NAEP) has tested national samples of students ages nine, thirteen, and seventeen. In general, the scores of students in reading and mathematics have been stable over the past two decades, whereas scores in science are down slightly.

Berliner and Biddle note that ". . . evidence from the NAEP also does not confirm the myth of a recent decline in American student achievement. Instead, it indicates a general pattern of stable achievement combined with modest growth in achievement among students from minority groups and from 'less advantaged' backgrounds."

Performance of Students on NAEP Tests

Science         1970   1996
17 yr-olds       305    296
13 yr-olds       255    256
9 yr-olds        225    230

Mathematics     1973   1996
17 yr-olds       304    307
13 yr-olds       266    274
9 yr-olds        219    231

Reading         1971   1996
17 yr-olds       285    287
13 yr-olds       255    259
9 yr-olds        208    212

Writing         1984   1996
17 yr-olds       290    283
13 yr-olds       267    264
9 yr-olds        204    207

All scores are on a 500-point scale.

There is no evidence to suggest that students of twenty, thirty, or forty years ago were any more knowledgeable or skilled. The strengths and weaknesses of today's students are essentially the same as those of their parents and grandparents.

Scores of Wisconsin's students on National Assessment Tests have been very positive:

  • In 1994, Wisconsin's fourth graders ranked third among the 44 states and other jurisdictions which participated in the National Assessment of Educational Progress assessment in reading. The average score for Wisconsin's fourth graders was 225. Scores ranged from Maine's high of 229 to Guam's low of 183. The national average was 213.
  • In 1992, Wisconsin's fourth grade students tied for second, while eighth graders tied for fourth on the NAEP mathematics assessment.
  • In 1992, Wisconsin's fourth graders ranked sixth on the NAEP reading assessment.
  • In 1990, Wisconsin's eighth grade students ranked sixth in the first state-by-state comparisons of mathematics performance on the National Assessment of Educational Progress.

2. SAT Scores

The Scholastic Aptitude Test (now called the Scholastic Assessment Test) was originally normed in 1941 on a population of 10,654 white males who primarily attended private eastern universities. The test measures student knowledge in two areas, verbal and mathematical, and is designed to predict academic success in college. Scores on the SAT are not reported as the number or percent of correct answers (there are 138 questions), but as a scale score, ranging from 400 to 1,600.

From approximately 1963 to 1975, aggregate SAT scores declined by 60 to 90 scale points. Many argued that this decline was proof of a serious and significant deterioration in America's schools. In reality, a decrease of 60 to 90 points on a 1,200 point scale represented a drop of approximately 5% in the number of questions answered correctly.
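The arithmetic behind the 5% figure can be sketched as follows. This is a rough illustration only: the scale range and question count come from the text above, while the function names and the assumption that scale points map linearly onto questions are ours, not the test publisher's.

```python
# Figures taken from the text above.
SCALE_MIN, SCALE_MAX = 400, 1600   # combined scale-score range
NUM_QUESTIONS = 138                # questions on the test


def share_of_scale(points_dropped):
    """Express a drop in scale points as a fraction of the 1,200-point range."""
    return points_dropped / (SCALE_MAX - SCALE_MIN)


def approx_questions(points_dropped):
    """Rough number of questions implied by the drop.

    ASSUMPTION (not from the source): scale points map linearly onto questions.
    """
    return share_of_scale(points_dropped) * NUM_QUESTIONS


for drop in (60, 90):
    print(f"{drop} points: {share_of_scale(drop):.1%} of the scale, "
          f"about {approx_questions(drop):.0f} questions")
```

Under this simplification, the low end of the range (60 points) corresponds to 5% of the scale and the high end (90 points) to 7.5%.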

Furthermore, measurement experts who have investigated the drop in SAT scores have concluded that the most important reason for the decline was that a greater number of students, especially those with weaker high school records, began to take the SAT. In short, beginning in the mid-1960's, takers of the SAT became a less elite population of high school students. In recent years, more than one million students have taken the SAT annually. Compare this figure with the 10,654 who originally took the SAT in 1941.

Critics also fail to acknowledge that in recent years SAT scores have increased. In 1995, for example, SAT scores had their largest increase in a decade. This growth was largely ignored by the popular media.

Berliner and Biddle make an additional point: "So although critics have trumpeted the 'alarming' news that aggregate national SAT scores fell during the late 1960's and the early 1970's, this decline indicates nothing about the performance of American schools. Rather, it signals that students from a broader range of backgrounds were then getting interested in college, which should have been cause for celebration, not alarm" (Berliner and Biddle, p. 21).

Some critics now charge that the recent improvements in SAT scores are due to the fact that the test is easier. Representatives of SAT, however, maintain that the test has essentially the same difficulty level as in previous years. In fact, current scores (and those for 1996-97 when a new scale will be used) will still be "anchored" to the original 1941 performance levels. Thus, if one feels compelled to compare the performance of today's students with the original norming population of nearly sixty years ago, he or she will be able to do so.

SAT Scores in Wisconsin

Wisconsin's students have consistently outscored students throughout the nation on the SAT over the past two decades. However, a minority of Wisconsin's graduating seniors take the SAT. In 1995, about 9% of 12th grade students (4,998) took the SAT. As these figures are considered, keep in mind the important conclusion by Powell and Steelman (1996). In their study of state SAT scores, Powell and Steelman report that more than 80% of the variation in state SAT averages is attributable to the participation rate. That is, the fewer students tested in a state, the higher SAT scores tend to be.
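The participation effect can be illustrated with a deliberately simplified model. The sketch below assumes that scores in the full student population are normally distributed and that only the strongest students take the test; both are assumptions made for illustration, as are the mean of 500, the standard deviation of 100, and the function name. Real test-taking populations are not selected this cleanly.

```python
from statistics import NormalDist


def mean_of_top_fraction(p, mu=500.0, sigma=100.0):
    """Average score when only the top fraction p of a normal population is tested.

    ASSUMPTIONS (hypothetical, not from the source): scores are normal with
    mean mu and standard deviation sigma, and test-takers are exactly the
    top p of the population.
    """
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    if p == 1:
        return mu  # everyone is tested, so the average is the population mean
    std = NormalDist()
    z = std.inv_cdf(1 - p)              # cutoff in standard units
    return mu + sigma * std.pdf(z) / p  # mean of a normal truncated from below


# Participation rates of 9% and 64% echo figures quoted in this paper.
for rate in (0.09, 0.64, 1.0):
    print(f"{rate:.0%} tested: average {mean_of_top_fraction(rate):.0f}")
```

Even though the underlying population is identical in all three cases, the average rises as the tested share shrinks, which is the pattern Powell and Steelman describe.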

SAT scores: Wisconsin and the nation, 1975-1997

                 Wisconsin                Nation
Year      Verbal   Math   Total    Verbal   Math   Total
1975        492    544    1036       434    472     906
1980        472    533    1005       424    466     890
1985        478    536    1014       431    475     906
1990        466    514     980       422    474     896
1995        501    572    1073       428    482     910
1996*       577    586    1163       505    508    1013
1997        597    590    1169       505    511    1016


*Note: a new scale was introduced in 1996.
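Because each total should equal the verbal score plus the math score, the table can be checked with a few lines of code. The sketch below transcribes the rows as printed; the function name is ours and purely illustrative.

```python
# Rows as printed in the table:
# (year, WI verbal, WI math, WI total, US verbal, US math, US total)
SAT_ROWS = [
    (1975, 492, 544, 1036, 434, 472, 906),
    (1980, 472, 533, 1005, 424, 466, 890),
    (1985, 478, 536, 1014, 431, 475, 906),
    (1990, 466, 514, 980, 422, 474, 896),
    (1995, 501, 572, 1073, 428, 482, 910),
    (1996, 577, 586, 1163, 505, 508, 1013),
    (1997, 597, 590, 1169, 505, 511, 1016),
]


def inconsistent_rows(rows):
    """List (year, group, verbal + math, printed total) where the two differ."""
    problems = []
    for year, wv, wm, wt, nv, nm, nt in rows:
        if wv + wm != wt:
            problems.append((year, "Wisconsin", wv + wm, wt))
        if nv + nm != nt:
            problems.append((year, "Nation", nv + nm, nt))
    return problems


for year, group, computed, printed in inconsistent_rows(SAT_ROWS):
    print(f"{year} {group}: verbal + math = {computed}, printed total = {printed}")
```

Every row adds up except the 1997 Wisconsin entry, where verbal plus math comes to 1187 rather than the printed 1169. One of the three figures in that row appears to be a misprint; the table alone does not show which.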

It also is interesting to note that in Wisconsin students from public schools tend to score higher on the SAT than do students from religious and independent private schools. In 1997, the composite scores were as follows:

SAT scores: Wisconsin Public, Private, and Religious Schools, 1997

School Type            Verbal   Math   Total
Public Schools           587    600    1187
Private Independent      543    547    1090
Religious                563    571    1134

3. ACT Scores

Wisconsin has placed first or tied for first on the ACT (American College Test) for the past eleven years. Overall, the ACT is the predominant college admissions test in 28 states, including Wisconsin. Scores are reported on a scale, ranging from 1 to 36. Approximately two-thirds (64%, or 37,194) of Wisconsin's graduating seniors took the ACT in 1995.

ACT scores: Wisconsin and the nation,
1986, 1990, 1995, 1996 and 1997

Year Wisconsin Nation
1986 22.2 20.8
1990 21.8 20.6
1995 22.0 20.8
1996 22.1 20.9
1997 22.3 21.0

4. Wisconsin's High School Graduation Rate

Wisconsin's dropout rate has declined steadily over the past decade. In 1985 the annual dropout rate was 3.65%; in 1996 it declined to its lowest level ever--2.40%. (Note: A dropout rate of 2.40% means that 2.40% of the state’s students in grades 9-12 dropped out of school during the school year.)

Percent of Wisconsin students who dropped out of school, 1985-1996

Year Percent
1985 3.65%
1986 3.49%
1987 3.24%
1988 3.30%
1989 3.11%
1990 3.13%
1991 3.26%
1992 3.00%
1993 3.15%
1994 2.93%
1995 2.63%
1996 2.40%

This means that at the current time about 90% of all 9th grade students graduate from high school “on time.” Others graduate after their original class (a few return to school; others pass the GED).
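The step from an annual dropout rate to an "on time" graduation figure can be sketched as a simple compounding calculation. The sketch assumes, for illustration only, that the same annual rate applies in each of grades 9 through 12 and that transfers, retentions, and re-enrollments can be ignored; the function name is ours.

```python
def on_time_share(annual_dropout_rate, years=4):
    """Share of a 9th grade class still enrolled after `years` school years.

    ASSUMPTION (not from the source): the same annual dropout rate applies
    in every grade, and students leave only by dropping out.
    """
    return (1 - annual_dropout_rate) ** years


for year, rate in ((1985, 0.0365), (1996, 0.0240)):
    print(f"{year}: annual rate {rate:.2%} -> about {on_time_share(rate):.1%} on time")
```

With the 1996 rate of 2.40%, four years of compounding leaves roughly nine in ten students, consistent with the "about 90%" figure given above.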

National graduation rates are considerably lower, as shown in the table below.

National graduation rates
for selected years

Year Percent who graduate
1929-30 29%
1939-40 50%
1949-50 59%
1959-60 70%
1969-70 77%
1979-80 71%
1989-90 72%
1994-95 73%
1995-96 76%

We were unable to obtain graduation rates for Wisconsin for earlier years. In addition, graduation rates of forty and fifty years ago, however calculated, are somewhat suspect simply because the “compulsory attendance laws” of this period were not enforced and/or did not require school attendance beyond the eighth grade. Note: it is estimated that 10% of students obtain a high school degree via an alternative route (e.g., GED).

5. Third Grade Reading Comprehension Test

Since 1989, Wisconsin’s third grade students have been tested in the area of reading comprehension. Scores are reported in terms of a minimal standard of proficiency. The standard represents a level which might best be described as “barely passing.” It does not mean that students read at “grade level.”

The test has three purposes: (1) to allow districts to evaluate their primary reading programs, (2) to allow for comparisons of reading performance across schools and districts, and (3) to identify marginal readers who may need remedial reading help.

The performance standard established in 1989 has been applied to all subsequent years. In other words, the performance standard has remained at the same level of difficulty. As would be expected, there have been slight fluctuations in the performance of students over time.

Percent of Students Exceeding the Minimal Performance Standard, 1989 - 1997

Year Percent
1989 87.5%
1990 84.2%
1991 84.1%
1992 87.7%
1993 85.5%
1994 89.4%
1995 88.3%
1996 89.7%
1997 87.1%

6. The Wisconsin Student Assessment System

Beginning with the 1993-94 school year, the DPI’s Wisconsin Student Assessment System (WSAS) has tested eighth and tenth grade public school students in language, reading, mathematics, science, social studies, and writing. In 1996-97, fourth grade students also were tested. These tests are known as the Knowledge and Concepts Examinations.

As in previous years, Wisconsin’s students continue to score higher than the national averages on these tests. The national percentile scores for students in grades four, eight, and ten tested in October 1996 are as follows:

Average Grand Composite Scores, 1996

Grade    Reading   Math   Science   Social Studies   Language Arts
Four        67      63       65           70               62
Eight       67      64       61           66               60
Ten         64      71       63           66               63

Note: national percentile scores range from 1 to 99, with 50 representing the national average

Proficiency Standards:

Beginning with the 1997-98 school year, Wisconsin also will be reporting student performance in terms of proficiency standards. In each area tested (mathematics, reading and language/writing, science, and social studies) performance will be reported in terms of four levels: Minimal Performance, Basic, Proficient, and Advanced.

The definitions of each proficiency level differ by subject area; however, the “general” definitions are as follows:

  • Advanced: Distinguished in the content area. Academic achievement is beyond mastery. Test score provides evidence of in-depth understanding in the academic content area tested.
  • Proficient: Competent in the content area. Academic achievement includes mastery of the important knowledge and skills. Test score shows evidence of skills necessary for progress in the academic content area tested.
  • Basic: Somewhat competent in the content area. Academic achievement includes mastery of most of the important knowledge and skills. Test score shows evidence of at least one major flaw in understanding the academic content area tested. (Basic does not mean that a child is failing in the content area.)
  • Minimal Performance: Limited in the content area. Test score shows evidence of major misconceptions or gaps in knowledge and skills basic to progress in the academic content area tested.

A proficiency score answers the question, “How does the performance of my child on this test compare with pre-established high expectations for academic success?”

Proficiency standards were established to set high expectations for all students. Comparative (norm-referenced) scores show that Wisconsin’s students do better than students throughout the country on nearly all tests. However, a proficiency score judges performance in terms of high academic standards set by people in Wisconsin. This is why, for example, a student in Wisconsin may receive a high national percentile score, yet still be judged “Basic” in a content area.

Wisconsin’s proficiency levels are based on what students are expected to know and be able to do. The proficiency levels are not based on the achievement levels of students in the national comparison group. However, the table below shows which national percentile scores correspond to each proficiency level.

Relationship Between National Percentile Scores and Proficiency Levels

                     Min Perf      Basic    Proficient   Advanced
Reading
Grade 4              26 or less    27-45    46-90        91 or greater
Grade 8              35 or less    36-51    52-88        89 or greater
Grade 10             28 or less    29-55    56-86        87 or greater
Enhanced Language
Grade 4              26 or less    27-66    67-96        97 or greater
Grade 8              30 or less    31-85    86-97        98 or greater
Grade 10             35 or less    36-79    80-94        95 or greater
Mathematics
Grade 4              20 or less    21-58    59-89        90 or greater
Grade 8              42 or less    43-79    80-94        95 or greater
Grade 10             60 or less    61-83    84-96        97 or greater
Science
Grade 4              18 or less    19-44    45-88        89 or greater
Grade 8              29 or less    30-59    60-89        90 or greater
Grade 10             38 or less    39-68    69-94        95 or greater
Social Studies
Grade 4              28 or less    29-47    48-83        84 or greater
Grade 8              23 or less    24-44    45-79        80 or greater
Grade 10             31 or less    32-50    51-80        81 or greater

Examples Showing How to Read the Table: A fourth grade student who scores at the 91st national percentile in reading (doing better than 91% of students in the national comparison group) would be “Advanced.” A 4th grade student who scored at the 44th percentile in reading would be “Basic.” In mathematics, a 10th grade student scoring at the national average--the 50th percentile--would receive a “Minimal Performance” score. In order to be “Advanced” in Mathematics, a 10th grade student would have to score at the 97th percentile or higher.
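Reading the table amounts to a threshold lookup, which can be sketched in a few lines. The cut points below are transcribed from the Reading and Mathematics rows of the table; the dictionary layout and function name are illustrative choices of ours, not part of the state's reporting system.

```python
from bisect import bisect_right

LEVELS = ["Minimal Performance", "Basic", "Proficient", "Advanced"]

# Lowest national percentile for Basic, Proficient, and Advanced,
# transcribed from the Reading and Mathematics rows of the table.
CUTOFFS = {
    ("Reading", 4): (27, 46, 91),
    ("Reading", 8): (36, 52, 89),
    ("Reading", 10): (29, 56, 87),
    ("Mathematics", 4): (21, 59, 90),
    ("Mathematics", 8): (43, 80, 95),
    ("Mathematics", 10): (61, 84, 97),
}


def proficiency_level(subject, grade, percentile):
    """Return the proficiency level for a national percentile score (1-99)."""
    if not 1 <= percentile <= 99:
        raise ValueError("national percentile scores range from 1 to 99")
    cuts = CUTOFFS[(subject, grade)]
    # bisect_right counts how many cut points the score has reached.
    return LEVELS[bisect_right(cuts, percentile)]


print(proficiency_level("Reading", 4, 91))
print(proficiency_level("Reading", 4, 44))
print(proficiency_level("Mathematics", 10, 50))
print(proficiency_level("Mathematics", 10, 97))
```

The four calls correspond to the four examples in the paragraph above. Other subjects can be added by copying their rows from the table in the same way.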

7. International Assessments

Critics of American education often argue that the results of domestic assessments are no longer relevant because the United States is now part of a highly competitive, global economy. They call attention to the relatively poor performance of U.S. students on selected international assessments, while failing to mention favorable results. For example, in the 1991 international assessment of reading, U.S. 4th grade students scored second after Finland, and our 9th graders ranked ninth among 31 participating countries.

In recent years, the performance of U.S. students in mathematics and science has been more positive.

1995 TIMSS Study--Eighth Grade Results

In the 1995 Third International Mathematics and Science Study (TIMSS) more than one-half million students in five grade levels from 41 nations were tested.

In mathematics, U.S. eighth graders scored slightly below the international average of the 41 participating countries. However, scores of U.S. students were on a par with other industrialized nations, including Canada, Germany, and Great Britain.

In science, eighth graders were above the international average. Again, scores were not significantly different from those of Canada, Great Britain, and Germany.

1995 TIMSS Study--Fourth Grade Results

U.S. fourth graders performed above average in mathematics (ranking 8th out of 26 participating countries). In science, U.S. fourth grade students ranked second, trailing only Korea.

Performance Levels of "The Consortium"

The results of the Third International Mathematics and Science Study (TIMSS) were released in November 1996. Overall, eighth graders in the United States scored an average of 500 on the mathematics test, compared with an international average of 513. However, eighth graders in a consortium of school districts mainly from the north and northwest suburbs of Chicago scored an average of 587, far better than American students as a group. (The consortium, calling itself the “First in the World Consortium,” consisted of districts representing 32 elementary schools, 17 middle schools, and 6 high schools.) The eighth graders from the consortium districts were outperformed significantly only by students in Singapore (UCSMP Newsletter, No. 21, Spring 1996-97).

This high level of performance found among the consortium schools reminds us of the limits of international comparisons in which the scores of an entire country often are reduced to a single statistic, typically a scale score or rank.

Interpreting the Results of International Assessments

To understand fully the results of any international assessment, one must recognize at least three problems associated with such assessments: (1) the selection of samples, (2) the practice of rank-ordering countries, and (3) the use of a single statistic to describe a country’s quality of education.

The Selection of Samples

Rotberg (1990) alerts us to the fact that in many international assessments the performance of representative, national samples of U.S. students has been compared with elite populations of students in other countries. She also points out that in some of the assessments of 12th grade students, countries which tested a greater percentage of their twelfth grade students had the weakest overall performance. Conversely, countries which tested smaller percentages did the best. For example, on an eighth-grade mathematics assessment, Japan was top-ranked, whereas Hong Kong was in the middle. By 12th grade, however, Hong Kong was top-ranked, and Japan was second. This apparent “decline” in the performance of Japanese students was a consequence of the difference in the size of the populations from which the samples were drawn. A much smaller and more select group of students was tested in Hong Kong (only 3% of the students take mathematics in grade 12), compared with a much larger group of students in Japan.

Other studies, comparing the performance of smaller groups of younger students, also have created headlines about the poor relative performance of U.S. students. In a 1996 article written for Educational Researcher, Bracey is especially critical of the research by Stevenson, Stigler and others who have compared elementary U.S. students with students from China, Japan, and Taiwan. In general, these studies suggest that the best U.S. elementary school students would be only average students in Japan or China.

Bracey offers the following comments about these small scale studies:

“The various articles (studies) do not reveal how the schools were selected or how representative they are. It would be naive in the extreme to believe that a nation as closed, a nation as obsessed with its public image as The People’s Republic of China . . . would give an American researcher free access to a random sample of schools” (p. 7).

Furthermore, “over 20% of the Chicago children did not speak English at home. The Chicago sample was thus not a representative sample of the United States, nor was it comparable to the Beijing sample on many important demographic variables. The Chicago sample is heavily weighted with variables associated with low achievement” (p. 8).

Rank-ordering of Countries

In addition to problems associated with sampling, the results themselves are frequently misunderstood. Whenever results are reported by the media, average scores for an entire country are reduced to a single statistic--a rank among all countries. Average scores for participating countries tend to be closely bunched, but when countries are ranked from top to bottom, the small differences in scores tend to become large differences in ranks. For example, if the scores of U.S. nine- and thirteen-year-olds on the 1992 Second International Assessment of Educational Progress had been only slightly different, their ranks would have varied considerably. “If U.S. 13-year-olds had scored 72% correct in science, instead of 67, they would have finished 5th rather than 13th. Similarly, if the third-ranked 9-year-olds had scored 60 instead of 65, they would have finished 12th. Most countries score close together such that small differences in scores make large differences in ranks.”
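The way bunched scores exaggerate differences in rank can be shown with a small sketch. The twelve "countries" and their percent-correct scores below are invented purely for illustration; they are not the actual results of any assessment, and the function name is ours.

```python
def rank_of(name, scores):
    """1-based rank of `name`, highest score first; tied entries share a rank."""
    target = scores[name]
    return 1 + sum(1 for s in scores.values() if s > target)


# HYPOTHETICAL percent-correct averages, bunched closely in the middle.
scores = {
    "A": 78, "B": 74, "C": 71, "D": 70, "E": 70, "F": 69,
    "G": 69, "H": 68, "I": 68, "J": 67, "K": 64, "L": 58,
}

before = rank_of("J", scores)
after = rank_of("J", dict(scores, J=72))  # same field, J scores 5 points higher
print(f"J ranks {before} with a score of 67 and {after} with a score of 72")
```

In this made-up field, a five-point change moves country J from 10th place to 3rd, even though its students' performance has changed only modestly.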

The Use of a Single Statistic

Describing performance with a single statistic is especially problematic in a country such as the United States, which is extremely diverse and has great variation in the quality of its public schools. For example, in the 1992 international assessment of mathematics, U.S. 13-year-olds ranked 13th among 15 nations. However, if other reporting categories are used, a far different picture emerges. In this instance, Asian-American students scored the highest on this assessment, while students from Iowa and North Dakota tied with Korea for third. The highest ranked categories were as follows:

  • Asian students, U.S. schools (287)
  • Taiwan (285)
  • Korea, Iowa, North Dakota (283)
  • Advantaged urban students, U.S. (283)
  • White students, U.S. schools (277)
  • Hungary, Wisconsin (277)

In contrast, the lowest ranked categories were as follows:

  • Jordan (246)
  • Mississippi (246)
  • Hispanic students, U.S. schools (245)
  • Disadvantaged urban students, U.S. (239)
  • Black students, U.S. (236)
  • District of Columbia (234)