Archived Information

Assessment of Student Performance April 1997

CHAPTER 3

Part 5

South Brunswick's Sixth Grade Research Performance Assessment: Windermere Elementary School

Site Visit Dates: June 2-3, 1994

After realizing its students were leaving elementary school without the research skills the district wanted them to possess, the South Brunswick Township Public Schools developed the Sixth Grade Research Performance Assessment (RPA). Supported by the district's move toward resource-based teaching methods, the RPA helps teachers monitor student progress and provides feedback for instruction.

The Sixth Grade Research Performance Assessment

The district drew from the work and ideas of several different organizations and individuals in developing its assessment. In 1992, following a pilot test in 1991, the assessment was implemented districtwide. The RPA, administered in May and June each year, comprises:

Training of Assessors

The South Brunswick Township Public Schools invites members of the community and other interested individuals to participate in the Sixth Grade Research Performance Assessment as assessors. Before they begin to score students' work, assessors undergo a two-hour training session that includes:

  • Receiving an overview of the assessment process;
  • Reviewing copies of the scoring rubrics;
  • Reading and discussing five "benchmark" papers; and
  • Reviewing directions for using the scoring rubrics and for recording agreed-upon scores.

Based on the training they receive, assessors then work in pairs to judge the presentations of 6th-grade students. Although the assessment is popular among teachers, adequate training of the assessors remains an issue of concern.

Four components of students' work are scored: the written report, the presentation, the visuals, and the student's process.

Each of the four components is scored with a rubric. The rubric applied to the written report is a five-point holistic rubric: all dimensions associated with a score must be present in the written report for a student to earn that score. The other three rubrics use trait scales: each of five characteristics of the component is rated as present (1), partially present (½), or not present (0). The outside assessors are responsible for scoring students' reports, presentations, and visuals, while an adult within the school community (but not the student's classroom teacher) assigns students' process scores. Overall, a score of 12 (an average score of three out of five on each of the four components) is considered a passing score on the RPA.
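The scoring arithmetic described above can be illustrated with a minimal sketch. All function and constant names are hypothetical. The sketch assumes that a partially present characteristic counts as half a point, that each trait-scale score is the sum of its five ratings, that the holistic report score ranges from 1 to 5, and that the four component scores are simply added; the report does not spell out these details.

```python
from typing import Sequence

PASSING_TOTAL = 12  # passing score stated in the report

# Assumed ratings for one characteristic: not present, partially present, present.
ALLOWED_RATINGS = {0, 0.5, 1}


def trait_score(ratings: Sequence[float]) -> float:
    """Sum five characteristic ratings into a component score out of five."""
    if len(ratings) != 5:
        raise ValueError("a trait scale rates exactly five characteristics")
    if any(r not in ALLOWED_RATINGS for r in ratings):
        raise ValueError("each rating must be 0, 0.5, or 1")
    return float(sum(ratings))


def rpa_total(report: int,
              presentation: Sequence[float],
              visuals: Sequence[float],
              process: Sequence[float]) -> float:
    """Add the holistic report score (assumed 1-5) to the three trait scores."""
    if not 1 <= report <= 5:
        raise ValueError("the holistic report score is assumed to be 1 through 5")
    return (report
            + trait_score(presentation)
            + trait_score(visuals)
            + trait_score(process))


def passes(total: float) -> bool:
    """Return True when the total meets the passing score of 12."""
    return total >= PASSING_TOTAL


# Hypothetical student: three out of five on every component.
example_total = rpa_total(3,
                          [1, 1, 0.5, 0.5, 0],
                          [1, 1, 1, 0, 0],
                          [1, 0.5, 0.5, 0.5, 0.5])
print(example_total, passes(example_total))
```

Under these assumptions, a student who averages three out of five on each component lands exactly on the passing total, which matches the report's description of the threshold.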

Although it is not the South Brunswick Township Public Schools' only performance assessment (the district has also introduced an early childhood portfolio, and students in grades 3 through 12 maintain a "Best Works Portfolio" that is passed on with them from year to year), the RPA is not viewed as a part of a larger system of performance assessments. Rather, it is an assessment designed to guide curriculum and instruction and to assess specific student skills.

RPA at Windermere Elementary School

Windermere Elementary School is one of seven elementary schools in South Brunswick. During the 1993-94 school year, the school served 500 students in grades K through 6. Sixty-nine percent of those students were white, 12 percent were African American, 16 percent were Asian American and 3 percent were Hispanic; 9 percent of the school's students qualified for a free or reduced-price lunch.

Teachers, students, and parents at Windermere Elementary School all have positive things to say about the RPA. Each of the three 6th-grade teachers interviewed described how he or she had modified instruction in the classroom to emphasize the skills students need to develop for the research assessment. For example, one 6th-grade teacher explained that he had held practice oral presentations throughout the school year.

The school's librarian is also very enthusiastic about the RPA and attributes increased collaboration between teachers and librarians to the adoption of the assessment. She asserted, and the school's 6th-grade teachers concurred, that ". . . teachers are teaching to the assessment, which is exactly what we want them to be doing."

Both the librarian and the 6th-grade teachers further reported that the school's 4th- and 5th-grade teachers also now emphasize research skills more than they did formerly.

Windermere's 6th-grade teachers say that, as a result of changing instruction, their students' research skills have improved over the three years since the RPA was introduced. Teachers say their students typically like the research assessments because they are more "real" to the students than multiple-choice, standardized tests.

Some students, however, find the assessment process to be very stressful. Parents and teachers share this concern, especially with respect to students with disabilities, who are assessed upon the same criteria and held to the same standards of performance as their non-disabled peers.

According to Windermere teachers and students, the RPA is a valuable educational experience. However, as it exists now, the assessment is a one-time, low-stakes experience with little or no impact on the students' future educational experiences. Students move on to junior high school, where their performance on the RPA is not used for placement decisions or for identifying students who require remediation. Furthermore, the RPA provides only limited feedback for instruction. Because the assessment takes place at the end of the 6th-grade year, teachers can use the information obtained from the assessment only to modify the instructional practices they will use with their next group of students.

Future Plans

The district has plans to develop an 8th-grade research assessment, thereby introducing greater coordination in the teaching and assessment of research skills across elementary and junior high school. When such a plan is put into effect, the RPA promises to become an integrated feature of a more comprehensive system of performance assessment.

New York Regents Portfolios: Hudson High School

Site Visit Dates: May 12-13, 1994

New York has the oldest and largest state testing program in the nation: the Regents Examinations. The tests have long been used to maintain state-set educational standards, to influence instruction, to provide accountability (scores are reported to the public annually), to demonstrate individual competency, and to make college admissions decisions in the state of New York.

In 1992, however, the New York Commissioner of Education and the Board of Regents moved to redesign the state's testing program in recognition of the following limitations of the current system:

The Commissioner and Regents consequently adopted the New York New Compact for Learning, which envisions an education system in which all high school students would assemble a locally designed but state-approved K-through-12 Regents Portfolio for graduation. The portfolio would include both discipline-specific and multi-disciplinary student work samples (papers, projects, and exhibitions) and projects that demonstrate competence across the seven curriculum areas and the state's set of Essential Skills and Dispositions. In addition, under the New Compact for Learning, the Regents Examinations would be replaced by a set of as-yet-unspecified performance assessments.

The state of New York intends to phase in these changes gradually. In order to support its move toward a performance-based testing system, the State Department of Education has allowed individual schools to petition for waivers from Regents Examination requirements. The waivers allow teachers to develop and use performance assessments with their students in satisfaction of the Regents requirements, in full or in part. The development work taking place at individual schools will then inform the state's development and introduction of both its locally driven portfolio assessment system and its statewide performance-based assessments.

Performance Assessment Development at Hudson High School

Located in the Hudson Valley near Albany and Schenectady, Hudson High School serves nearly 1,400 9th through 12th graders. In 1993-94, the school's student body was composed primarily of white students (91 percent), with other students being African-American (3 percent), Asian-American (3 percent), Hispanic (2 percent), and "other" (1 percent).

Even prior to the state's adoption of the New Compact for Learning, two enterprising global studies and English teachers designed an interdisciplinary, team-taught course that made connections between history and literature and focused on writing assignments and performance-oriented tasks. The teachers found, however, that the current Regents Examinations, because of their content coverage emphases, prevented them from implementing their new course successfully. They petitioned the state for a waiver that would allow them to substitute for the Regents global studies and English exams a portfolio that would include a reading response journal, a multiple-source paper, a persuasive essay, and a biography project. The request was approved in 1992.

In 1993, the state expanded schools' opportunities to apply for waivers from the Regents Examinations by formally inviting high schools to develop and pilot performance assessments in Regents subject areas. Approved pilot projects could serve as partial (up to 35 percent) satisfaction of the Regents Examinations.

Responding to this invitation, Hudson's principal persuaded leaders in his science, English, and social studies departments to develop proposals to take advantage of the freedom the state's partial waiver allowed. Hudson teachers have since developed performance assessments as full or partial waivers to the Regents Examinations in: 10th- and 11th-grade English and social studies (through integrated courses), 11th-grade English, 9th- and 10th-grade global studies, 9th-grade earth science, and 10th-grade biology.

A Regents Waiver Course in Earth Science

In Hudson's Earth Science course, students conduct a long-term study that requires understanding of key scientific and geological concepts and that promotes the development of analytical and investigative skills. At the beginning of the year, students are presented with a "pet rock" of unknown composition and origin. They learn as much as possible about their rock and keep a detailed scientific journal recording their observations, inferences, and predictions about the rock's scientific characteristics, genesis, metamorphosis, geographical location, and commercial value. In the second semester, students investigate the rock's relationship to the environment. For instance, they may take a field trip to a site where the rock might be found or interview a professional geologist. Students end their investigation with a multimedia oral presentation summarizing their year of research.

In the integrated English/Social Studies courses, literature and writing are used to enhance and enrich the Global Studies (10th-grade) or U.S. History and Government (11th-grade) programs. Issues raised in social studies provide subjects for in-depth writing and further reading. In part, the course is designed to foster writing across the curriculum and to encourage students to study different historical viewpoints and to draw their own independent conclusions.

Impact of Regents Waiver Course Performance Assessments on Teachers, Students, and Parents

Hudson teachers believe that the performance assessments they have developed provide a more realistic appraisal of students' abilities than do the Regents Examinations alone. The performance assessments provide immediate feedback about student progress, and this feedback allows teachers to modify their instructional strategies to enhance student learning.

Teachers have used the freedom the Regents waivers have provided to be more creative in developing assignments that challenge and engage their students, to make inter-disciplinary and "real-world" connections, and to reinforce critical thinking skills and concepts. As one teacher remarked, "This has been the single best professional growth experience I have had in 27 years of teaching."

Despite teachers' enthusiasm about the educational value of the waiver courses they have designed, they feel the workload these courses create is "almost suicidal." Between planning and grading, teachers estimate that one section of a waiver course can create up to three hours of additional work each day. In addition, some Hudson teachers have said that, though they design their own assessment tasks and rubrics, they still harbor some concern about the reliability of their performance assessments because they have received no formal training in rubric development or scoring.

Finally, adequate content coverage remains a concern among some Hudson teachers. (Indeed, some teachers, particularly those who teach higher level mathematics and science courses, have declined to join their colleagues in designing waiver courses because they feel that heavy use of time-consuming performance assessments would impede their ability to cover the breadth of material that is essential, in their minds, for preparing students for higher education.) Hudson teachers who are involved in designing performance assessments agreed that they often must sacrifice portions of their curriculum in order to provide room for the discussions, role-playing, oral reports, and lengthy experimentation that performance assessments entail. However, though content coverage remains an issue, most teachers at Hudson who use performance assessments have come to the conclusion that the depth and variety of learning opportunities that performance assessments provide are more important than mastery of broad content.

Teachers at Hudson said that their students are more motivated and enjoy the learning process more when they are encouraged to express their own opinions and given the freedom to respond creatively to challenges. Students themselves also said that these exercises have helped them improve their verbal communication and persuasion skills, as well as their poise and self-confidence.

Parents interviewed for this study were unanimous in agreeing that performance assessments give them more information about their children's progress than the traditional Regents Examinations. They noted that parents receive Regents scores months after their children take the tests and that the scores provide no information about their children's strengths and weaknesses. In contrast, they found they more frequently saw the fruits of their children's work with performance assessment tasks, allowing them to track their children's progress.

Future Plans

The future of Hudson's performance assessment program is linked closely to the future of the Regents examination system. Meanwhile, the pilot program has worked well at Hudson, as it allows both teachers and students to participate voluntarily in the experimental waiver courses. However, extensive performance assessments clearly present extra costs in terms of teacher workload; steps also must be taken to ensure the reliability and validity of each assessment and to train teachers in the development of sound testing approaches. Successful resolution of these issues will be necessary if the reforms are to continue and ultimately to succeed.

Maryland Assessment Reform: Walters Middle School

Site Visit Dates: May 9-10, 1994

In 1991, the state of Maryland began developing and implementing the Maryland School Performance Assessment Program (MSPAP) as part of its Schools for Success reform initiative. MSPAP is intended to reflect standards of achievement commensurate with 21st-century expectations and to drive the instructional changes that will help students learn how to apply their knowledge to real world situations and become better problem solvers. This case study summary focuses on the implementation and effects of MSPAP at one Maryland middle school.

MSPAP Characteristics and Implementation

MSPAP assessments in reading, writing, language usage, mathematics, science, and social studies are administered in the spring of each school year to all students in grades 3, 5, and 8. Typically, MSPAP tasks require students to respond to a series of questions that lead to a final problem requiring a solution, recommendation, or a decision. Students also must provide an explanation or rationale for their final response. Thus, MSPAP tasks are intended to reveal both the process and content of students' thinking.

Students' responses to the assessment tasks are scored using task-specific rubrics called "scoring keys." Each rubric provides an overview of the type of competency or skill the task elicits and includes a scale for scoring students' responses to the task. Scoring is performed by certified Maryland teachers during the summer, under the direction of the Maryland Department of Education and Measurement Incorporated, a private consulting firm. Based on their MSPAP scores, students are assigned to one of five proficiency levels in the assessed subject area.

While preliminary testing began in 1991, the 1993 MSPAP results served as the baseline for school accountability reviews. For a school to achieve "satisfactory" performance in a given content area, 70 percent of its students must achieve the "satisfactory" proficiency level on the assessment; for a school to attain "excellent" performance in a given content area, 70 percent of its students must achieve the "satisfactory" proficiency level and 25 percent must achieve the "excellent" proficiency level. All Maryland schools are expected to reach the satisfactory standard in all subject areas by the year 2000.
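The school-level standards described above amount to a pair of threshold checks, sketched below. The function name, parameter names, and the "below satisfactory" label are hypothetical. The sketch assumes that the thresholds mean "at least" the stated percentage and that `pct_satisfactory` already counts every student who reached the "satisfactory" proficiency level or higher; the report does not say how students at the "excellent" level are counted toward the 70 percent figure.

```python
def school_rating(pct_satisfactory: float, pct_excellent: float) -> str:
    """Classify a school's performance in one content area.

    pct_satisfactory: percent of students at the "satisfactory" level
        (assumed to include students who scored higher).
    pct_excellent: percent of students at the "excellent" level.
    """
    for pct in (pct_satisfactory, pct_excellent):
        if not 0 <= pct <= 100:
            raise ValueError("percentages must be between 0 and 100")

    meets_satisfactory = pct_satisfactory >= 70
    if meets_satisfactory and pct_excellent >= 25:
        return "excellent"
    if meets_satisfactory:
        return "satisfactory"
    return "below satisfactory"  # hypothetical label, not from the report


# Hypothetical schools, for illustration only.
print(school_rating(82, 30))
print(school_rating(74, 10))
print(school_rating(55, 28))
```

Note that, under this reading, a school cannot reach "excellent" performance on the strength of its top scorers alone: the 70 percent condition must be met first.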

MSPAP and Other Assessments at Walters Middle School

Walters Middle School serves about 860 students in Walters, Maryland, a "bedroom community" for the cities of Baltimore and Washington, D.C. Walters students come from middle and upper-middle class backgrounds and are predominantly white (96 percent), other students being African American (2 percent), Hispanic (2 percent), and Asian American (1 percent). In the first years of MSPAP, students at the school have scored higher than both state and district averages.

Teachers and students at Walters in recent years have begun working not only with MSPAP but also with their school district's own performance assessment system, the Criterion-Referenced Evaluation System (CRES). CRES was developed to support the district's newly restructured Essential Curriculum. The Essential Curriculum incorporates subject area goals, individual course objectives, and five interdisciplinary outcome Learner Behaviors: effective communication skills; problem solving and critical thinking; social cooperation and self-discipline; responsible citizenship in the community and environment; and lifelong learning. CRES is intended to monitor students' achievement of these Learner Behaviors and mastery of the Essential Curriculum.

CRES requires students to respond to open-ended types of performance tasks that assess logic, reasoning, and comprehension skills, as well as content mastery. The system consists of both formative (on-going) assessments and end-of-the-year summative assessments.

Formative assessments, which are used at the teacher's discretion, include a variety of extended response tasks intended to provide teachers with information about student progress so as to foster the adjustment of instruction and curriculum and the re-teaching of skills over the course of the school year.

The end-of-year summative CRES assessments, required of students at all grade levels, are intended to provide information to improve instruction, help teachers measure whether the Essential Curriculum has been learned, and provide helpful information to school improvement teams. The CRES summative exams are scored by individual classroom teachers using rubrics and anchor papers to guide the scoring process.

Impact of MSPAP and CRES

According to the district administrators, the state-initiated MSPAP and the district-initiated CRES are intended to meet similar purposes, and the content and process skills they tap are also similar. For these reasons, district administrators say they are using the MSPAP, in part, as an external validation of their own CRES system, and, indeed, relative trends in MSPAP and CRES scores at the district's schools have proven remarkably parallel. However, the administrators harbor concerns about the implementation of MSPAP.

In fact, the implementation of two similar assessment systems is taking a toll on Walters teachers. Teachers — and administrators and school board members as well — expressed concerns about valuable instructional time lost to administering the two assessments, and a number of teachers complained about the large amounts of time required to set up the MSPAP group experiment components. One district administrator acknowledged that "...there is a great need to find some relief for teachers" from the labor-intensive nature of performance assessments and to add additional staff development days to the school calendar.

Furthermore, from teachers' perspective, the value added of MSPAP is unclear, and they suggest that the impact MSPAP has had on their teaching practices has been, at most, marginal. This is because MSPAP currently is limited to just three grade levels, whereas students at every grade level take CRES. In addition, CRES assessments are administered and scored by classroom teachers, and individual students' results are available immediately. On the other hand, MSPAP assessments are not scored at the local school level, and the scores have not been received by schools until at least six months after the assessments were administered.

For these reasons, though they believe that MSPAP is superior to multiple-choice testing, Walters teachers identify more instructional value in their own CRES assessments than in the state's assessment. Individuals interviewed for this study suggested that CRES has encouraged teachers "...to think carefully about what they want students to know..." and "...to bring their own critical thinking skills to bear on the teaching styles and methods they are familiar with."

Though their teachers are concerned about the amount of time MSPAP and CRES take, students at Walters Middle School expressed enthusiasm for both performance assessments. Students noted in particular that they enjoyed the group work and experimentation and the opportunity and encouragement to express their personal opinions. Walters teachers and parents also said they believe MSPAP and CRES assessment tasks are "more interesting" and "more relevant" to students than are standard testing formats, and both groups feel that the new tests motivate students to do better work. Several teachers, however, expressed their concerns that both MSPAP and CRES tasks present too great a challenge for students with disabilities.

Parents interviewed for this study were strongly supportive of CRES, which they said provides clear information about the academic standards being set for their children and about their children's strengths and weaknesses with respect to those standards. These parents also indicated that they had received enough information about both the CRES and MSPAP assessments to inform them of assessment content and goals.

Summary

At Walters Middle School, MSPAP joins a district-initiated performance assessment quite similar in format, purpose, content, and skill coverage. Thus, at this school, it seems that perceived (and perhaps actual) redundancy among assessments has resulted in a limited impact of the state assessment on teachers' instructional strategies. However, despite concerns about redundancy and time demands of the two assessments, overall sentiment within the Walters community seems to suggest that MSPAP, as well as CRES, not only represents a departure from old ideas and policies no longer in the best interest of students, but also offers a glimpse of the future of education.

Arizona Assessment Reform: Manzanita High School

Site Visit Dates: April 27-29, 1994

During the late 1980s, the Arizona legislature became concerned about the quality of public education in the state. In May of 1990, through the collaborative effort of Arizona's Joint Legislative Committee on Goals for Education Excellence, the State Board of Education, and the Arizona Department of Education, Governor Mofford signed into law the Arizona Student Assessment Program (ASAP), a program designed to measure students' progress toward attaining the state's curriculum standards, called Essential Skills.

The Arizona Department of Education (ADE) contracted with the Riverside Publishing Company to develop the ASAP performance assessments. The assessments were piloted in all school districts during the 1991-92 school year. (An evaluation of the validity and reliability of ASAP performance assessments was conducted by Riverside, but their report was not available.) The first full-scale administration of the ASAP was conducted in March 1993.

Arizona Student Assessment Program

ASAP tests all 3rd-, 8th-, and 12th-grade students who are not exempt (by virtue of provisions in their IEPs) from the program in the areas of reading, writing, and mathematics. The assessment system consists of an on-demand performance event during which students are asked to construct responses to a variety of tasks. The assessment at each grade level maintains a single theme across the three subject areas. For instance, in 1994 the 12th-grade theme focused on "consumer decisions." Students are allowed two hours to complete each section of the assessment.

The rubrics used to score the tasks are tailored to each individual assessment task and are printed in the response booklets for students to refer to as needed. A small percentage of the assessments are scored by Arizona teachers, with the rest being scored by Measurement, Inc.

Resource and Professional Support

The Arizona legislature allocated no new funds to support the development and implementation of ASAP. All development, training, and administration costs were supported under existing ADE testing budgets.

In the fall of 1990, Arizona conducted a state-wide conference to introduce representatives from each district to ASAP. Over the course of the following school year, ADE officials traveled to regional sites to work with teachers, conducted a "trainer of trainers" seminar to teach scoring, and produced a videotape about ASAP for distribution to all districts. ADE also publishes a quarterly newsletter, Measuring Up, that updates teachers and administrators about developments in ASAP.

ASAP and Other Assessments at Manzanita High School

Manzanita High School is located in suburban Phoenix. The district was the highest performing district on the 12th-grade component of ASAP in both 1993 and 1994. One of the smaller high schools in its district, Manzanita served 1,035 9th through 12th graders during the 1993-94 school year. These students were white (78 percent), Hispanic (15 percent), African American (4 percent), Asian American (2 percent), and Native American (1 percent).

Manzanita's district has an extensive and complex assessment program of its own. Through this program, the district administers multiple-choice, criterion-referenced tests — both pre-tests and post-tests — in all subject areas at all grade levels (9 through 12; the district is comprised only of high schools). In addition, the district administers performance assessments in several subject areas and intends to expand its use of performance assessments over the next few years to include all subjects. The centerpiece of the district's performance assessment program is the "multiparagraph essay" assessment required of all students each year. Thus, in this district, ASAP joined what was already a demanding, time-consuming assessment program.

Impact on Teachers, Students, and Parents

Although most Manzanita teachers support both the concept of performance assessment and many of its manifestations in the district's assessment program, they are, to date, less than enthusiastic about ASAP. They are critical of the current ASAP assessment instruments because they do not believe that the instruments cover the curriculum they are teaching. Even conceding that ASAP aims only to "audit" students' progress toward a subset of the state's Essential Skills, teachers dispute the validity of the instruments. One mathematics teacher said, "The 12th-grade math assessment covers only basic math skills (adding, subtracting, multiplying, and dividing). In 1994 one problem required that students graph a line, but no other 1994 and no 1993 problems required students to perform any algebra, geometry, trigonometry, or calculus."

Manzanita teachers who have participated in ASAP scoring sessions also report difficulty in using the rubrics, finding them to be too general. Furthermore, teachers assert that the combined effect of ASAP and the district's extensive assessment program is to take a lot of instructional time out of the school year.

Students, too, are critical of their experience with ASAP. Two honors students remarked that the essay they were asked to read for the ASAP reading assessment was poorly written, illogical, and contradictory. These two students did not find the ASAP challenging, and they suggested that the assessments might have been appropriate for students at lower performance levels than their own. Another student, who struggles with his school work, failed to see any relevance of ASAP to his life: "It's stupid. We don't need to do that stuff." By contrast, one student, bound for a military career, said that, while he did not enjoy the ASAP, he could see the relevance of the skills it tested: "Writing is hard for me, but it's relevant to my future. I'll have to write a lot of reports and things like that."

One parent interviewed knew that her daughter had taken the ASAP exam, but she did not know much about the examination itself. The other parent, a member of Manzanita's site-based planning team, was better informed about ASAP. In her opinion, ASAP is a "minimum competency exam" that tells her nothing about her children's performance that she doesn't learn from their grades and their performance on Advanced Placement exams.

Future Plans

The Arizona Department of Education had originally intended to add performance assessments in science and social studies to ASAP during the next couple of years and to make satisfactory performance on the 12th-grade ASAP a requirement for graduation. The graduation requirement was to have gone into effect in 1996, but the ADE has since extended the time line and now plans to institute ASAP as a graduation requirement for the class of 2004.

This expansion of ASAP is still officially part of the ADE's plan. However, in January of 1995, the new state superintendent, citing technical problems with the assessments, temporarily postponed ASAP in order that the program in its current form could be revisited and evaluated. Consequently, the program was not administered in the winter of 1995. The superintendent intends to reinstate ASAP after the program has been reviewed and, in all likelihood, modified.

