Archived Information

School Reform and Student Diversity - September 1995

J. Assessment of the Outcomes of the Reform

The process of assessment--the gathering and interpretation of data about the knowledge and achievement of students--must be appropriate to those being assessed. Earlier chapters of this report suggested an important finding from the study: Comparable data to assess the impact of the reforms on student achievement across the sites are not available. The lack of data to assess the impact of reforms on the achievement of LEP students stems from both technical and logistical issues. This chapter first examines some of the technical issues around the use of standardized tests to assess LEP students’ achievement and then looks more specifically at the assessment issues faced by the exemplary schools in our study.

Technical Issues in Assessing LEP Students

Calls for reform of schools have been linked to an emphasis on holding schools accountable for improved student outcomes. Historically, standardized tests have been used in large-scale assessment systems to determine the effectiveness of educational programs, schools, and school districts. Many who advocate fundamental school reform argue that students must be held to higher standards and that those standards should be linked to performance assessment systems as a more effective way of measuring student achievement than traditional standardized tests.1

The use of standardized tests to measure the achievement of LEP students has raised serious concerns about whether standardized tests in English can provide accurate assessments of LEP students’ level of academic achievement. One set of concerns centers on whether the tests have been validated and normed for similar students. Valid inferences from standardized tests can be drawn only for the population for which the test has been validated and normed. Without such validation and norming with a particular group of LEP students (or former LEP students), scores obtained from standardized tests are not likely to reflect accurately the achievement of those students.2 Typically, standardized tests are not normed with LEP students or with former LEP students, calling into question the validity of using such tests for LEP or former LEP students.

A second set of issues relates to the ability of standardized tests to disentangle LEP students’ mastery of content from their mastery of academic English. When LEP students are tested on English standardized tests, they, not surprisingly, do less well than their English-only counterparts. Standardized, norm-referenced English language achievement tests do not allow us to sort out LEP students’ grasp of content (mathematics, for example) from their competence in English.

Assessment in content areas should allow students to demonstrate their mastery of the content being assessed; however, assessing students in English while they are still learning English does not provide an unambiguous assessment of their mastery of content. A standardized test administered in English measures both the students’ mastery of content and their mastery of English. The score resulting from such an assessment confounds the two and is likely to underestimate the students’ mastery of content.3

Assessment at the Exemplary Schools

The exemplary schools were committed to providing high quality and challenging content to LEP students--the same high quality content that they provided to their English-dominant students. Individual schools developed school-based methods of assessing student mastery of curricular content and used the results of those assessments to improve their programs. Including LEP students in a comprehensive statewide or districtwide system of assessment of academic achievement proved more problematic. The technical issues described above--the appropriateness of standardized tests for measuring LEP student achievement and the confounding of English language mastery with content mastery for LEP students--were recognized by the exemplary sites.

In this study, we were limited to collecting available data from our exemplary sites. We were unable to test students on comparable instruments or to track their progress over time. Available data had a number of limitations, including the lack of academic achievement data for most LEP students. In most cases, LEP students were not tested using standardized tests for academic achievement because such tests were only available in English. In cases where LEP students were tested in English, school staff were unable to disentangle whether the tests measured English language proficiency or content mastery.

We did, however, learn a great deal about assessment of LEP student progress toward English language fluency and the ways in which schools, districts, and states were assessing academic achievement. The remainder of this chapter examines these two critical assessment issues.

State and District Level Assessment of English Language Development

Each of the states had policies for the identification of LEP students, and each had a definition of a LEP student detailed in legislation and regulation. The definitions were very similar; the Massachusetts language provides a good illustration of how states defined a LEP student:

(a) children who were not born in the United States, whose native tongue is a language other than English and who are incapable of performing ordinary classwork in English; and

(b) children who were born in the United States of non-English speaking parents and who are incapable of performing ordinary classwork in English.

Massachusetts General Law, Chapter 71A

States used a combination of methods to determine whether students fit the definition of limited English proficient. In California, Illinois, and Texas, for example, all parents completed a home language survey upon enrolling their child in school. In Massachusetts, parents and students were interviewed about the languages used in the home and the student’s language use. In addition, every state required that districts use one of the standard measures of English language fluency--the Language Assessment Scales (LAS), the Bilingual Syntax Measure (BSM), or the IDEA Oral Language Proficiency Test (IPT). On a yearly basis, states required districts, which in turn required schools, to assess LEP students’ progress toward full fluency. In Texas, for example, a mandated Language Proficiency Assessment Committee (LPAC) at each school monitored LEP students’ yearly progress toward fluency. The LPAC was composed of the principal, a bilingual teacher, an ESL teacher, and a parent; the committee met as necessary over the course of the year to review student records. In Massachusetts, a district representative helped a site-based committee review each student’s progress toward fluency.

Reclassification. Each of the states established criteria to determine when a student became fully proficient in English. Although the criteria differed in specifics, each state required that students reach the level of full fluency on one of the standard measures described above. In addition, students were required to score at a certain percentile level on a test of academic achievement given in English.

The acceptable tests and percentile scores varied from state to state. In California, students needed to score at the 35th percentile level on the CTBS. In Texas, students had to score at the 40th percentile on one of several norm-referenced tests of achievement, or pass the reading portion of the state performance-based test, the Texas Assessment of Academic Skills (TAAS). In Illinois, students needed to score at the 5th stanine (which corresponds to the 40th percentile) on the Iowa Test of Basic Skills (ITBS). In Massachusetts, once a student had been in a program for three years, a committee composed of the ESL teacher, primary language teacher, principal, and a district representative reviewed student records.
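The criteria above amount to a two-part decision rule: full fluency on one of the standard oral measures, plus a minimum percentile on an accepted English achievement test. A minimal sketch of that logic follows; the percentile thresholds come from the text above, but the function and its names are illustrative, not part of any state's actual procedure (Massachusetts is omitted because its rule was a committee review rather than a fixed cutoff):

```python
# Illustrative sketch of the reclassification rules described above.
# Thresholds are from the report; the function itself is hypothetical.
STATE_THRESHOLDS = {
    "California": 35,  # 35th percentile on the CTBS
    "Texas": 40,       # 40th percentile on a norm-referenced test
    "Illinois": 40,    # 5th stanine on the ITBS, roughly the 40th percentile
}

def eligible_for_reclassification(state, fully_fluent, percentile,
                                  passed_taas_reading=False):
    """Return True if a student meets the state's exit criteria.

    fully_fluent: reached the top level on a standard oral measure
                  (LAS, BSM, or IPT).
    percentile:   score on the state's accepted achievement test.
    """
    # Both conditions were required: fluency first, then the test score.
    if not fully_fluent:
        return False
    # Texas also accepted a pass on the TAAS reading section.
    if state == "Texas" and passed_taas_reading:
        return True
    return percentile >= STATE_THRESHOLDS[state]
```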

School Level Assessment of English Language Fluency

Teachers in the exemplary schools monitored LEP students’ mastery of English in a number of ways. They tracked students’ ability to speak, read, and write in English and used that information to modify classroom experiences.

Three of the exemplary elementary schools--Del Norte, Linda Vista’s bilingual program, and Hollibrook--had programs designed to develop English language mastery over several years. Students moved from grade to grade with an increasing proportion of their instruction in English. Transition to English was gradual, and teachers could individualize each student’s transition. At the fourth elementary school, Inter-American, the dual language program also used a gradual transition process. In addition, the program had Spanish language development goals for students who entered as English-only speakers. Each of these schools had a process for redesignating students as fully English proficient, but this typically did not affect classroom placement. Students remained in the same classroom setting, which included both LEP students and students who had been reclassified as fully English proficient.

In Linda Vista’s sheltered English program, students moved from one level of the English language development program to the next as they mastered the necessary learning outcomes. Students could move from Entry level, to Sheltered B, to Sheltered A, to Transition B, to Transition A at any point during the year as they mastered the previous level. LEP student classroom placement was a direct result of their mastery of English. Teachers who believed a student was prepared to transition to the next level presented his or her case to the LEP Review Panel. Teachers presented a portfolio including work samples and other evidence of performance that indicated that the student had met the standards for his or her current level.
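The placement process described above behaves like an ordered ladder with a review gate at each rung. A hypothetical sketch, using the level names from the report (the function itself is illustrative, not the school's procedure):

```python
# Levels in Linda Vista's sheltered English program, in order of progression.
LEVELS = ["Entry", "Sheltered B", "Sheltered A", "Transition B", "Transition A"]

def advance(current_level, panel_approved):
    """Move a student up one level if the LEP Review Panel approved the
    teacher's portfolio evidence; otherwise keep the current placement."""
    i = LEVELS.index(current_level)
    if panel_approved and i < len(LEVELS) - 1:
        return LEVELS[i + 1]
    return current_level
```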

The exemplary middle schools faced a different set of challenges. They had to provide middle school content to newcomer LEP students, to students who had been enrolled in an elementary school LEP program but were not yet ready to transition to an all-English environment, and to students whose parents wanted them to maintain their Spanish language literacy.

Schools responded to these challenges in several ways. The schools with formal newcomer programs--Wiggs and Horace Mann’s Cantonese bilingual program--expected that a student would remain in the newcomer program for only one year. They structured the program to use English and their students’ native languages so the transition to an all-English instructional environment could be made quickly. At Graham and Parks, the program was structured to allow students to make the transition to English at a more developmental pace.

In Horace Mann’s Spanish bilingual program, a few students were newcomers, but most had either been enrolled in a LEP program at the elementary level or were enrolled to maintain their Spanish literacy. Some who were enrolled for Spanish literacy were native Spanish speakers; others were English speakers who had been enrolled in the district’s Spanish dual immersion program. Because of the program’s goal of maintenance of Spanish literacy, students were not necessarily reassigned once they reached fluency in English.

State-level Standardized Academic Achievement Measures

Each of the four states housing the exemplary schools had an accountability system that assessed the level of student achievement of academic goals. In accordance with the national movement towards performance-based assessment, several of these states had recently revised or were in the process of revising their state-level assessments to be more performance-based. Table J-1 contains an overview of achievement testing in each of the four states.

Table J-1
State Academic Assessment Systems, 1993-94

Name of test:
  California: California Learning Assessment System (CLAS)5
  Illinois: Illinois Goals Assessment Program (IGAP)4
  Massachusetts: Massachusetts Educational Assessment Program (MEAP)
  Texas: Texas Assessment of Academic Skills (TAAS)

Grade levels:
  California: 3, 4, 8, 10
  Illinois: 3, 6, 8, 10
  Massachusetts: 4, 8, 10
  Texas: 3-8, 10

When LEP students tested:
  All four states: after 3 years in school program

Spanish version of test available:
  California: in development6
  Illinois: no
  Massachusetts: no
  Texas: planned for Spring 19957

Two of the four states--Texas and California8--had moved to a performance-based assessment system. The Texas Assessment of Academic Skills (TAAS) was designed to provide students the opportunity to demonstrate knowledge and higher-order thinking skills. The fourth grade science test, for example, was designed with these goals in mind.

The Massachusetts Educational Assessment Program (MEAP) was designed to provide information to schools and districts so that they could identify areas that needed improvement; no individual student scores were reported.9 The Illinois Goal Assessment Program (IGAP) was linked to statewide grade-level goals for student performance in each content area (language arts, science, social studies, and mathematics). The Program also mandated a local assessment component in which schools were to set their own goals and select or develop complementary assessment tools.

Typically, LEP students were not tested in English until they met some state-established criterion; as shown in Table J-1, the criterion in each case was expressed in terms of length of student enrollment in the program. Each of the states directed that students be tested in English on the standardized state assessment after being enrolled in a program for three years.

At the time of our study, Texas was in the process of developing a Spanish version of the TAAS, but in the meantime the three-year rule remained in force. In Illinois, where the district was required by the state to choose and administer an additional assessment instrument, the Chicago Public Schools had chosen to administer the Iowa Test of Basic Skills (ITBS) to its students. The district had selected La Prueba--a Spanish-language assessment--to administer to Spanish-speaking LEP students. Inter-American administered La Prueba beginning in third grade to students who began as either Spanish- or English-dominant. In Massachusetts, students were not tested during their first three years in a program unless their parents requested that they be tested.

Data from state-level assessments were for the most part unavailable for LEP students in our exemplary schools for a number of reasons. First, many LEP students had not been tested using English language assessments because students were not tested for the first three years they were enrolled at the school and in the language development program. For schools with a large influx of immigrants, the three-year rule excluded many of their LEP students from testing. This was the situation at Graham and Parks, Linda Vista, Hanshaw, and Wiggs.

The population of LEP students was transient in many of the exemplary schools. Students moved from school to school within the same district and from district to district. The impact of student transiency on the appropriateness of the measure to determine student achievement was similar to the impact of recent immigration. Although transient students might have been enrolled in schools in the United States for more than three years, the program they had been enrolled in was not necessarily the exemplary program we were examining. This was particularly true at Horace Mann and at Hollibrook.

Inter-American had a stable LEP population and most of their LEP (and English-dominant) students had been enrolled at the school since kindergarten. Due to this stability, Inter-American provided the richest source of data on student achievement. Because the program was dual immersion, all students had the benefit of enrollment in the same program. Table J-2 provides data from Inter-American’s 1993 Illinois Goals Assessment Program (IGAP) test at grade six.

Table J-2
Mean Scores of Inter-American 6th Graders on IGAP,
Compared to the District and State

Subject Area (scale)       School (n = 64)     District     State

Reading (0-500)
Mathematics (0-500)
Writing (6-32)
The data show that Inter-American sixth graders performed better on average on the IGAP than did students districtwide. They did not perform as well on average as did students in the remainder of the state, except on the writing assessment. The sixth grade performance was typical of the performance at other grade levels: students outperformed their district but not their state counterparts.

Del Norte Heights Elementary School also had a relatively stable student population. Most of the students in upper grades were at the school long enough to be tested on TAAS and for their scores to reflect the quality of Del Norte’s program. Table J-3 shows the percentage of Del Norte Heights’ 4th graders who passed each section of the 1994 TAAS test as well as the percentage that passed all tests.

Table J-3
Percentage of Del Norte Heights 4th Graders Passing TAAS,
Compared to the District and State

Test Section       School     District     State

All Tests

As Table J-3 shows, Del Norte’s 4th graders outperformed their counterparts in both the district and the state. About half of the students who entered Del Norte at kindergarten were LEP, and of those most were reclassified by the end of fourth grade. The test scores shown above include the scores of these reclassified FEP students, of LEP students who had been in the school for more than three years, and of English-dominant students. Separate scores were not compiled for LEP, FEP, and English-only students.

School-level Assessment of Achievement

The exemplary schools assessed their students’ progress in the classroom in a number of ways. Staff at many schools had participated in workshops in the use of authentic assessment methods and many were beginning to use those methods in their classrooms. Several schools had implemented portfolio assessment systems; however, for the most part the schools relied on traditional grades as a means of reporting student progress.

Classroom teachers monitored student progress in mastering the curriculum through teacher-developed tests. Many teachers reported that they focused on providing students the opportunity to demonstrate higher-order thinking and problem-solving skills, rather than just regurgitating facts or memorized material. Students were required to complete challenging essay and short-answer examinations that demonstrated their ability to analyze and synthesize what they had learned. Teachers saw classroom assessment as an integral part of the teaching/learning process and used the results to adjust their teaching methods and presentation of material.

Teachers at the exemplary schools used performance-based assessment methods to evaluate student performance. Students wrote, edited, and published book reports and essays, and performed other writing tasks. They gave constructive feedback to fellow students and received it in turn, and they maintained journals in which they analyzed reading assignments. They performed complex science experiments that allowed them to discover scientific principles, and then completed writing assignments that extended the lessons. They worked in groups on complex learning activities that required each member of the group to contribute to the completed assignment.

Assignments at the exemplary schools required students to extend their learning beyond completing worksheets. While practicing skills was clearly important, teachers tried to create opportunities for practice that challenged students to think and problem-solve. Teachers relied on portfolio systems as a way to gather student work over the course of the year. Students selected work to include in their portfolios and had opportunities to review past assignments to see their own progress throughout the year. Parents could review their child’s portfolio when meeting with teachers, seeing examples of particular strengths and weaknesses and following their child’s progress over the grading period.

Moving away from traditional methods of assessing student knowledge presented special challenges for teachers. Planning assessments that challenged students to use higher-order thinking and analytical skills required a considerable amount of teacher preparation. Assessing the results of work performed in groups with integrated group products was difficult for some teachers. Because of the challenges of alternative assessment, teachers engaged in professional development activities to help them develop meaningful assessment tools and learn how to use results. Some schools were striving to develop a comprehensive authentic assessment system; in these cases, schools sought external expertise to assist them in their efforts.

The next section describes the computerized portfolio assessment system in place at Linda Vista Elementary School, which had the most well-developed alternative assessment system among the exemplary schools.

Linda Vista’s Alternative Assessment Program. Linda Vista School’s program was ungraded and age-appropriate. The school developed an authentic assessment process for assessing both students’ academic skills and their working and learning styles. The portfolio assessment system was built around a series of specific learning outcomes or standards. These outcomes established expectations and requirements for students at each of the school’s developmental levels: early childhood, primary, middle, and upper. Within each level the outcomes and requirements were linked to student placement in the English language development program: Spanish bilingual, Sheltered A, Sheltered B, Transition A, Transition B, and English proficient. For example, the middle and upper levels each had separate sets of specific learning outcomes for Spanish, Sheltered A, Sheltered B, Transition A, Transition B, and English--six sets of learning outcomes per level. (At the early childhood and primary levels there were fewer options for language development program placement.)

Assessment rubrics (descriptions of student performance) were created for each developmental level for oral language, reading, and written language. Rubrics built on each other as the student progressed through the developmental levels. Table J-4 displays a complete set of the language arts learning outcomes for Upper Level (grades 5 and 6) Transition B and selected rubrics for the Upper Level. The complete rubrics each had five or six categories--for written language the categories included Pre-writer, Emergent Writer, Developing Writer, Moderately Experienced Writer, Experienced Writer, and Exceptionally Experienced Writer. Rubrics had been developed for each language arts area. School staff had also developed anchor papers that illustrated what work at each rubric level looked like in practice.

Table J-4
Linda Vista Learning Outcomes and Rubrics
Upper Level: Transition B

Oral Language

Learning Outcomes:

• Speak clearly and loudly
• Use complete sentences
• Address audience and purpose
• Summarize briefly core literature currently used in classroom

Rubric Examples:

3. Developing listener-speaker
• is an experienced speaker, usually attentive
• occasionally takes part in class activities
• makes relevant responses
• expresses ideas in complete sentences

5. Exceptional listener-speaker
• is confident, effective, attentive
• actively takes part in class activities, consistently in leadership role

Portfolio Materials:

• Tape students twice a year reading various types of samples

Formal Reading

Learning Outcomes:

• Follow directions
• Note important details
• Sequence correctly
• Identify main ideas, supporting details, paragraph topics
• Draw conclusions
• Predict outcomes
• Use context clues
• Recognize cause and effect
• Categorize orally and in writing
• Thematic approach to topic for language arts by use of fiction, nonfiction, biography, mystery, poetry

Rubric Examples:

3. Moderately experienced reader
• is developing fluency as a reader and reads some books with confidence
• is usually most comfortable reading short books with simple narratives
• relies on re-reading favorite or familiar books
• needs help with reading in the content areas, especially using reference and information books

5. Exceptionally experienced reader
• a self-motivated, confident reader who pursues his/her own interests through reading
• capable of reading in all content areas and of locating and drawing on a variety of resources to research a topic independently
• is able to evaluate evidence drawn from a variety of sources
• is developing critical awareness as a reader

Written Language

Learning Outcomes:

• Communicate ideas clearly and fluently in writing
• Address audience and purpose
• Edit for punctuation, grammar, and capitalization
• Write five-part directions
• Summarize written and oral material
• Write multiple-format book report
• Write poetry, limericks
• Write 1½ pages in daily journal

Rubric Examples:

5. Experienced writer
• is a self-motivated and confident writer who uses a wide range of techniques to engage the reader
• collection of work demonstrates:
  • clear organization
  • use of descriptive words
  • complete, varied sentences
  • selection of vocabulary appropriate for the writing
  • beginning to make revisions
  • few errors in convention and spelling

Portfolio Materials:

• Three student-selected samples of writing that demonstrate the entire writing process
• Three book reports with six parts (title, author, setting, character, summary, opinion), collected in the first, second, and third trimesters
• Three journal writing samples, collected during the first, second, and third trimesters

Student work was scanned and stored in a computer file along with the teacher’s application of the appropriate rubric. Thus, a student’s current portfolio could be shared with parents and maintained as a part of his or her permanent portfolio for future teachers to review.


A school’s inability to systematically measure LEP students’ academic achievement relative to other students in the school and in the state is a serious issue that needs attention at the state and federal levels. At the school level, schools were beginning to adopt alternative assessment measures, including portfolios and performance-based assessment. Teachers viewed assessment as an integral part of the teaching and learning process. The exemplary schools monitored students’ fluency in English and their progress toward content mastery, adjusting their curriculum in response to student needs.

1 For a more complete discussion of performance assessment linked to standards, see Resnick and Resnick, 1991; Madaus, 1993; and Mitchell, 1992.

2 Discussions of the equity issues surrounding standardized tests for LEP students can be found in Durán, 1989; Geisinger, 1992; Gandara and Merino, 1993; Haladyna, 1992; and LaCelle-Peterson and Rivera, 1994.

3 LaCelle-Peterson and Rivera, 1994.

4 Illinois also required that districts adopt an assessment instrument to measure learning objectives. Chicago used the Iowa Test of Basic Skills (ITBS) with its students. As an alternative to ITBS, La Prueba was used for Inter-American LEP students. The school also used La Prueba with its students whose first language was English after they had been enrolled in the dual immersion program for three years.

5 CLAS was administered in Spring 1993, the year preceding the fieldwork for this study. Since then, the test has been canceled and a new system is being developed.

6 California was in the process of developing a Spanish version of CLAS.

7 A Spanish version of TAAS is to be administered in Spring 1995 at some grade levels.

8 California's CLAS test was suspended in 1994 after being administered twice. A new assessment system will be developed.

9 MEAP was in place at the time fieldwork for this study was conducted. However, a new assessment system is being developed in Massachusetts; it is scheduled to begin in 1997.
