The major objectives of the three-year longitudinal study, Studies of Education Reform: Assessment of Student Performance, are as follows:[2]
Our ultimate purpose in this study was to elucidate the status of assessment reform in U.S. education systems and to offer recommendations for policy and future research.
Our approach to meeting the three objectives outlined above was shaped by our conceptualization of the relationships among the factors that drive and affect assessment reform and its outcomes. We conceptualized the key characteristics of performance assessments and the facilitators and barriers in assessment reform as interdependent variables that influence teaching and learning processes and student achievement. Exhibit 2-1 illustrates our conceptual scheme.
Our research design employed a qualitative, case-study approach to collecting data about performance assessments and their impacts at the school level. In addition, during the course of the study, we collected a library of policy, research, historical, and other documents on performance assessments, assessment reform, and education reform in general. Below, we describe the following aspects of our research design:
We conclude by pointing out the strengths and weaknesses of the research design employed with respect to meeting the purposes of the study.
Collection and Analysis of Background Literature
We collected the following types of documents continuously over the entire span of our study:
We culled relevant information from these documents in a literature review at the beginning of the project, in the Fall of 1992, and again in the Spring of 1995, towards the close of the project. We also used these papers to deepen our understanding of the issues related to the development, implementation, and impact of performance assessments and to inform our analyses of the case-study data.
Qualitative Research Methodology: A Case Study Approach
We employed a qualitative, case-study methodology to investigate the development and implementation of performance assessments and their impacts at the school level. We designed a modified time-series approach for gathering data, which enabled us to obtain both cross-sectional and longitudinal data. Cross-sectional data allowed us to make comparisons across assessments and school contexts. The longitudinal data allowed us to document the effects of and changes in performance assessments over time within sites.
We selected 16 sites (for this study, a "site" encompasses both a performance assessment and a single school at which it was being used), each of which a team of two researchers visited once during a two-day site visit. We then selected a subset of 7 sites to which the team returned for a second visit; longitudinal data were therefore collected for only 7 of the 16 sites.
We conducted the first set of site visits in the Spring of 1994 and the second set in the Spring of 1995. (Two of the single-visit sites received their first visit in the Spring of 1995.) Exhibit 2-2 shows our site-visit design.
Sample Selection Criteria
As described above, our research design called for two waves of data collection: a first round of visits to all sites included in the sample, followed by a second round of visits to a subset of those sites. Below, we describe the criteria we applied to select sites for inclusion in rounds one and two.
The overarching objective of our site selection process was to identify, insofar as possible, a set of school sites that exhibited the range of experiences American schools are having with the development and implementation of performance assessments. For the purposes of our study, we defined a case study site as a single school where a performance assessment was being implemented. The focus of our research was on the assessment and its implementation in the local context. To select the sites, we delineated two sets of criteria: those pertaining to performance assessments and those pertaining to schools.
Performance assessments are marked by a number of variable characteristics, and we attempted to obtain variation in our sample within each characteristic. Selection criteria pertaining to performance assessment characteristics included:
Although we were less concerned with school background characteristics (e.g., size, racial and ethnic composition, and socio-economic background of the student body) than we were with the assessment characteristics delineated above, we attempted also to obtain variation across two school characteristics:
As the sample for this qualitative study was to include 16 schools, it is clear that not all combinations of the six factors described above could be included (if, indeed, they even exist). Rather, we aimed to create a sample in which the range of characteristics for each of the six criteria was represented.
We chose a subset of seven sites for a second round of data collection. We based selection of the seven round two sites upon one or more of the following criteria:
Sixteen performance assessments at 16 school sites were selected to constitute the study sample. The 16 sites are identified in Exhibit 2-3.
Exhibit 2-4 provides information about the characteristics of each of the 16 sites included in the study sample. As illustrated, these characteristics demonstrate the variation we achieved in our sample with respect to our selection criteria.
Exhibit 2-5 identifies the seven sites we selected for a second visit during round two of our data collection.
Data Collection Activities and Instruments
Because we were interested in obtaining information about the performance assessment, the educational context within which it was developed and implemented, and the assessment's impact at the local level, we collected documentary, phone interview, and site-visit data.
Prior to and during each site visit, we collected background documentary data about the subject assessment. The available data varied across assessments. Types of data collected included:
These data were collected from state and local education officials, school staff members, and representatives of external groups involved in assessment reform (e.g., the New Standards Project). These data were collected throughout the life of the project.
We also collected documentary data about the school sites we visited. These data included reports describing each school's demographic composition, staffing, and financial resources, as well as other relevant documents.
Prior to each site visit, we also conducted initial telephone interviews with cognizant individuals in state and local education offices, the school site, and external assessment reform organizations. We used an interview protocol tailored to the role of the interviewee and to the performance assessment system under investigation.
In the Spring of 1994 we visited the first 14 schools in our sample. In the Spring of 1995 we revisited seven schools and added two new ones to our sample. In total, we conducted 23 site visits (14 first visits in 1994, plus 7 second visits and 2 first visits in 1995).
Each site visit lasted one-and-a-half to two days and was conducted by a team of two researchers. The researchers interviewed a number of individuals, observed classrooms, and, whenever possible, observed professional development sessions devoted to the development and use of performance assessments, administration of performance assessments, and other activities related to the implementation of performance assessments.
Exhibit 2-6 illustrates the roles and numbers of the individuals we interviewed during our first- and second-round site visits.
We used semi-structured interview protocols during our site visits. The protocols for both waves of data-collection were quite similar in structure, but wave two protocols contained more probing questions about the use and effects of performance assessments on teaching and learning. All interview protocols appear in the technical appendix to this report.
Researchers also observed various performance assessment-related activities. Exhibit 2-7 illustrates the types and numbers of assessment-related activities observed.
The analyses of our data progressed in two overlapping phases. The two phases, within-case data analysis and cross-case data analysis, are detailed below. We utilized Qualitative Data Analysis (Miles & Huberman, 1995) as a sourcebook to inform our data analysis methods.
The first phase of our data analysis consisted of writing case studies of our sites. To reiterate, a "site" is defined as a performance assessment and its implementation and impact at one particular school. Data from all sources (documents, interviews, and observations) were synthesized in the case-study report. One member of each site-visit team wrote the case study, and the second member reviewed it for accuracy. Next, the case study was sent to the appropriate officials and school personnel for review and comment. Based upon their feedback, we revised the case study write-ups.
Each case study write-up is divided into the following four sections:[3]
The second phase of our data analysis focused on extracting and reorganizing information from our case study write-ups into a cross-case comparative format. Based upon the case-study data and the theoretical, empirical, and policy papers we collected, we developed a categorization system for each of our major variables: performance assessments, facilitators and barriers, and teaching and learning processes and outcomes. Next, we organized the data from each case study into the categorization system. After the categorization exercise, we identified both common patterns and unique features in our data in order to:
Thus, our cross-case analysis report comprises four sections.
In the first section, we organized the data, where appropriate, according to the organizational level at which the performance assessment system was initiated, developed, or implemented (national, state, district, or school). We developed this organizational scheme to enable us to identify and understand the systematic differences among performance assessments developed and implemented by different levels of education authority.
Similarly, in the second section, we organized the data by the level of initiation, development, or implementation. We isolated and analyzed the organizational factors that influenced the development and implementation of performance assessments at different levels of education authority. In the third section, we organized the data in terms of the facilitators and barriers that affect teachers' ability and willingness to work with the subject performance assessment. Teachers' "appropriation" of the assessment is seen as a necessary prerequisite of meaningful changes in teaching and, hence, learning. These two sections together are keyed to the second study objective.
In the last section (keyed to the third objective), we were interested in the impact of the performance assessment at the local school level. Thus, we organized the data according to different categories of performance assessments, as we wanted to investigate the effects of different types of performance assessments on the teaching and learning processes and outcomes at the local school.
Our approach to data analysis was primarily inductive, and our findings are offered as informed hypotheses that merit further investigation.
A research design such as the one we used has strengths, but it also necessarily imposes certain limitations on the interpretations that can be drawn from the data. We briefly discuss five general limitations of our study. (Specific limitations to our analyses are discussed in the appropriate chapters in this report.)
First, our taxonomy of performance assessments is based upon a limited sample of performance assessments. Although we attempted to obtain a representative sample of performance assessments, we are not certain that the assessments initiated at the district and school levels are, in fact, representative of all district- and school-initiated performance assessments. Hence, our taxonomic scheme may not be accurate and must be viewed as a work in progress.
Second, a comprehensive description and analysis of each of the performance assessments in our sample was not possible. It was beyond the scope of this study to collect the massive amounts of data required for conducting such an analysis.
Third, our findings regarding the facilitators and barriers in assessment reform, especially at the national and state levels, may be less comprehensive than those for the district and school levels. This limitation stems from the local-level emphasis of our study. We collected information regarding national- and state-level assessment reform from documents and from general, as opposed to detailed and probing, interviews. In addition, we did not conduct in-person interviews with state officials and researchers involved in national-level efforts as we did with district- and school-level personnel.
Fourth, our findings regarding the impact of national-, state-, and district-initiated performance assessments are valid only for the schools included in this study; the results obtained for a particular school cannot be generalized to other schools involved with the same performance assessments.
Finally, interviewees' opinions regarding the impact of and problems with performance assessments signal the existence of those impacts and problems, but the absence of such opinions does not necessarily indicate the absence of impacts or problems.
[2] The specific research questions are presented in Appendix A.
[3] The full case studies appear in Studies of Education Reform: Assessment of Student Performance - Volume II: Case Studies. Only summary case studies are presented in this volume.