Evaluating Online Learning: Challenges and Strategies for Success
July 2008

Meeting the Needs of Multiple Stakeholders

Every good evaluation begins with a clearly stated purpose and a specific set of questions to be answered. These questions drive the evaluation approach and help determine the specific data collection techniques that evaluators will use. Sometimes, however, program evaluators find themselves in the difficult position of needing to fulfill several purposes at once, or needing to answer a wide variety of research questions. Most stakeholders have the same basic question—Is it working? But not everyone defines working in the same way. While policymakers may be interested in gains in standardized test scores, program leaders may be equally interested in other indicators of success, such as whether the program is addressing the needs of traditionally underrepresented subgroups, or producing outcomes that are only indirectly related to test scores, like student engagement.

Naturally, these questions will not be answered in the same way. For example, if stakeholders want concrete evidence about the impact on student achievement, evaluators might conduct a randomized controlled trial—the gold standard for assessing program effects—or a quasi-experimental design that compares test scores of program participants with students in matched comparison groups. But if stakeholders want to know, for example, how a program has been implemented across many sites, or why it is leading to particular outcomes, then they might opt for a descriptive study, incorporating such techniques as surveys, focus groups, or observations of program participants to gather qualitative process data.

When multiple stakeholders have differing interests and questions, how can evaluators meet these various expectations?

To satisfy the demands of multiple stakeholders, evaluators often combine formative and summative components (see Glossary of Common Evaluation Terms, p. 65). In the case of Alabama Connecting Classrooms, Educators, & Students Statewide Distance Learning (ACCESS), described below, the evaluators have been very proactive in designing a series of evaluations that, collectively, yield information that has utility both for program improvement and for understanding program performance. In the case of Arizona Virtual Academy (AZVA), the school's leadership team has made the most of the many evaluation activities they are required to complete by using findings from those activities for their own improvement purposes and piggybacking on them with data collection efforts of their own. In each instance, program leaders would likely say that their evaluations are first and foremost intended to improve their programs. Yet, when called upon to show achievement results, they can do that as well.

Combine Formative and Summative Evaluation Approaches to Meet Multiple Demands

From the beginning of their involvement with ACCESS, evaluators from the International Society for Technology in Education (ISTE) have taken a combined summative and formative approach to studying this state-run program that offers both Web-based and interactive videoconferencing courses. The original development proposal for ACCESS included an accountability plan that called for ongoing monitoring of the program to identify areas for improvement and to generate useful information that could be shared with other schools throughout the state. In addition, from the beginning, program leaders and state policymakers expressed interest in gathering data about the program's impact on student learning. To accomplish these multiple goals, ISTE completed two successive evaluations for Alabama, each of which had both formative and summative components. A third evaluation is under way.

The first evaluation, during the program's pilot implementation, focused on providing feedback that could be used to modify the program, if need be, and on generating information to share with Alabama schools. Evaluation activities at this stage included a literature review and observation visits to six randomly selected pilot sites, where evaluators conducted interviews and surveys. They also ran focus groups, for which researchers interviewed respondents in a group setting. Evaluators chose these methods to generate qualitative information about how the pilot program was being implemented and what changes might be needed to strengthen it.

The second evaluation took more of a summative approach, looking to see whether or not ACCESS was meeting its overall objectives. First, evaluators conducted surveys and interviews of students and teachers, as well as interviews with school administrators and personnel at the program's three Regional Support Centers. In addition, they gathered student enrollment and achievement data, statewide course enrollment and completion rates, and other program outcome data, such as the number of new distance courses developed and the number of participating schools.

This second evaluation also used a quasi-experimental design (see Glossary of Common Evaluation Terms, p. 65) to provide information on program effects. Evaluators compared achievement outcomes between ACCESS participants and students statewide, between students in interactive videoconferencing courses and students in traditional settings, and between students who participated in online courses and those who took courses offered in the interactive videoconferencing format.
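To make the idea of a group comparison concrete, the following is a minimal sketch, not the method ISTE's evaluators used. It computes the raw difference in mean scores between two groups and a standardized effect size (Cohen's d with a pooled standard deviation). The scores, variable names, and the `cohens_d` helper are all hypothetical, chosen only for illustration. A real quasi-experimental analysis would also need to match or statistically adjust for differences between the groups.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(group_a, group_b):
    """Standardized mean difference between two independent groups,
    using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Hypothetical exam scores on a 1-5 scale (illustrative data only).
participants = [3, 4, 2, 5, 3, 4]   # e.g., students in a distance course
comparison = [3, 3, 2, 4, 3, 4]     # e.g., students in a traditional setting

raw_difference = mean(participants) - mean(comparison)
effect_size = cohens_d(participants, comparison)

print(f"Difference in means: {raw_difference:.2f}")
print(f"Cohen's d: {effect_size:.2f}")
```

An effect size near zero would be consistent with "comparable" outcomes between groups, although with samples this small any such reading would be tentative at best.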

As of early 2008, the evaluators were conducting a third evaluation, integrating the data from the first two studies and focusing on student achievement. Looking ahead, ACCESS leaders plan to continue gathering data annually in anticipation of conducting longitudinal studies that will identify ACCESS's full impact on student progress and achievement.

Together, the carefully planned evaluation activities conducted by ACCESS's evaluators have generated several findings and recommendations that already have been used to strengthen the program. For example, their findings suggest that students participating in the distance learning courses are completing courses at high rates and, in the case of the College Board Advanced Placement (AP) courses,* are achieving scores comparable to students taught in traditional settings. These kinds of data could be critical to maintaining funding and political support for the program in the future. ACCESS also is building a longitudinal database to provide a core of data for use in future evaluations and, program leaders hope, to help determine ACCESS's long-term impact on student progress and achievement. With these data collection tools and processes in place, ACCESS has armed itself to address the needs and expectations of various stakeholders inside and outside the program.

Reasons and Contexts for Formative Versus Summative Evaluations

Formative and summative evaluations can each serve important functions for programs. Formative evaluations, sometimes called "process evaluations," are conducted primarily to find out how a program is being implemented and how it might be strengthened. Summative evaluations, also called "outcome evaluations," are appropriate for better-established programs, when program leaders have settled on their best policies and practices and want to know, for example, what results the program is yielding.

Ideally, formative evaluations are developed as partnerships that give all stakeholders a hand in planning and helping conduct the evaluation. Explicitly framing a formative evaluation as a collaboration among stakeholders can help in more ways than one. Practitioners are more likely to cooperate with and welcome evaluators rather than feel wary or threatened, a common reaction. In addition, practitioners who are invited to be partners in an evaluation are more likely to feel invested in its results and to implement the findings and recommendations.

Even more than formative evaluations, summative evaluations can be perceived by practitioners as threatening and, in many cases, program staff are not eager to welcome evaluators into their midst. Even in these situations, however, their reaction can be mitigated if evaluators work diligently to communicate the evaluation's goals. Evaluators should make clear their intention to provide the program with information that can be used to strengthen it, or to give the program credible data to show funders or other stakeholders. In many cases, summative evaluations do not uncover findings that are unexpected; they merely provide hard data to back up the anecdotes and hunches of program leaders and staff.

Program leaders who are contemplating an evaluation also will want to consider the costs of whatever type of study they choose. Some formative evaluations are relatively informal. For example, a formative evaluation might consist primarily of short-term activities conducted by internal staff, like brief surveys of participants, to gather feedback about different aspects of the program. This type of evaluation is inexpensive and can be ideal for leaders seeking ongoing information to strengthen their program. In other instances, formative evaluation is more structured and formal. For instance, an external evaluator may be hired to observe or interview program participants, or to conduct field surveys and analyze the data. Having an external evaluator can bring increased objectivity, but it also adds cost.

In many cases, summative evaluations are more formal and expensive operations, particularly if they are using experimental or quasi-experimental designs that require increased coordination and management and sophisticated data analysis techniques. Typically, external evaluators conduct summative evaluations, which generally extends the timeline and raises the costs. Still, experimental and quasi-experimental designs may provide the most reliable information about program effects.

Finally, program leaders should consider that an evaluation need not be exclusively formative or summative. As the ACCESS case illustrates (see pp. 7-10), sometimes it is best for programs to combine elements of both, either concurrently or in different years.

Make the Most of Mandatory Program Evaluations

While all leaders and staff of online education programs are likely to want to understand their influence on student learning, some have no choice in the matter. Many online programs must deliver summative student outcome data because a funder or regulatory body demands it. In the case of Arizona Virtual Academy (AZVA), a K-12 statewide public charter school, the program must comply with several mandatory evaluation requirements: First, school leaders are required by the state of Arizona to submit an annual effectiveness review, which is used to determine whether or not the school's charter will be renewed. For this yearly report, AZVA staff must provide data on student enrollment, retention, mobility, and state test performance. The report also must include pupil and parent satisfaction data, which AZVA collects online at the end of each course, and a detailed self-evaluation of operational and administrative efficiency.

AZVA also must answer to K12 Inc., the education company that supplies the program's curriculum for all grade levels. K12 Inc. has its own interest in evaluating how well its curriculum products are working and in ensuring that it is partnered with a high-quality school. AZVA's director, Mary Gifford, says that "from the second you open your school, there is an expectation [on the part of K12 Inc.] that you will collect data, analyze them, and use them to make decisions." "K12 Inc. has established best practices for academic achievement. They take great pride in being a data-driven company," she adds. K12 Inc. conducts quality assurance audits at AZVA approximately every two years, which consist of a site visit conducted by K12 Inc. personnel and an extensive questionnaire, completed by AZVA, that documents various aspects of the program, such as instruction, organizational structure, and parent-school relations. K12 Inc. also requires AZVA to produce a detailed annual School Improvement Plan (SIP), which covers program operations as well as student achievement. The plan must include an analysis of student performance on standardized state tests, including a comparison of the performance of AZVA students to the performance of all students across the state.

Each of these mandates—those of the state and those of AZVA's curriculum provider—has an important purpose. But the multiple requirements add up to what could be seen as a substantial burden for any small organization. AZVA's small central staff chooses to look at it differently. Although the requirements generate year-round work for AZVA employees, they have made the most of these activities by using them for their own purposes, too. Each of the many mandated evaluation activities serves an internal purpose: Staff members pore over test scores, course completion data, and user satisfaction data to determine how they can improve their program. The SIP is used as a guiding document to organize information about what aspects of the program need fixing and to monitor the school's progress toward its stated goals. Although the process is time-consuming, everyone benefits: K12 Inc. is assured that the school is doing what it should, AZVA has a structured approach to improving its program, and it can demonstrate to the state and others that student performance is meeting expectations.

AZVA is able to make the most of its mandatory evaluations because the school's culture supports it: Staff members incorporate data collection and analysis into their everyday responsibilities, rather than viewing them as extra burdens on their workload. Furthermore, AZVA's leaders initiate data collection and analysis efforts of their own. They frequently conduct online surveys of parents to gauge the effectiveness of particular services. More recently, they also have begun to survey teachers about their professional development needs and their satisfaction with the trainings provided to them. "We're trying to do surveys after every single professional development [session], to find out what was most effective," says Gifford. "Do they want more of this, less of this? Was this too much time? Was this enough time? That feedback has been very good." AZVA's K-8 principal, Bridget Schleifer, confirms that teachers' responses to the surveys are taken very seriously. "Whenever a survey comes up and we see a need," she says, "we will definitely put that on the agenda for the next month of professional development."

Together, these many efforts provide AZVA with comprehensive information that helps the school address external accountability demands, while also serving internal program improvement objectives. Just as important, AZVA's various evaluation activities are integrated and support each other. For instance, the SIP is based on the findings from the evaluation activities mandated by the state and K12 Inc., and the latter's audit process includes an update on progress made toward SIP goals. More broadly, the formative evaluation activities help the school leaders to set specific academic goals and develop a plan for reaching them, which ultimately helps them improve the achievement outcomes assessed by the state. One lesson AZVA illustrates is how to make the most of evaluations that are initiated externally by treating every data collection activity as an opportunity to learn something valuable that can serve the program.


As the above program evaluations demonstrate, sometimes the best approach to meeting the needs of multiple stakeholders is being proactive. The steps are straightforward but critical: When considering an evaluation, program leaders should first identify the various stakeholders who will be interested in the evaluation and what they will want to know. They might consider conducting interviews or focus groups to collect this information. Leaders then need to sift through this information and prioritize their assessment goals. They should develop a clear vision for what they want their evaluation to do and work with evaluators to choose an evaluation type that will meet their needs. If it is meant to serve several different stakeholder groups, evaluators and program leaders might decide to conduct a multi-method study that combines formative and summative evaluation activities. They might also consider developing a multiyear evaluation plan that addresses separate goals in different years. In the reporting phase, program leaders and evaluators can consider communicating findings to different stakeholder audiences in ways that are tailored to their needs and interests.

In instances where online programs participate in mandatory evaluations, program leaders should seek to build on these efforts and use them for internal purposes as well. They can leverage the information learned in a summative evaluation to improve the program, acquire funding, or establish the program's credibility. Evaluators can help program leaders piggyback on any mandatory assessment activities by selecting complementary evaluation methods that will provide not just the required data but also information that program staff can use for their own improvement purposes.

*Run by the nonprofit College Board, the Advanced Placement program offers college-level course work to high school students. Many institutions of higher education offer college credits to students who take AP courses.

Last Modified: 10/20/2009