Evaluating Online Learning: Challenges and Strategies for Success
July 2008

Building on the Existing Base of Knowledge

Under normal circumstances, evaluators frequently begin their work by reviewing available research literature. They also may search for a conceptual framework among similar studies or look for existing data collection tools, such as surveys and rubrics, that can be borrowed or adapted. Yet, compared to many other topics in K-12 education, the body of research literature on K-12 online learning is relatively new and narrow. Available descriptive studies are often very specific and offer findings that are not easily generalized to other online programs or resources. Empirical studies are few. Other kinds of tools for evaluators are limited, too. Recent efforts have led to multiple sets of standards for K-12 online learning (see Standards for K-12 Online Learning, p. 13). However, there still are no widely accepted education program outcome measures, making it difficult for evaluators to gauge success relative to other online or traditional programs.

Given the existing base of knowledge on K-12 online learning, how should evaluators proceed? Of course, evaluators will first want to consult the K-12 online learning research that does exist. Although the field is comparatively limited, it is growing each year and already has generated a number of significant resources (see appendix A, p. 59). Among other organizations, the North American Council for Online Learning (NACOL) has developed and collected dozens of publications, research studies, and other resources useful for evaluators of K-12 online learning programs. Evaluators also may want to look to higher education organizations, which have a richer literature on online learning evaluation, including several publications that identify standards and best practices (see appendix A). Some of these resources can be adapted for K-12 settings, but researchers have found that others do not translate well.

Another approach is to develop evaluation tools and techniques from scratch. As described below, this may be as simple as defining and standardizing the outcome measures used among multiple schools or vendors, as the evaluators of Digital Learning Commons did. Or it may be a much more ambitious effort, as when Appleton eSchool's leaders developed a new model for evaluating virtual schools with an online system for compiling evaluation data. Finally, some evaluators respond to a limited knowledge base by adding to it. For example, the evaluators of Louisiana's Algebra I Online program have published their evaluation findings for the benefit of other evaluators and program administrators. In a different but equally helpful fashion, the leaders of Appleton eSchool have contributed to the field by developing a Web site that allows administrators of online programs to share their best practices in a public forum.

Standards for K-12 Online Learning

As online learning in K-12 education has expanded, there has been an effort to begin to codify current best practices into a set of standards that educators can look to in guiding their own performance. Several organizations have released sets of these emerging standards based on practitioner input to date.

In late 2006, the Educational Technology Cooperative of the Southern Regional Education Board (SREB) issued Standards for Quality Online Courses. These standards are available along with other key resources from SREB at http://www.sreb.org/programs/EdTech/SVS/index.asp.

A year later, NACOL published National Standards of Quality for Online Courses, which endorsed SREB's standards and added a few others. The national standards cover six broad topic areas: course content, instructional design, student assessment, technology, course evaluation and management, and 21st-century skills.

NACOL also developed National Standards for Quality Online Teaching in 2008 and is currently working on program standards. The standards for online courses and teaching can be found at http://www.nacol.org along with many other resources.

The National Education Association also has published standards for online courses and online teachers, both available at http://www.nea.org.

Clearly Define Outcome Measures

Evaluators must be able to clearly articulate key program goals, define outcomes that align with them, and then identify specific outcome measures that can be used to track the program's progress in meeting the goals. Presently, however, outcome measures for evaluating online learning programs are not consistently defined, which makes it difficult for stakeholders to gauge a program's success, compare it to other programs, or set improvement goals that are based on the experience of other programs. Furthermore, the lack of consistent outcome measures creates technical headaches for evaluators. A recent article coauthored by Liz Pape, president and chief executive officer of Virtual High School Global Consortium, a nonprofit network of online schools, describes the problem this way:

Although standards for online course and program effectiveness have been identified, data-driven yardsticks for measuring against those standards are not generally agreed upon or in use. There is no general agreement about what to measure and how to measure. Even for measures that most programs use, such as course completion rates, there is variation in the metrics because the online programs that measure course completion rates do not measure in the same manner.3

The evaluators of Washington state's Digital Learning Commons (DLC) encountered just such a problem when attempting to calculate the number of online course-takers served by the program. DLC is a centrally hosted Web portal that offers a wide range of online courses from numerous private vendors. When evaluators from Cohen Research and Evaluation tried to analyze DLC's course-taking and completion rates, they found a range of reporting practices among the vendors: Some tracked student participation throughout the course, while others reported only on the number of students who completed a course and received a final grade. Also, in some cases, vendors did not differentiate between students who withdrew from a course and students who received an F, conflating two different student outcomes that might have distinct implications for program improvement.

Pape et al. note additional problems in the ways that course completion is defined:

When does the measure begin? How is completion defined? Do students have a "no penalty" period of enrollment in the online course during which they may drop from the course and will not be considered when calculating the course completion rate? Is completion defined as a grade of 60 or 65? How are students who withdrew from the course after the "no penalty" period counted, especially if they withdrew with a passing grade?4

Following the recommendations of their evaluators, DLC staff made efforts to communicate definitions of course completion and withdrawal that were internally consistent and made sure each vendor was reporting accurate data based on these conversations. The result was a higher level of consistency and accuracy in data reporting.
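The questions Pape et al. raise can be made concrete by writing the definitions down as explicit rules. The following is a minimal sketch, not DLC's or any vendor's actual method: the record fields, the 14-day "no penalty" window, and the passing grade of 60 are all hypothetical values chosen for illustration. It keeps withdrawals separate from failing grades and excludes early drops from the rate.

```python
from dataclasses import dataclass
from typing import Optional

# All names and thresholds below are hypothetical, chosen for illustration only.
NO_PENALTY_DAYS = 14   # assumed length of the "no penalty" drop period
PASSING_GRADE = 60     # assumed passing threshold (programs may use 60 or 65)


@dataclass
class Enrollment:
    student_id: str
    days_enrolled: int            # days between enrollment and exit or course end
    withdrew: bool                # True if the student left before the course ended
    final_grade: Optional[float]  # None when no final grade was issued


def classify(e: Enrollment) -> str:
    """Assign exactly one outcome category to an enrollment record."""
    if e.withdrew and e.days_enrolled <= NO_PENALTY_DAYS:
        return "dropped_no_penalty"   # excluded from the completion rate
    if e.withdrew:
        return "withdrew"             # kept separate from a failing grade
    if e.final_grade is not None and e.final_grade >= PASSING_GRADE:
        return "completed"
    return "failed"


def completion_rate(enrollments: list[Enrollment]) -> float:
    """Completed enrollments divided by all enrollments that count."""
    outcomes = [classify(e) for e in enrollments]
    counted = [o for o in outcomes if o != "dropped_no_penalty"]
    if not counted:
        return 0.0
    return counted.count("completed") / len(counted)


records = [
    Enrollment("s1", 90, False, 82.0),  # finished with a passing grade
    Enrollment("s2", 10, True, None),   # dropped inside the no-penalty window
    Enrollment("s3", 45, True, 71.0),   # withdrew later, with a passing grade
    Enrollment("s4", 90, False, 48.0),  # finished with a failing grade
]
print(completion_rate(records))
```

Under these assumed rules, a student who withdraws after the window counts against the rate even with a passing grade. A program could reasonably decide otherwise; the point is that the choice is written down once and applied to every vendor's data.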

There is growing attention to the problem of undefined outcome measures in the field of evaluating online learning. A 2004 report by Cathy Cavanaugh et al. specifically recommended that standards be developed "for reporting the academic and programmatic outcomes of distance learning programs."5 A NACOL effort is under way to develop benchmarks for measuring program effectiveness and overall standards for program quality. Meanwhile, the best that evaluators can do is ensure internal consistency in the outcome measures used across all of their own data sources. Although the research base is limited, evaluators may be able to find similar studies for ideas on how to define outcomes. Moving forward, online program evaluators can proactively find ways to share research methods and definitions and reach a consensus on the best ways to measure program effectiveness.

Work Collaboratively to Develop New Evaluation Tools

In the early years of Appleton eSchool, school leaders became aware that they needed an overall evaluation system to determine the school's strengths and weaknesses. After failing to find an existing comprehensive tool that would fit their needs, Ben Vogel, Appleton's principal and Governance Board chair, and Connie Radtke, Appleton's program leader, began to develop their own evaluation process. Their goal was to design an instrument that would identify the core components necessary for students to be successful in an online program. In addition, they wanted to create a process that could be used to prompt dialogue among program leaders, staff, governance board members, and external colleagues about the components of a successful online learning experience, in order to provide direction for future growth and enhancement.

Through extensive internal discussions and consultation with external colleagues, including staff from such online programs as Virtual High School and Florida Virtual School, Vogel and Radtke identified eight key program components and developed a rubric for measuring them called the Online Program Perceiver Instrument (OPPI). (See Key Components of Appleton eSchool's Online Program Perceiver Instrument, p. 16.) Vogel says, "[Our goal was] to figure out those universal core components that are necessary in a K-12 online program to allow students to be successful.… We were able to share [our initial thoughts] with other people in the K-12 realm, and say, 'What are we missing? And what other pieces should we add?' It just kind of developed from there."

The core components identified in the OPPI are what Appleton leaders see as the essential building blocks for supporting students in an online environment. The eight components address the online program user's entire experience, from first learning about the course or program to completing it.

When developing the OPPI, Vogel and Radtke researched many existing online program evaluations in higher education, but found them insufficient for building a comprehensive rubric at the K-12 level. Vogel notes, for example, that having face-to-face mentors or coaches for students taking online courses is critical at the K-12 level, whereas it is not considered so important for older students who are studying online in higher education. To capture this program element, Vogel and Radtke included "Program Support" as a key component in the OPPI rubric, focusing on the training given to mentors (usually parents in the Appleton eSchool model) as well as training for local school contacts who support and coordinate local student access to online courses (see table 2, Excerpt from Appleton eSchool's Online Program Perceiver Instrument, p. 17). To assess mentors' perceptions of program quality, the evaluators surveyed mentors following the completion of each online course.

As part of the OPPI process, Vogel and Radtke developed a three-phase approach to internal evaluation, which they refer to as a "self-discovery" process. In the Discovery Phase, program personnel fill out a report that describes the school's practices in each of the eight areas identified in the OPPI. Then program decision-makers use the rubric to determine what level of program performance is being attained for each element: deficient, developing, proficient, or exemplary. In addition, program leaders e-mail surveys to students, mentors (usually parents), and teachers at the end of each course, giving them an opportunity to comment on the program's performance in each of the eight OPPI areas. In the Outcome Phase, results from the Discovery Phase report and surveys are summarized, generating a numerical rating in each program area. At the same time, information on student outcomes is reviewed, including student grades, grade point averages, and course completion rates. Program decision-makers synthesize all of this data in an outcome sheet and use it to set goals for future growth and development. Finally, in the Sharing of Best Practices Phase, program leaders may select particular practices to share with other programs. Appleton has partnered with other online programs to form the Wisconsin eSchool Network, a consortium of virtual schools that share resources. The Network's Web site includes a Best Practices Portfolio and schools using the OPPI are invited to submit examples from their evaluations.6 Practices that are determined to be in the "proficient" or "exemplary" range are considered for placement in the portfolio, and participating schools are cited for their contributions. The entire evaluation system is Web-based, allowing for streamlined data collection, analysis, and sharing.
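The Outcome Phase step of turning survey responses into a numerical rating for each program area might look something like the sketch below. It is only an illustration under stated assumptions: the four level names come from the description above, but the 1-4 rating scale, the cut points, and the function names are hypothetical and are not taken from the OPPI itself.

```python
from collections import defaultdict
from statistics import mean

# The four performance levels come from the OPPI description; the 1-4 scale
# and the cut points below are hypothetical and not taken from the instrument.
LEVELS = [(3.5, "exemplary"), (2.5, "proficient"), (1.5, "developing"), (0.0, "deficient")]


def level_for(score: float) -> str:
    """Map a mean rating on an assumed 1-4 scale to a performance level."""
    for cutoff, label in LEVELS:
        if score >= cutoff:
            return label
    return "deficient"


def summarize(responses: list[tuple[str, str, int]]) -> dict[str, tuple[float, str]]:
    """Average ratings per program area across all respondent groups.

    Each response is (respondent_group, program_area, rating).
    """
    by_area: dict[str, list[int]] = defaultdict(list)
    for _group, area, rating in responses:
        by_area[area].append(rating)
    return {area: (round(mean(r), 2), level_for(mean(r))) for area, r in by_area.items()}


responses = [
    ("student", "Program Orientation", 4),
    ("mentor", "Program Orientation", 3),
    ("teacher", "Program Orientation", 4),
    ("student", "Program Support", 2),
    ("mentor", "Program Support", 3),
]
for area, (score, level) in summarize(responses).items():
    print(f"{area}: {score} ({level})")
```

Pooling students, mentors, and teachers into one average is a simplification; a program might prefer to report each group separately so that a disagreement between groups stays visible.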

Key Components of Appleton eSchool's Online Program Perceiver Instrument (OPPI)

Practitioners and program administrators use the OPPI to evaluate program performance in eight different areas:

  1. Program Information: System provides and updates information necessary for prospective users to understand the program being offered and determine whether it may be a good fit for students.

  2. Program Orientation: System provides an introduction or orientation that prepares students to be successful in the online course.

  3. Program Technology: System provides and supports program users' hardware and software needs in the online environment.

  4. Program Curriculum: System provides and supports an interactive curriculum for the online course.

  5. Program Teaching: System provides and supports teaching personnel dedicated to online learning and their online students.

  6. Characteristics and Skills Displayed by Successful Online Students: System identifies and provides opportunities for students to practice characteristics necessary for success in an online environment.

  7. Program Support: System provides and supports a system of support for all online students and mentors (e.g., parents) and coaches.

  8. Program Data Collection: System collects data and uses that data to inform program decision-makers and share information with other programs.

Appleton's decision to develop its own evaluation rubric and process provides several advantages. Besides yielding an evaluation process tailored to the school, it gives Appleton leaders the ability to evaluate their program at any time without waiting for funding or relying on a third-party evaluator. Still, developing an evaluation process is typically expensive and may not be a practical option for many programs. Appleton's leaders spent many hours researching and developing outcome measures (i.e., the descriptions of practice for each program component under each level of program performance). They also invested about $15,000 of their program grant funds to pay a Web developer to design the online system for compiling and displaying evaluation data. For others attempting to develop this type of tailored rubric and process, accessing outside expertise is critical to fill gaps in knowledge or capacity. Appleton leaders collaborated extensively with experienced colleagues from other virtual schools, particularly as they were developing their rubric.

Share Evaluation Findings With Other Programs and Evaluators

Although the OPPI rubric was developed specifically for Appleton, from the beginning Vogel and Radtke intended to share it with other programs. This rubric is currently accessible free of charge through the Web site of the Wisconsin eSchool Network, described above.* Vogel explains that the OPPI and its umbrella evaluation system are readily adaptable to other programs: "Internally, this system doesn't ask people to have an overwhelming amount of knowledge. It allows people to make tweaks as needed for their particular programs, but they don't have to create the whole wheel over again [by designing their own evaluation system]." The OPPI system also allows for aggregation of results across multiple programs, a mechanism that would allow groups of schools in a state, for example, to analyze their combined data. To assist schools using the OPPI for the first time, Appleton offers consultation services to teach other users how to interpret and communicate key findings. Through their efforts to share their evaluation tool and create the online forum, Appleton leaders have developed an efficient and innovative way to build the knowledge base on online learning programs.

Table 2. Excerpt From Appleton eSchool's Online Program Perceiver Instrument

In Louisiana, evaluators from Education Development Center (EDC) have used more conventional channels for sharing findings from the evaluation of the state's Algebra I Online program. This program was created by the Louisiana Department of Education to address the state's shortage of highly qualified algebra teachers, especially in urban and rural settings. In addition, districts desiring to provide certified teachers access to pedagogy training and mentoring so they can build capacity for strong mathematics instruction are eligible to participate. In Algebra I Online courses, students physically attend class in a standard bricks-and-mortar classroom at their home school, which is managed by a teacher who may not be certified to deliver algebra instruction. But once in this classroom, each student has his or her own computer and participates in an online class delivered by a highly qualified (i.e., certified) algebra teacher. The in-class teacher gives students face-to-face assistance, oversees lab activities, proctors tests, and is generally responsible for maintaining an atmosphere that is conducive to learning. The online teacher delivers the algebra instruction, answers students' questions via an online discussion board, grades assignments via e-mail, provides students with feedback on homework and tests, and submits grades. The online and in-class teachers communicate frequently with each other to discuss students' progress and collaborate on how to help students learn the particular content being covered. This interaction between teachers not only benefits students; it also serves as a form of professional development for the in-class instructors. In addition to providing all students with high-quality algebra instruction, a secondary goal of the program is to increase the instructional skills of the in-class teachers and support them in earning their mathematics teaching certificate.

Although its founders believed that the Algebra I Online model offered great promise for addressing Louisiana's shortage of mathematics teachers, when the program was launched in 2002 they had no evidence to back up this belief. The key question was whether such a program could provide students with learning opportunities that were as effective as those in traditional settings. If it were as effective, the program could provide a timely and cost-effective solution for the mathematics teacher shortage. Louisiana needed hard evidence to show whether the program was credible.

Following a number of internal evaluation activities during the program's first two years, in 2004 the program's leaders engaged an external evaluation team consisting of Rebecca Carey of EDC, an organization with experience in researching online learning; Laura O'Dwyer of Boston College's Lynch School of Education; and Glenn Kleiman of the Friday Institute for Educational Innovation at North Carolina State University, College of Education. The evaluators were impressed by the program leaders' willingness to undergo a rigorous evaluation. "We didn't have to do a lot of convincing," says EDC's Carey. "They wanted it to be as rigorous as possible, which was great and, I think, a little bit unusual." The program also was given a boost in the form of a grant from the North Central Regional Educational Laboratory (NCREL), a federally funded education laboratory. The grant funded primary research on the effectiveness of online learning and provided the project with $75,000 beyond its initial $35,000 evaluation budget from the state legislature. The additional funding allowed EDC to add focus groups and in-class observations, as well as to augment its own evaluation capacity by hiring an external consultant with extensive expertise in research methodology and analysis.

The EDC evaluators chose a quasi-experimental design (see Glossary of Common Evaluation Terms, p. 65) to compare students enrolled in the online algebra program with those studying algebra only in a traditional face-to-face classroom format. To examine the impact of the Algebra I Online course, they used hierarchical linear modeling to analyze posttest scores and other data collected from the treatment and control groups. To determine if students in online learning programs engaged in different types of peer-to-peer interactions and if they perceived their learning experiences differently than students in traditional classrooms, the evaluators surveyed students in both environments and conducted observations in half of the treatment classrooms. In total, the evaluators studied Algebra I Online courses and face-to-face courses in six districts.
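For readers unfamiliar with hierarchical linear modeling, a generic two-level specification of the kind often used when students are nested in classrooms is sketched below. This is not the evaluators' actual model, which is not given here. The symbols are assumptions: Y_ij is the posttest score of student i in classroom j, X_ij is an assumed student-level covariate such as a prior achievement score, and T_j indicates whether classroom j used Algebra I Online.

```latex
\begin{align}
\text{Level 1 (student):}\quad
  & Y_{ij} = \beta_{0j} + \beta_{1} X_{ij} + r_{ij},
  & r_{ij} &\sim N(0, \sigma^{2}) \\
\text{Level 2 (classroom):}\quad
  & \beta_{0j} = \gamma_{00} + \gamma_{01} T_{j} + u_{0j},
  & u_{0j} &\sim N(0, \tau_{00})
\end{align}
```

In a model of this form, the coefficient \gamma_{01} estimates the adjusted difference in mean posttest scores between online and face-to-face classrooms, while the classroom-level error u_{0j} accounts for the fact that students in the same classroom tend to resemble one another.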

After completing their assessment, the evaluators produced final reports for the Louisiana Department of Education and NCREL and later wrote two articles about the program for professional journals. The first article, published in the Journal of Research on Technology in Education,7 described Algebra I Online as a viable model for providing effective algebra instruction. In the study, online students showed comparable (and sometimes stronger) test scores, stayed on task, and spent more time interacting with classmates about math content than students in traditional classroom settings. The evaluators speculated that this was a result of the program's unique model, which brings the online students together with their peers at a regularly scheduled time. The evaluators found a few areas for concern as well. For example, a higher percentage of online students reported that they did not have a good learning experience, a finding that is both supported and contradicted by research studies on online learning from higher education. The evaluation also found that the Algebra I Online students felt less confident in their algebra skills than did traditional students, a finding the evaluators feel is particularly ripe for further research efforts. (For additional discussion, see Interpreting the Impact of Program Maturity, p. 40.)

The Algebra I Online evaluators wrote and published a second article in the Journal of Asynchronous Learning Networks that focused on the program as a professional development model for uncertified or inexperienced math teachers.8 In this piece, the evaluators described the program's pairing of online and in-class teachers as a "viable online model for providing [the in-class] teachers with an effective model for authentic and embedded professional development that is relevant to their classroom experiences."

Of course, not all programs will have the resources to contribute to the research literature in this manner. In Louisiana's case, the evaluators had extensive expertise in online evaluation and took the time and initiative required for publishing their findings in academic journals. In so doing, they served two purposes: providing the program leaders with the evidence they needed to confidently proceed with Algebra I Online and providing much-needed research to states that might be considering similar approaches.


The literature on K-12 online learning is growing. Several publications and resources document emerging best practices and policies in online learning (see appendix A). For program leaders and evaluators who are developing an evaluation, the quality standards from SREB and NACOL provide a basic framework for looking at the quality of online courses and teachers. There also is a growing body of studies from which evaluators can draw lessons and adapt methods; evaluators need not reinvent the wheel. At the same time, they must exercise caution when applying findings, assumptions, or methods from other studies, as online programs and resources vary tremendously in whom they serve, what they offer, and how they offer it. What works best for one program evaluation may not be appropriate for another.

Given the lack of commonly used outcome measures for online learning evaluations, individual programs should at least strive for internal consistency, as DLC has. If working with multiple vendors or school sites, online program leaders need to articulate a clear set of business rules for what data are to be collected and how, distributing these guidelines to all parties who are collecting information. Looking ahead, without these common guidelines, evaluators will be hard pressed to compare their program's outcomes with others. Some of the evaluators featured in this guide have made contributions to the field of online learning evaluation, like Appleton's leaders, who developed an evaluation model that can be borrowed or adapted by other programs, and the evaluators of Algebra I Online, who published their study findings in professional journals.
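A "clear set of business rules" for data collection can be as simple as a short list of required fields and allowed values that every vendor's records are checked against. The sketch below illustrates the idea only; the field names and status codes are hypothetical and do not come from DLC or any vendor's actual reporting format.

```python
# Hypothetical business rules: the field names and status codes are
# illustrative and do not come from DLC or any vendor's actual reporting format.
REQUIRED_FIELDS = ("student_id", "course_id", "status")
ALLOWED_STATUSES = {"enrolled", "completed", "withdrew", "failed"}


def validate(record: dict) -> list[str]:
    """Return a list of rule violations for one vendor-reported record."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if not record.get(f)]
    status = record.get("status")
    if status and status not in ALLOWED_STATUSES:
        problems.append(f"unknown status: {status}")
    # A final grade only makes sense once the course has ended for the student.
    if status == "enrolled" and record.get("final_grade") is not None:
        problems.append("final grade reported for a student still enrolled")
    return problems


vendor_records = [
    {"student_id": "s1", "course_id": "alg1", "status": "completed", "final_grade": 88},
    {"student_id": "s2", "course_id": "alg1", "status": "W/F"},   # conflates two outcomes
    {"student_id": "", "course_id": "bio", "status": "withdrew"},  # blank identifier
]
for rec in vendor_records:
    print(rec.get("student_id") or "<blank>", validate(rec))
```

Rejecting a combined code such as "W/F" forces each vendor to report withdrawals and failing grades separately, which is the distinction the DLC evaluators found missing in some vendor data.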

* Registration is required to access the OPPI and some consultation with its developers may be needed to implement the process fully.

Last Modified: 10/20/2009