
Answers in the Tool Box: Academic Intensity, Attendance Patterns, and Bachelor's Degree Attainment — June 1999

# IV. Does It Make Any Difference? Common Sense and Multivariate Analysis

Do the complex patterns of attendance described above have any impact on bachelor's degree completion, and, if so, which variables survive in regression analyses and how much explanatory power do they contribute to a model that also includes pre-college characteristics, measures of early postsecondary performance, modes of financial aid, and satisfaction with major aspects of higher education?

We have now arrived at multivariate analysis. In this section of the study, there will be six sequential ordinary least squares (OLS) regression models:

• Background: high school performance, aspirations, demography (6 variables)

• Financial Aid Modes (3 variables)

• Attendance Patterns (9 variables)

• 1st True Year Performance (3 variables)

• Continuing Performance Effects (3 variables)

• Satisfaction Indicators (5 variables)

As each configuration of variables is entered, the strong stories rise to the top in terms of contribution to the model and statistical significance, while the weak ones fade. At the end of this process, out of 29 independent variables introduced, only 11 remain. The story told by those 11 variables provides very strong guidance for improving degree completion rates for all populations, and particularly for minority and low-SES students.

A logistic regression version of the first five equations is then presented. Logistic regression is the statistician's method of choice when the outcome is a dichotomous variable such as did/did not earn a bachelor's degree. Logistic regression expresses itself in a different way than does OLS. Its objective is to identify the "maximum likelihood" of a relationship, the "probability of observing the conditions [emphasis mine] of success" (Cabrera, 1994, p. 227). The principal metric in which it tells its story is an odds ratio--the way in which HIGHMATH was presented above (see p. 17). In some ways, a logistic regression is like an epiphany: its results make a dramatic statement, the parameters of which are sometimes unexpected. OLS, on the other hand, seeks to minimize the difference between predicted and observed probabilities, that is, between the rational and the empirical. The result is hardly analogous to an epiphany: it is slow and unfolding (Pedhazur, 1982), and expresses its fundamental conclusion in a metric accessible to the general reader: the "percent" of variance accounted for by the model (the R²). Logistic regression has an analogous metric of conclusion, the G², but it is less accessible.

## Who is in the Universe?

The population to be considered in the multivariate equations consists only of those students in the HS&B/So who attended a 4-year college at any time, even though they might also have attended other types of schools. In a longitudinal study that continues to age 30, the mark of someone who intends to earn a bachelor's degree is actual attendance at a bachelor's degree-granting institution, not a statement. These students may have attended other types of schools, but if the dependent variable is bachelor's degree completion, we distort our understanding of what makes a difference if we include people who never really tried.

But what about those students who expected to earn a bachelor's degree, but never attended a 4-year college by age 30? Aren't they students for whom family income and SES play roles much stronger than ACRES or any of its components? Should they be included in the analysis of degree completion, and, if not, what impact will the exclusion have on both our assessment of the validity of the ACRES variable and the analysis of what makes a difference in degree attainment? The best way to confront these questions is by comparing two groups of students (1) whose referent first institution of attendance was a community college, (2) who earned more than 10 credits from the community college (that is, they were not incidental students), and (3) who expected to earn a bachelor's degree. One group ultimately attended a 4-year college; the other did not. They are of roughly equal size (a weighted N of 84k for those who did not attend a 4-year college; a weighted N of 96k for those who did). What do they look like? Table 27 sets forth some basic parameters.

#### Table 27.–4-year or no 4-year? A comparative portrait of students whose referent first institution was a community college, who expected to earn a bachelor's degree, and who earned more than 10 college credits, High School & Beyond/Sophomore Cohort, 1982-1993

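Percent by SES Quintile

| | High | 2nd | 3rd | 4th | Low |
|---|---|---|---|---|---|
| No 4-Yr | 21 | 27 | 23* | 22 | 7* |
| 4-Year | 29 | 36 | 22* | 9 | * |

Percent by Academic Resources Quintile

| | High | 2nd | 3rd | 4th | Low |
|---|---|---|---|---|---|
| No 4-Yr | 11 | 27 | 37 | 19 | 6* |
| 4-Year | 32 | 32 | 27 | 7 | 3* |

Percent by Total Undergraduate Credits

| | 11-29 | 30-59 | 60-89 | 90+ |
|---|---|---|---|---|
| No 4-Yr | 32 | 37 | 22 | 9 |
| 4-Year | 3 | 7 | 8 | 82 |

Percent by Total Credits from CommColl

| | 11-29 | 30-59 | 60-89 | 90+ |
|---|---|---|---|---|
| No 4-Yr | 33 | 36* | 21 | 10* |
| 4-Year | 16 | 30* | 46 | 8* |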

NOTES: (1) All column comparisons are statistically significant at p<.05 except those indicated by asterisks. (2) Bachelor's degree expectation as indicated in 12th grade. (3) Rows may not add to 100% due to rounding. SOURCE: National Center for Education Statistics: High School & Beyond/Sophomore Cohort, NCES CD#98-135.

A more dramatic way of illustrating what makes a difference for these two groups is through a logistic regression with attending a 4-year college as the dependent variable.

#### Table 28.–Among students who began in a community college, aspired to a bachelor's degree, and earned more than 10 college credits, the relative strength of family and high school background in relation to ultimate attendance at a 4-year college: a logistic analysis

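| Variable | (Beta) Estimate | s.e. | t | p | Odds Ratio |
|---|---|---|---|---|---|
| Intercept | -3.2456 | .604 | 3.61 | | |
| Family Income | -0.0934 | .078 | 0.80 | -- | 0.91 |
| SES | 0.4361 | .120 | 2.43 | .05 | 1.55 |
| Academic Resources | 0.6031 | .120 | 3.37 | .02 | 1.83 |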

NOTES: (1) Weighted N=181k; (2) standard errors adjusted for design effect; (3) Design effect=1.49. SOURCE: National Center for Education Statistics: High School & Beyond/Sophomore cohort, NCES CD #98-135.

People, including those who aspire to bachelor's degrees, attend community colleges for many reasons. Within the groups at issue here, family income does not play a role in whether they attend 4-year colleges as well. The odds ratio for family income is very close to 1.0 (which indicates no influence) and the parameter estimate does not meet the criterion for statistical significance at all. SES, which transcends income, is significant; but academic resources seem more significant. What do we make of this?
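The link between the parameter estimates and the odds ratios in table 28 can be sketched in a few lines. This is a minimal illustration that assumes the standard logistic relationship (odds ratio = e raised to the estimate); the variable and function names are hypothetical and are not taken from the study's own programs.

```python
import math

# Parameter estimates (log-odds) as printed in Table 28.
table28_estimates = {
    "Family Income": -0.0934,
    "SES": 0.4361,
    "Academic Resources": 0.6031,
}


def odds_ratio(beta: float) -> float:
    """Convert a logistic regression coefficient into an odds ratio."""
    return math.exp(beta)


# Round to two decimals, the precision used in the table.
odds_ratios = {
    name: round(odds_ratio(beta), 2) for name, beta in table28_estimates.items()
}

for name, ratio in odds_ratios.items():
    # A ratio near 1.0 means the variable barely shifts the odds.
    print(f"{name}: {ratio:.2f}")
```

Under that assumption the computed ratios agree with the printed column, and the family income ratio sits close to 1.0, which is the "no influence" reading given in the text.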

In both the linear and logistic model series below, SES exerts a modest but declining influence on bachelor's degree attainment as students move into postsecondary education and through their first year. The group of students who started in a community college, expected to earn a bachelor's degree, but never attended a 4-year college by age 30 are much weaker in academic resources than their peers who eventually did attend a 4-year college (table 27). They also exhibit a lower SES distribution, though not dramatically so. If we include them in the multivariate analysis, we can speculate that SES would exert the same modest but declining influence on degree completion but at a slightly higher level. For example, in the first stage of the logistic regression series, SES might carry an odds ratio of 1.26 to 1 instead of 1.22 to 1. The difference is not a compelling reason for including this group in an analysis designed to help us select the tools we will need to use in both maintaining and bringing greater equity to bachelor's degree completion rates.

So, the universe will be limited to those who attended a 4-year college at any time. This sounds simple enough. But in programming the database, there are two potential definitions of the universe of students "who attended a 4-year college at any time." By one definition, we admit only those students for whom we actually received at least one transcript from a 4-year college. By the second definition, we also include cases where one or more transcripts from 4-year colleges were requested but none were received. For the universe created by the second definition, we have all variables that were self-reported in the surveys, for example, all the financial aid modes, and even values for attendance pattern variables derived simply from the number and nature of transcripts requested. However, the small expansion group begins to drop out of the analysis when we reach the Attendance Pattern model because, without transcripts, it is impossible to determine a value for any variable based on dates, for example, continuity of enrollment and delay of entry, let alone those based on credits, grades, and course-taking(40).

The background model, before introduction of attendance, performance and other college experience variables, is offered in table 29. It consists of three standard demographic constructs: SES (in quintiles), RACE (dichotomous Black/Latino/AmerInd v. White/Asian), and SEX (male=1). It also includes a dummy variable, children, marking whether the individual became a parent at any time up to 1986 (age 22/23)(41), the composite ACRES (academic resources) to indicate the quality of the student's performance (curriculum, class rank, and test scores) in secondary school, and the sharpened construct of educational "anticipations." The dependent variable is bachelor's degree completion by age 30 in 1993.

What does this basic background model say, no matter which universe one uses? In the absence of any other information: (1) the six independent variables in the equation explain between 21 percent (Part A) and 23 percent (Part B) of the variance in long-term bachelor's degree completion among students in the HS&B/So who attended at least one 4-year college at any time up to age 30; (2) of the six independent variables in the equation, that which carries the student's high school background, ACRES, contributes most to the explanation; and (3) of the remaining five independent variables, only the fact of becoming a parent prior to age 22(42) and parents' socioeconomic status contribute anything else of significance to the explanation.

This is the most basic of common sense matters. If the universe had not been limited to students who attended 4-year colleges at some time, ACRES would have contributed slightly more to the explanation of variance in degree completion. When the outcome is degree completion, who you are is less important than the amount and quality of the time you invest in activities that move you toward that goal.

While race and sex fall out of the model very early, no matter which universe of 4-year college students we use, the "anticipations" variable stays in the equation, however marginally. It stays in the equation only because it was set up as a dichotomy with the positive value confined to "bachelor's consistent" expectations, as appropriate to students who attended 4-year colleges. It is for this reason, along with the fact that the next set of variables to be introduced (those reflecting financial aid and student employment while enrolled) do not rely on transcript data, that the universe of Part B will be carried forward.

#### Table 29.–Background Model: The relationship of pre-college and family variables to bachelor's degree completion among 4-year college students in the HS&B/So, 1982-1993

##### Part A: Restricted Universe of 4-Year College Attendees

Universe: All students for whom a transcript from a 4-year college was received, who received a high school diploma or equivalent prior to 1988, and who evidenced positive values for all variables in the model. N=4,765. Weighted N=1.131M. Simple s.e.=.702; Taylor series s.e.=1.083; Design effect=1.54.

| Variable | Parameter Estimate | Adj. s.e. | t | p | Contribution to R² |
|---|---|---|---|---|---|
| INTERCEPT | 1.782346 | .1159 | 9.98 | | |
| ACRES | 0.133301 | .0094 | 9.21 | .001 | .1651 |
| Children | -0.340264 | .0433 | 5.10 | .01 | .0282 |
| SES Quintile | 0.036144 | .0083 | 2.83 | .05 | .0122 |
| Anticipations | 0.071447 | .0218 | 2.13 | .10 | .0039 |
| Race | -0.073809 | .0300 | 1.60 | ---* | .0026 |
| Sex | -0.040801 | .0193 | 1.37 | ---* | .0017 |

*Dropped from model.

R-Sq. .2137; Adj. R-Sq. .2127

##### Part B: Unrestricted Universe of 4-Year College Attendees

Universe: All students for whom the evidence confirms 4-year college attendance at any time (whether transcripts were received or not), who received a high school diploma or equivalent prior to 1988, and who evidenced positive values for all variables in the model. N=4,943. Weighted N=1.179M. Simple s.e.=.697; Taylor series s.e.=1.082; Design effect=1.55.

| Variable | Parameter Estimate | Adj. s.e. | t | p | Contribution to R² |
|---|---|---|---|---|---|
| INTERCEPT | 1.722264 | .1112 | 9.99 | | |
| ACRES | 0.136088 | .0091 | 9.65 | .001 | .1829 |
| Children | -0.321327 | .0410 | 5.06 | .01 | .0277 |
| SES Quintile | 0.038911 | .0081 | 3.10 | .02 | .0142 |
| Anticipations | 0.086950 | .0216 | 2.60 | .10 | .0057 |
| Race | -0.077506 | .0299 | 1.68 | ---* | .0024 |
| Sex | -0.040627 | .0193 | 1.37 | ---* | .0019 |

*Dropped from model.

R-Sq. .2347; Adj. R-Sq. .2338

NOTES: (1) Standard errors are adjusted in accordance with design effects of the stratified sample used in High School & Beyond. See technical appendix and Skinner, Holt and Smith (1989). (2) Significance level of t (p) based on a two-tailed test.
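For readers who want to see how the design effect enters the arithmetic of table 29, the sketch below works through Part A. It rests on two observations drawn from the printed numbers, not from any stated procedure: the design effect matches the ratio of the Taylor series s.e. to the simple s.e., and the printed t values are consistent with dividing estimate/s.e. by that design effect. The function name `adjusted_t` is hypothetical.

```python
# Figures printed in the universe note for Table 29, Part A.
simple_se = 0.702
taylor_se = 1.083

# Ratio of the complex-sample s.e. to the simple random sample s.e.
design_effect = taylor_se / simple_se

# (parameter estimate, printed s.e., printed t) for each row of Part A.
part_a = {
    "INTERCEPT": (1.782346, 0.1159, 9.98),
    "ACRES": (0.133301, 0.0094, 9.21),
    "Children": (-0.340264, 0.0433, 5.10),
    "SES Quintile": (0.036144, 0.0083, 2.83),
    "Anticipations": (0.071447, 0.0218, 2.13),
    "Race": (-0.073809, 0.0300, 1.60),
    "Sex": (-0.040801, 0.0193, 1.37),
}


def adjusted_t(estimate: float, se: float, deff: float) -> float:
    """Assumed relationship: |estimate| / s.e., deflated by the design effect."""
    return abs(estimate) / se / deff


# Use the design effect at the two-decimal precision printed in the table.
deff_printed = round(design_effect, 2)
recomputed = {
    name: adjusted_t(est, se, deff_printed) for name, (est, se, _) in part_a.items()
}

for name, (_, _, printed_t) in part_a.items():
    print(f"{name}: recomputed t = {recomputed[name]:.2f}, printed t = {printed_t:.2f}")
```

The recomputed values land within rounding distance of the printed t column, which suggests, without proving, that this is how the adjustment was applied.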

It is not surprising that race and sex fall out of the model, no matter how generous a statistical selection criterion was used(43). If these variables failed to meet statistical selection criteria at a stage of student history prior to college attendance and in the course of constructing ACRES, their chances of playing any role after the student group has been winnowed to 4-year college attendees are dim indeed. Given the nature and intensity of current and continuing concern with the educational attainment of minority students, however, this study keeps race in the models until it fails to register the slightest impact. It is retained in contrast to other features of students' educational histories that are subject to the kind of change (race is not subject to change) that yields degree completion.

If we subjected the specifications for Part B of the Background Model to a logistic regression, would the relationships be the same? Table 30 presents a portrait that, in a few respects, is slightly different from that of the OLS accounting. ACRES is again the strongest of the positive components and becoming a parent by age 22 is the strongest of the negative components. The estimates for ACRES say that for each step up the quintile ladder of that variable, the odds of earning a bachelor's degree increase by nearly 100 percent. The estimates for parenthood say that having children reduces the odds of earning a degree by 86 percent (1 minus the odds ratio of 0.14). Degree anticipations and SES are also positive contributors (as they were in the OLS version), but in the logistic version both the parameter estimate and the odds ratio favor degree anticipations over SES. The explanation is more technical than substantive. Unlike the OLS version, the logistic account keeps race in the model, however tenuously. This result lends modest support to my decision to carry race into Stage 2.

#### Table 30.–Logistic account of the relationship of pre-college and family variables to bachelor's degree completion among 4-year college students in the High School & Beyond/Sophomore cohort, 1982-1993

Universe: All students for whom the evidence confirms 4-year college attendance at any time (whether transcripts were received or not), who received a high school diploma or equivalent prior to 1988, and who evidenced positive values for all variables in the model. N=4,943. Weighted N=1.179M. Simple s.e.=.697; Taylor series s.e.=1.082; Design effect=1.55.

| Variable | Parameter Estimate | Adj. s.e. | t | p | Odds Ratio |
|---|---|---|---|---|---|
| INTERCEPT | -3.174 | 0.691 | 2.96 | | |
| Academic Resources | 0.678 | 0.051 | 8.58 | .001 | 1.97 |
| Children | -1.968 | 0.283 | 4.49 | .001 | 0.14 |
| SES | 0.202 | 0.043 | 3.03 | .01 | 1.22 |
| Anticipations | 0.423 | 0.112 | 2.44 | .02 | 1.53 |
| Race | -0.398 | 0.156 | 1.65 | .10 | 0.67 |
| Sex | -0.234 | 0.104 | 1.08 | --- | 0.79 |

NOTES: (1) Standard errors are adjusted in accordance with design effects of the stratified sample used in High School & Beyond. See technical appendix and Skinner, Holt and Smith (1989). (2) Significance level of t (p) based on a two-tailed test.
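The "nearly 100 percent" and "86 percent" readings of table 30 follow from restating each odds ratio as a percent change in the odds. The sketch below assumes the usual conversion, (e^estimate - 1) × 100; the helper name `percent_change_in_odds` is hypothetical.

```python
import math

# Parameter estimates as printed in Table 30 (intercept omitted).
table30_estimates = {
    "Academic Resources": 0.678,
    "Children": -1.968,
    "SES": 0.202,
    "Anticipations": 0.423,
    "Race": -0.398,
    "Sex": -0.234,
}


def percent_change_in_odds(beta: float) -> float:
    """Percent change in the odds for a one-unit increase in the variable."""
    return (math.exp(beta) - 1.0) * 100.0


percent_changes = {
    name: percent_change_in_odds(beta) for name, beta in table30_estimates.items()
}

for name, change in percent_changes.items():
    direction = "raises" if change > 0 else "lowers"
    print(f"{name}: {direction} the odds by about {abs(change):.0f} percent")
```

Read this way, a one-quintile step in Academic Resources raises the odds by roughly 97 percent, and the Children estimate corresponds to the 86 percent reduction cited in the text.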

## Model Iteration, Stage 2: Modes of Financial Aid and Student Work

Where do we turn to explore and strengthen the explanation of degree completion? There are four sets of variables to which one's attention is drawn for this task: financial aid and the extent to which a student worked while enrolled, attendance patterns, satisfaction with various aspects of postsecondary education, and first-year performance. Because financial aid has been shown to be critical to initial enrollment decisions (Jackson, 1988) and first-year retention (Stampen and Cabrera, 1986), our story regards it as antecedent to attendance patterns. Financial aid and student work thus enter the model at this point.

The HS&B/So limits our utilization of financial aid variables with confidence. The database relies principally on student self-reports concerning their financing of college education, and students usually don't know where their financial aid is coming from or how much is at issue. For example, the data set includes an unobtrusive Pell Grant file, and I found about 6,000 (weighted) students in that file who never claimed--in the surveys--to have received financial aid. Pell Grants, of course, apply to only a fraction of college students; and third-party files for other programs (Stafford Loans, State Supplementary Grants, and others) were not available. A variable was created that simply marked whether a student had ever received a scholarship or grant of any type between 1982 and 1986; and a parallel variable was formulated for loans. With the exception of describing combinations of types of financial aid (including work-study) offered to students (St. John and Noell, 1989), that is the extent of confidence in financial aid analyses for the HS&B/So (and the NLS-72 as well).

The third component of "financial aid" was derived from student responses to a loop of questions concerning the financing of postsecondary education, year-by-year, from 1982 through 1986, with a focus on work. Neither the dollar amount nor the number of hours worked is in question here; rather, the fact, sources, timing, and purpose of work are. For each year students indicated that they worked to pay some of the costs of their postsecondary education, a dummy variable was constructed. A positive value was registered if the sources of the student's earnings were Work-Study, Co-Op placements, Teaching/Research Assistantships, and/or "other earnings while in school," i.e. work activities concurrent with enrollment. These activities are more likely to take place on campus than off-campus. Savings from work prior to entering higher education or during periods of stop-out, as well as earnings from summer jobs, were excluded. A composite variable was constructed to indicate whether the student "worked" (as defined by these boundaries) in each of the four academic years in question (1982-1986), and then dichotomized to mark whether, in two or more of the four academic years covered by the loop of financing questions, the student worked concurrently with enrollment and in order to cover costs of education. The variable is called STUWORK.
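The construction of STUWORK described above can be sketched as a small function. This is only an illustration of the stated rule (a yearly dummy, then a dichotomy at two or more of four years); the source labels, year labels, and function names are hypothetical stand-ins for the survey items, which are not reproduced in the text.

```python
from typing import Mapping, Set

# Earnings sources that count as work concurrent with enrollment (per the text).
# The string labels are hypothetical stand-ins for the survey response codes.
CONCURRENT_SOURCES = {
    "work_study",
    "co_op",
    "assistantship",
    "other_earnings_in_school",
}

# The four academic years covered by the loop of financing questions.
ACADEMIC_YEARS = ("1982-83", "1983-84", "1984-85", "1985-86")


def worked_in_year(sources: Set[str]) -> bool:
    """Yearly dummy: true if any reported source is concurrent with enrollment."""
    return bool(sources & CONCURRENT_SOURCES)


def stuwork(sources_by_year: Mapping[str, Set[str]]) -> int:
    """Return 1 if the student worked while enrolled in two or more of the four years."""
    years_worked = sum(
        worked_in_year(sources_by_year.get(year, set())) for year in ACADEMIC_YEARS
    )
    return 1 if years_worked >= 2 else 0


# Summer jobs and prior savings are excluded, so only two years count here.
example = {
    "1982-83": {"work_study"},
    "1983-84": {"summer_job", "savings"},
    "1984-85": {"co_op"},
}
print(stuwork(example))
```

Because summer earnings and savings never enter `CONCURRENT_SOURCES`, a student who financed college only through those sources would be coded 0 regardless of how many years were involved.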

The literature on this issue (e.g. Cuccaro-Alamin and Choy, 1998; Horn, 1998) is usually based on the number of hours worked, a datum not available in the HS&B/So. The consensus seems to be that a modest amount of work while enrolled enhances retention, that work on campus certainly intensifies student involvement and contributes to completion (Astin, 1993), but that an excess of work (particularly off-campus) is negatively related to persistence. In the Beginning Postsecondary Students longitudinal studies, NCES posed a rather telling question that conditions the way one must judge student reports of hours worked in relation to persistence and completion. Respondents are asked whether they see themselves primarily as students who happen to be working or employees who happen to be going to school. Table 31, produced from the BPS90 data, confining itself to the first year of enrollment and to a population younger than 24 (to render the data parallel to the HS&B/So universe), provides guidance for interpreting hours worked under that dichotomy. The average hours worked for those who considered themselves students first seems high at 25.9, but it is weighted to the upside by those claiming 40+ hours of work per week and by students attending trade schools.

#### Table 31.–Hours worked per week during the first year of enrollment (1989-1990), for traditional-aged students in the Beginning Postsecondary Students longitudinal study, by primary role

| | Primarily a Student (76% of all) | Primarily an Employee (24% of all) |
|---|---|---|
| Proportion employed during >80 percent of months enrolled | 52.7% (1.4) | 66.0% (3.9) |
| Proportion of students by range of work hrs./week while enrolled: | | |
| 1-14 | 32.4% (1.3) | 21.3% (2.0) |
| 15-21 | 18.0 (1.1) | 17.5 (2.2) |
| 22-30 | 22.8 (1.2) | 19.7 (2.1) |
| 31+ | 26.8 (1.3) | 41.5 (2.7) |
| Total | 100.0% | 100.0% |
| Average work hrs./week while enrolled | 25.9 (0.3) | 29.9 (0.6) |

Note: Standard errors are in parentheses. Source : National Center for Education Statistics: Beginning Postsecondary Students, 1989-94: Data Analysis System.

The contrasts based on primary role would be much greater if the universe had not been confined to traditional-age students (17-24). But even among traditional-age students, one out of four considers himself/herself an employee first, a judgment confirmed by higher average work week hours, proportion employed during most of the months in which they were enrolled, and an obviously higher percentage working full-time (over 30 hours/week).

Because the postsecondary history of the HS&B/So can be as long as 11 years, the role of work in models of bachelor's degree completion by the end of that period may be attenuated. People change status. They become independent; they become employees who happen to be going to school. Combine these changes with growing and swirling patterns of multi-institutional attendance, and with Cuccaro-Alamin and Choy's observation that the average work-week of students does not change over time, and one doubts that work will emerge as a significant factor in explaining bachelor's degree attainment.

#### Table 32.–Degree Completion Model: Financial Aid and Employment Variables

Universe: All students for whom the evidence confirms 4-year college attendance at any time, who received a high school diploma or equivalent prior to 1988, and who evidence positive values for the variables in the Background Model. N=4,943; Weighted N=1.179M. Simple s.e.=.697; Taylor series s.e.=1.082; Design effect=1.55.

| Variable | Parameter Estimate | Adj. s.e. | t | p | Contribution to R² |
|---|---|---|---|---|---|
| INTERCEPT | 1.397895 | .1139 | 7.92 | | |
| Academic Resources | 0.123885 | .0093 | 8.60 | .001 | .1829 |
| Children | -0.300230 | .0407 | 4.76 | .01 | .0277 |
| SES Quintile | 0.046395 | .0082 | 3.65 | .01 | .0142 |
| STUWORK | 0.075257 | .0209 | 2.32 | .05 | .0081 |
| Anticipations | 0.079826 | .0215 | 2.40 | .05 | .0052 |
| Grant-in-Aid | 0.067087 | .0212 | 2.04 | .10 | .0042 |
| Race | -0.083451 | .0292 | 1.84 | .10 | .0030 |
| Loan | 0.019876 | .0209 | 0.61 | ---* | .0003 |

*Dropped from model.

R-Sq. .2456; Adj. R-Sq. .2443

NOTES: (1) Standard errors are adjusted in accordance with design effects of the stratified sample used in High School & Beyond. See technical appendix and Skinner, Holt and Smith (1989). (2) Significance level of t (p) based on a two-tailed test.

Indeed, in their initial appearance in table 32, it is obvious that the financial aid/work variables do not strengthen the explanatory power of the model appreciably: the adjusted R² increases by barely one percentage point (.2338 to .2443). Loans (at least the way the database forces us to define loans) are a very weak addition, while STUWORK is the strongest of the three new variables in terms of its statistical properties in the model. Whether these relationships will hold in the next iteration of the model depends on their interactions with the new variables. Remember that all of the financial aid/work data were derived from student responses to questionnaire items in 1984 and 1986, while the censoring date for bachelor's degree completion is 1993. It is very possible that students received grants and worked while enrolled as undergraduates after 1986, but we don't know. When the calendar for independent variables is truncated and one introduces long-term attendance pattern variables, we might witness changes in the strength of Grant-in-Aid and STUWORK.
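The size of that gain can be checked with the textbook adjusted R-squared formula. The sketch below assumes the adjustment uses the unweighted N and counts every variable listed in each table, including those later dropped; neither assumption is stated in the text, and the function name `adjusted_r2` is hypothetical.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Standard adjustment of R-squared for n cases and k independent variables."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Background model, Part B: R-Sq. .2347, N=4,943, 6 variables listed.
background_b = adjusted_r2(0.2347, 4943, 6)

# Stage 2 (financial aid and work): R-Sq. .2456, N=4,943, 8 variables listed.
stage2 = adjusted_r2(0.2456, 4943, 8)

# Gain expressed in percentage points of variance explained.
gain_points = (stage2 - background_b) * 100

print(f"Background (Part B) adjusted R-squared: {background_b:.4f}")
print(f"Stage 2 adjusted R-squared:             {stage2:.4f}")
print(f"Gain: {gain_points:.2f} percentage points")
```

Under these assumptions the results agree with the printed adjusted values to within rounding in the fourth decimal place, and the gain is a little over one point of explained variance.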

Stage 2 of the model (table 32) demonstrates a virtue of carrying forward race, even though the variable did not meet the statistical criteria for retention in the OLS version of the background model. Race is obviously a marginal contributor. Its effects on degree completion for the population of 4-year college students, while slightly negative, are nowhere near the negative effects of having a child by age 22. The more variables in the model, though, the greater the degrees of freedom in statistical analyses, hence the lower the threshold t statistic for inclusion.

## Model Iteration, Stage 3: Attendance Patterns

The third stage of model development is presented in table 33. This is the first iteration that follows the analysis of attendance patterns, and it changes (and shrinks) the universe. Seven variables were carried forward from the previous iteration. Nine others were introduced here. One of the new variables, number of schools attended, did not meet the p<.2 selection criterion at all, even in two different formulations: a dichotomous form (one school only v. more than one), and a trichotomous form (one, two, and more than two). All the data we observed in Part III above suggested that this would happen: in an age of multi-institutional attendance, the number of schools attended simply will not be related to degree completion.

The attendance variables brought into play in this iteration can be parsed in three groupings:

(1) Number and order of institutions attended.

NUMSCHL: a simple dichotomy—one school v. more than one.

TRANSFER: a classic pattern in which the student attended a community college first, earned more than 10 credits from the community college, and subsequently earned more than 10 credits from a 4-year college (the "early transfers" are thus excluded); and

NO RETURN: a pattern in which the student attended more than one school, but did not return to the first institution of attendance.

(2) Characteristics of the "referent" first institution of attendance.

FIRST4: the first institution of attendance was a 4-year college;

DOCTORAL: the first institution was a doctoral degree-granting institution;

SELECTIVITY: the first institution was selective/highly selective v. non-selective/open door.

(3) Other features of attendance.

NODELAY: the first date of attendance at the first institution occurred 10 or fewer months after high school graduation;

OUTSYS: at some time during his/her undergraduate career, the student attended an institution other than a traditional 4-year or 2-year college; and

NOSTOP: the student was continuously enrolled as an undergraduate.

There are some very dramatic changes in the model with the introduction of attendance pattern variables. The first is that the explanatory power of the model leaps by nearly 50 percent, from an R² of .2456 to one of .3644 (or, in adjusted terms, from .2443 to .3623). This significant advance occurs not as a result of the characteristics of the "referent" institution of attendance, but rather as a by-product of student movement among institutions and the temporal dimensions of enrollment.

The most notable changes in this iteration of the model are the dominance of continuous enrollment (NOSTOP), the contracting strength of the pre-collegiate variables (ACRES and SES Quintile), and the superficially contradictory positions of No Return and Transfer. This apparent contradiction is fairly easy to explain. In the No-Return variable, the student attends more than one institution and does NOT return to the first, and this behavior has a negative relationship to degree completion. As defined, Transfer involves a No Return-type situation, but with a specific sequence and criteria (the student must earn more than 10 credits in a 2-year college before earning more than 10 credits in a 4-year college). Transfer has a positive relationship to degree completion. Students in a classic transfer pattern are moving toward a bachelor's degree. Students in No-Return positions include reverse transfers who move away from the path toward the bachelor's degree.
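The distinction between Transfer and No Return can be made concrete with a small sketch. It is one possible reading of the definitions given above, not the study's actual coding: enrollment is represented as an ordered list of hypothetical `Spell` records, and community college credits are counted only up to the first 4-year spell.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Spell:
    """One enrollment spell, in chronological order (hypothetical record layout)."""
    school_id: str
    sector: str  # "2-year" or "4-year"
    credits: float


def is_transfer(spells: List[Spell]) -> bool:
    """Community college first, >10 credits there, then >10 credits at a 4-year college."""
    if not spells or spells[0].sector != "2-year":
        return False
    first_4yr = next((i for i, s in enumerate(spells) if s.sector == "4-year"), None)
    if first_4yr is None:
        return False
    cc_credits = sum(s.credits for s in spells[:first_4yr] if s.sector == "2-year")
    four_year_credits = sum(s.credits for s in spells[first_4yr:] if s.sector == "4-year")
    return cc_credits > 10 and four_year_credits > 10


def is_no_return(spells: List[Spell]) -> bool:
    """More than one school attended, and no later spell at the first school."""
    schools = [s.school_id for s in spells]
    if len(set(schools)) < 2:
        return False
    first = schools[0]
    left_at = next(i for i, sid in enumerate(schools) if sid != first)
    return first not in schools[left_at:]


classic = [Spell("cc1", "2-year", 30), Spell("u1", "4-year", 60)]
reverse = [Spell("u1", "4-year", 24), Spell("cc1", "2-year", 12)]

print(is_transfer(classic), is_no_return(classic))
print(is_transfer(reverse), is_no_return(reverse))
```

Both example students fall under No Return, but only the first meets the Transfer criteria, which is the overlap the paragraph above describes: classic transfers are a subset moving toward the degree, while reverse transfers are No Return cases moving away from it.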

On the other hand, selectivity of the referent first institution of attendance has a significant role to play, even though its statistical position is tenuous. Covariance analysis provides some clues as to the lower standing of selectivity in the model: it is entangled with the pre-collegiate variables ACRES and Anticipations, the former being strong enough to reduce the effects of selectivity. This is another case of common sense: one assumes that most students who start out in selective or highly selective institutions take an academically intense curriculum in high school, perform decently (if not very well) on SATs and ACTs, stand toward the top of their classes, and are committed to earning at least bachelor's degrees. Among other place-related variables, starting in a doctoral degree-granting institution and stepping outside the secular higher education system were so weak as to fall below minimum significance.

#### Table 33.–Degree Completion Model; Attendance Pattern Iteration

Universe: All students who attended a 4-year college at any time, whose transcript records were complete, and who evidenced positive values for all variables in the model. N=4,538. Weighted N=1.079M. Simple s.e.=.714; Taylor series s.e.=1.087; Design effect=1.52.

| Variable | Parameter Estimate | Adj. s.e. | t | p | Contribution to R² |
|---|---|---|---|---|---|
| INTERCEPT | 0.610047 | .1910 | 2.10 | | |
| NOSTOP | 0.323020 | .0204 | 10.42 | .001 | .1994 |
| Academic Resources | 0.091391 | .0090 | 6.68 | .001 | .0989 |
| Children | -0.215601 | .0413 | 3.43 | .01 | .0135 |
| Transfer | 0.231072 | .0313 | 4.86 | .001 | .0115 |
| No Return | -0.106825 | .0215 | 3.27 | .01 | .0106 |
| SES Quintile | 0.030959 | .0078 | 2.61 | .02 | .0091 |
| Grant-in-Aid | 0.064003 | .0185 | 2.28 | .05 | .0053 |
| Anticipations | 0.050385 | .0206 | 1.61 | ---* | .0036 |
| Selectivity | 0.082775 | .0259 | 2.10 | .05 | .0031 |
| Race | -0.081385 | .0276 | 1.94 | .10 | .0028 |
| STUWORK | 0.051003 | .0191 | 1.76 | .10 | .0023 |
| FIRST4 | 0.063635 | .0319 | 1.31 | ---* | .0023 |
| OUTSYS | -0.080847 | .0530 | 1.00 | ---* | .0008 |
| No Delay | 0.047080 | .0294 | 1.05 | ---* | .0008 |
| DOCTORAL | 0.025436 | .0204 | 0.82 | ---* | .0005 |

*Dropped from model.

R-Sq. .3644; Adj. R-Sq. .3623

NOTES: (1) Standard errors are adjusted in accordance with design effects of the stratified sample used in High School & Beyond. See technical appendix. (2) Significance level of t (p) based on a two-tailed test.
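A quick way to read the last column of table 33 is to add it up. The sketch below treats the printed contributions as additive shares of R-squared, which the totals appear to support but which the text does not state outright, and compares the result with the unadjusted R-Sq. of .2456 reported for stage 2 in table 32.

```python
# "Contribution to R2" column of Table 33, as printed.
contributions = {
    "NOSTOP": 0.1994,
    "Academic Resources": 0.0989,
    "Children": 0.0135,
    "Transfer": 0.0115,
    "No Return": 0.0106,
    "SES Quintile": 0.0091,
    "Grant-in-Aid": 0.0053,
    "Anticipations": 0.0036,
    "Selectivity": 0.0031,
    "Race": 0.0028,
    "STUWORK": 0.0023,
    "FIRST4": 0.0023,
    "OUTSYS": 0.0008,
    "No Delay": 0.0008,
    "DOCTORAL": 0.0005,
}

# The shares should add up to the printed R-Sq. of .3644, give or take rounding.
total = sum(contributions.values())

# Relative growth in R-squared from stage 2 (.2456) to stage 3 (.3644).
relative_gain = (0.3644 - 0.2456) / 0.2456

# Share of the stage 3 total carried by continuous enrollment alone.
nostop_share = contributions["NOSTOP"] / total

print(f"Sum of contributions: {total:.4f}")
print(f"Relative gain over stage 2: {relative_gain:.1%}")
print(f"NOSTOP share of explained variance: {nostop_share:.1%}")
```

On these figures, continuous enrollment alone accounts for more than half of the variance the stage 3 model explains.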

Probably more shocking to conventional analysis, however, is the elimination of No-Delay and Anticipations from the model. Remember, though, just how much the definition of these variables departs from those of conventional analysis. No-Delay allows into the model students who did not graduate from high school on time but who enrolled in postsecondary education within 10 months of their graduation date, and not as a false start. These students will attenuate the effects of a traditional "delayed entry" account. While the Anticipations variable appears to contribute more to the explanatory power of the model than other marginal independent variables such as selectivity of first institution, race, and STUWORK, it did not meet the t statistic threshold criterion.

## Model Iteration, Stage 4: Performance in the First "True" Year

The next group of variables to enter in this model involves various configurations of academic performance in the referent first year of attendance. The justification for focusing on this group of variables lies in a tradition of the literature emphasizing the critical role of freshman year performance in retention. The literature, though, unfortunately relies on student self-reports of grades, and without reference to credits attempted or earned, and is not very rigorous concerning what it means by the "freshman year" (e.g. Kanoy, Wester and Latta, 1989). There are exceptions (e.g. Smith, 1992), but they are rare.

The "referent first year" performance variables used in this study are:

• Freshman GPA. GPAs were determined for the first full-year of attendance, and were set out in quintiles. Freshman GPA is a dummy variable that divides performance in the top two quintiles from the bottom three. For the HS&B/Sophomore cohort, the dividing line turns out to be 2.70.

• Credit Ratio. The ratio of credits earned to credits attempted during the first "true" year of attendance. Students who earned less than 90 percent of attempted credits were declared on one side of a dichotomy. But students who attempted 10 or fewer credits were excluded from this calculation.

• Low-Credits. Students who earned fewer than 20 credits in their first "true" year of attendance stand on one side of a benchmark. Some of these students did not attempt more than 20 credits. A combination of Credit-Ratio and Low-Credits was tested as a proxy for part-time status in the first true year. It was not as convincing as the DWI Index described above (see pp. 55-56), but DWI is derived from an entire undergraduate career and hence is not part of the "true first year" performance variable configuration.
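The three definitions above can be sketched as simple coding rules. This is only an illustration, not the study's actual processing code: the function names are hypothetical, the choice of which side of each dichotomy is coded 1 is an assumption (chosen so that it agrees with the signs of the estimates in table 34), and the treatment of a GPA of exactly 2.70 is also assumed.

```python
from typing import Optional

GPA_CUT = 2.70         # boundary of the top two quintiles reported for HS&B/So
RATIO_CUT = 0.90       # earned/attempted threshold for Credit Ratio
MIN_ATTEMPTED = 10     # students attempting 10 or fewer credits are excluded
LOW_CREDIT_CUT = 20    # earned-credit benchmark for Low-Credits


def freshman_gpa_flag(gpa: float) -> int:
    """1 if first-year GPA falls in the top two quintiles.

    Assumption: a GPA of exactly 2.70 counts as top-two-quintile.
    """
    return 1 if gpa >= GPA_CUT else 0


def credit_ratio_flag(earned: float, attempted: float) -> Optional[int]:
    """1 if less than 90 percent of attempted credits were earned.

    Returns None for students excluded from the calculation
    (10 or fewer credits attempted).
    """
    if attempted <= MIN_ATTEMPTED:
        return None
    return 1 if earned / attempted < RATIO_CUT else 0


def low_credits_flag(earned: float) -> int:
    """1 if fewer than 20 credits were earned in the first true year."""
    return 1 if earned < LOW_CREDIT_CUT else 0
```

A student who attempted 15 credits and earned 12 would be flagged on both Credit Ratio and Low-Credits under these rules, which illustrates why the two variables overlap.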

Table 34 brings these variables into the model. Compared to the population in the previous iteration, we lose a weighted N of about 50,000 students. We lose some of them because as soon as the independent variables are based on grades, credits, and a distinct time period, we can no longer include students with missing transcripts, even if we know a great deal about the institutions of those missing documents. We also lose those students whose entire first "true" year of attendance was consumed with non-credit, no grade remedial courses. Calculations for any of the three first year attendance variables for these obviously weaker students are impossible under those conditions.

The loss of these students inevitably skews the model. Weaker students don't finish degrees. The large group that remains in the equation will thus exhibit less variance in degree completion. The new contracted universe thus slows down the increase in the explanatory power of the model: the adjusted R2 moves by less than one percent (from .3623 in the stage 3 iteration to .3696).

The freshman year performance variables have a significant impact on both the composition of the model and the relative weights of the remaining attendance pattern variables. Selectivity of first institution of attendance, financial aid in the form of scholarships or grants-in-aid, and, finally, race, fall out of the model altogether. Race falls out of the model for the same reason that the contribution of SES declines even more from the Attendance Pattern iteration (stage 3): as one moves across the college access line, across the 4-year college attendance line, and into course-taking and academic performance, demographic variables are less and less important. What you do becomes more and more important than where you came from, though the effects of SES will never wholly be washed away. The last of the characteristics of first institution of attendance to remain in the model, selectivity, also falls away for an analogous reason: what you do is more important than where you are.

But the case of Grant-in-Aid calls for explanation. In the Attendance Pattern iteration (table 33), its contribution to the explanatory factor (the R2) was larger than that of STUWORK. Now the positions are reversed, and the student work variable survives, while the scholarship variable does not. Why? The answer is a by-product of the way STUWORK was defined as a dichotomous variable: the positive value was assigned only when the student worked while enrolled in college for more than one of the first four years following scheduled high school graduation in 1982, whereas a positive value was assigned to Grant-in-Aid if the student received a grant-in-aid or scholarship in any one year during the same period. Persistence is thus implicit in STUWORK, not in Grant-in-Aid.
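The asymmetry between the two definitions can be made concrete with a small sketch. The function names and the year-by-year boolean inputs are hypothetical; only the two decision rules (more than one year for STUWORK, any one year for Grant-in-Aid) come from the text.

```python
from typing import Sequence


def stuwork_flag(worked_while_enrolled: Sequence[bool]) -> int:
    """1 if the student worked while enrolled in more than one of the
    first four years after scheduled high school graduation."""
    years = sum(1 for y in worked_while_enrolled[:4] if y)
    return 1 if years > 1 else 0


def grant_in_aid_flag(received_grant: Sequence[bool]) -> int:
    """1 if a grant or scholarship was received in any one of the
    same four years."""
    return 1 if any(received_grant[:4]) else 0


# A student enrolled for a single year can earn a positive Grant-in-Aid
# value but can never earn a positive STUWORK value.
one_year_student = [True, False, False, False]
print(stuwork_flag(one_year_student), grant_in_aid_flag(one_year_student))
```

Under these rules a positive STUWORK value requires at least two years of enrollment, which is the sense in which persistence is built into that variable.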

The most significant change between the Attendance Pattern and First Year Performance models, though, is that the "academic resources" (ACRES) variable comes back to the top of the list, changing places with continuous enrollment (NOSTOP). The fact that the academic resources brought forward from secondary school come to play such a robust role in the model only confirms the power of true first year performance: strong academic resources provide students with momentum into their college years--whenever those years start. Of the other "family" background variables, the contribution of "children" has been unaffected in any of the iterations of this model, indicating that if one starts a family at a young age, one's chances of completing a bachelor's degree by age 30 are negatively affected no matter what one does (see Waite and Moore, 1978).

#### Table 34.–Degree Completion Model: First Year Performance Indicators

Universe: All students who attended a 4-year college at any time, whose transcript records were complete, and who evidenced positive values for all variables in the model. N=4,264. Weighted N=1.029M. Simple s.e.=.723; Taylor series s.e.=1.079; Design effect=1.49.

| Variable | Parameter Estimate | Adj. s.e. | t | p | Contribution to R2 |
|---|---|---|---|---|---|
| INTERCEPT | 1.391811 | .1547 | 6.04 | | |
| Academic Resources | 0.084774 | .0087 | 6.54 | .001 | .1668 |
| NOSTOP | 0.237503 | .0213 | 7.48 | .001 | .1017 |
| Low Credits | -0.183009 | .0268 | 4.58 | .001 | .0351 |
| Freshman GPA | 0.096158 | .0190 | 3.40 | .01 | .0154 |
| Children | -0.250517 | .0425 | 3.96 | .01 | .0152 |
| Transfer | 0.158913 | .0238 | 4.48 | .001 | .0099 |
| No-Return | -0.109283 | .0211 | 3.48 | .01 | .0096 |
| SES Quintile | 0.033787 | .0075 | 3.02 | .02 | .0076 |
| Credit Ratio | -0.097777 | .0316 | 2.08 | .10 | .0036 |
| STUWORK | 0.047759 | .0184 | 1.74 | .10 | .0026 |
| Selective | 0.062054 | .0251 | 1.66 | ---* | .0019 |
| Grant-in-Aid | 0.035557 | .0184 | 1.30 | ---* | .0010 |
| Race | -0.048902 | .0277 | 1.18 | ---* | .0009 |

*Dropped from model.

R-Sq. .3715; Adj. R-Sq. .3696

NOTES: (1) Standard errors are adjusted in accordance with design effects of the stratified sample used in High School & Beyond. See technical appendix. (2) Significance level of t (p) determined by a two-tailed test.
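A quick arithmetic check on the printed figures may help readers interpret the t column. In table 34 the printed t values appear to match the ratio of the parameter estimate to the listed standard error, divided by the design effect of 1.49. This is an observation about the printed numbers only; the actual adjustment procedure is the one described in the technical appendix, and the function name below is hypothetical.

```python
DESIGN_EFFECT = 1.49  # design effect reported for the table 34 universe

# (variable, parameter estimate, listed s.e., printed t), copied from table 34
ROWS = [
    ("Academic Resources", 0.084774, 0.0087, 6.54),
    ("NOSTOP", 0.237503, 0.0213, 7.48),
    ("Low Credits", -0.183009, 0.0268, 4.58),
    ("Freshman GPA", 0.096158, 0.0190, 3.40),
    ("Children", -0.250517, 0.0425, 3.96),
]


def deflated_t(estimate: float, se: float, deff: float = DESIGN_EFFECT) -> float:
    """Absolute t ratio divided by the design effect (an assumed reading
    of how the printed t values relate to the other columns)."""
    return abs(estimate) / se / deff


for name, est, se, printed in ROWS:
    print(f"{name:20s} computed={deflated_t(est, se):.2f} printed={printed:.2f}")
```

Small discrepancies are to be expected because the standard errors are printed to only four decimal places.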

As suspected, the ratio of credits earned to credits attempted in the first true year of attendance might have been a stronger contributor to the model had I found a way to include remedial students who attempted no additive credits during that period. But the presence of Low Credits, reflecting both part-time students and those who earned less than 90 percent of the credits attempted, is a far better explanation for the marginal impact of the credit ratio. Low-Credits is an umbrella for behaviors that hamper momentum toward degrees. As such, it will suppress variables expressing sub-sets of those behaviors.

If any of the variables describing referent institution of first attendance had been retained in the model through this iteration of true first year performance, we would be invited to dig further into institutional characteristics looking for significant effects. Roughly 30 percent of the HS&B/So postsecondary students who earned more than 10 credits and attended a 4-year college at some time also attended more than one school and never returned to the first institution of attendance. For this group of students, institutional effects are very difficult to attribute, even if their institutions were campuses in the same system(44). Another 12 percent attended three or more institutions but returned to the first. These groups constitute such a large proportion of 4-year college students as to give pause to any institutional effects analyses (e.g. Velez, 1985) in national data bases.

Model Iteration, Stage 5: Continuing Effects of the First Year

But other variables that can be derived from the rich archives of NCES longitudinal studies suggest that student experience at the first institution of attendance may have continuing effects. The extensions of three performance variables encourage us to explore this hypothesis. After all, as one moves beyond the first true year of attendance, the model can--and should--be used to guide post-matriculation advisement. To fill the tool box with appropriate instruments and directions, we should transcend the boundaries of traditional prediction models.

## DWI Again

One of these variables is the "DWI Index". When students withdraw or leave incomplete a significant percentage of their attempted courses, the behavior is bound to have a negative effect on degree completion. In table 37, DWI is a dummy variable with a cut-point of 20 percent. That is, the variable is positive if students dropped, withdrew from, left incomplete, or repeated more than 1 out of 5 courses attempted during their undergraduate careers. The correlation of DWI with the first-year variable, Low-Credits, is unsurprisingly high in a sample of this size and in a matrix with eleven other variables (.233; t=10.3). If you are withdrawing from a significant number of courses, the chances are reasonably high that your credit count will be low, no matter which year of your undergraduate career is at issue.
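As a minimal sketch of the cut-point described above, assuming per-student counts of courses attempted and of courses dropped, withdrawn from, left incomplete, or repeated (the function and argument names are hypothetical):

```python
def dwi_flag(dwi_courses: int, attempted_courses: int, cut: float = 0.20) -> int:
    """1 if more than 1 in 5 attempted courses ended in a drop, withdrawal,
    incomplete, or repeat over the undergraduate career; otherwise 0."""
    if attempted_courses <= 0:
        raise ValueError("attempted_courses must be positive")
    return 1 if dwi_courses / attempted_courses > cut else 0


print(dwi_flag(9, 40))  # 22.5 percent of courses, above the cut-point
print(dwi_flag(2, 40))  # 5 percent of courses, below the cut-point
```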

## GPA Trend

The second variable that extends first year history is the trend of a student's GPA. First true year grade performance, as we have seen, plays a modestly positive role in the model of degree completion. But common sense tells us that for some students, first year grades will be lower than final GPA for people who complete degrees. Students begin to major, and grades in major courses are inevitably higher than grades in prerequisites or distribution requirements. The variable GPA Trend is a dichotomous version of the ratio of final year GPA to first year GPA. If that ratio fell between .95 and 1.05, there was basically no change in performance. A ratio below .95 indicates a falling GPA; a ratio above 1.05 reflects a rising GPA. GPA-Trend places a positive value on the rising GPA, and a negative value on any ratio of .95 or lower.
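The classification can be sketched as follows. The function names are hypothetical, the handling of a ratio of exactly .95 is an assumption (the text gives both "below .95" and ".95 or lower"), and coding the stable group as 0 in the dummy is also an assumption, since the text specifies only the rising and falling cases.

```python
def gpa_trend(first_year_gpa: float, final_year_gpa: float) -> str:
    """Classify the ratio of final year GPA to first year GPA."""
    if first_year_gpa <= 0:
        raise ValueError("first_year_gpa must be positive")
    ratio = final_year_gpa / first_year_gpa
    if ratio > 1.05:
        return "rising"
    if ratio < 0.95:  # assumption: exactly .95 is treated as stable
        return "falling"
    return "stable"


def gpa_trend_flag(first_year_gpa: float, final_year_gpa: float) -> int:
    """1 for a rising GPA, 0 otherwise (assumed coding of the dummy)."""
    return 1 if gpa_trend(first_year_gpa, final_year_gpa) == "rising" else 0


print(gpa_trend(2.5, 3.0), gpa_trend(3.0, 3.0), gpa_trend(3.0, 2.4))
```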

How did the HS&B/So students who attended 4-year colleges at any time fare in each of these trends in terms of final GPA? Table 35 tells an interesting story in this matter. Among those who earn bachelor's degrees, students whose GPAs don't change that much between first and final undergraduate years sport a higher final GPA than those whose GPAs rise. But the bachelor's degree attainment rate of those with "stable" GPAs is lower than that of those whose grades rise over time. Doing better (with a rising GPA as the indicator) appears to be an indirect proxy for determination.

#### Table 35.–Final undergraduate GPA of students who attended 4-year colleges at any time, by trend in undergraduate GPA and bachelor's degree attainment, High School & Beyond/Sophomore cohort, 1982-1993

| GPA trend | Did not earn bachelor's: mean final GPA | S.D. | s.e. | Earned bachelor's: mean final GPA | S.D. | s.e. | % of trend group who earned BA | Trend group as % of all |
|---|---|---|---|---|---|---|---|---|
| Rising | 2.39 | 0.651 | .0246 | 2.87 | 0.442 | .0110 | 69.8 | 41.1 |
| Stable | 2.32 | 0.800 | .0031 | 3.02 | 0.485 | .0014 | 64.1 | 34.1 |
| Falling | 1.97 | 0.599 | .0023 | 2.70 | 0.429 | .0016 | 52.2 | 24.9 |

NOTES: (1) Universe consists of all HS&B/So students who attended a 4-year college at any time and for whom an undergraduate GPA could be computed. Weighted N=1.25M. (2) Standard errors adjusted for design effect of 1.49. (3) Differences in bachelor's attainment rates are significant at p<.05. SOURCE: National Center for Education Statistics: High School & Beyond/Sophomore cohort, NCES CD#98-135.

## Remedial Problems

A third variable seeks to add the effects of remedial problems. The college transcript samples of NCES longitudinal studies teach us that there are different kinds of remedial course work, and that some are more serious than others. If the type of remediation matters, so does the amount. Of the HS&B/So students who were assigned to remedial reading, 74 percent were enrolled in two other remedial courses. Of those whose only remedial mathematics work in college was pre-collegiate algebra, only 16 percent were enrolled in two or more remedial courses (and of this group, 75 percent were assigned to remedial reading). The first case is a remedial problem case; the second is not. These are further matters of common sense: (1) people with reading deficiencies cannot read mathematics problems, either; (2) people whose only problem on a basic skills placement test stems from a bad Algebra 2 course in high school can proceed toward a degree with minimal disruption.

The "remedial problem" variable used in the regression model is a trichotomy derived from the observations of table 36 on the relationship between remedial coursework and degree completion among those who attended a 4-year college at any time: 1=any remedial reading, 2=other types of remedial work, and 3=no remedial work.

#### Table 36.–Bachelor's degree attainment of 4-year college students with different types and amounts of remedial coursework, High School & Beyond/Sophomore cohort, 1982-1993

| Remedial coursework | Percent of All Students | Percent Earning Bachelor's Degree |
|---|---|---|
| Any remedial reading | 10.2 | 39.3 |
| No remedial reading, but >2 other remedial courses | 18.7 | 46.5 |
| No remedial reading, but 1 or 2 other remedial courses | 20.4 | 59.6 |
| No remedial coursework | 50.7 | 68.9 |

NOTES: (1) Universe consists of students who attended a 4-year college at any time and for whom transcript data on remedial coursework were available; Weighted N=1.38M; (2) all column pair comparisons are significant at p<.05; (3) For the definition of remedial courses, see footnote #8. SOURCE: National Center for Education Statistics: High School & Beyond/Sophomore cohort, NCES CD#98-135.
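The "remedial problem" trichotomy described before table 36 reduces to a short rule. The sketch below assumes per-student counts of remedial reading courses and of other remedial courses taken from the transcript; the function and argument names are hypothetical.

```python
def remedial_problem(reading_courses: int, other_remedial_courses: int) -> int:
    """Trichotomy used in the regression model:
    1 = any remedial reading, 2 = other types of remedial work,
    3 = no remedial work."""
    if reading_courses > 0:
        return 1
    if other_remedial_courses > 0:
        return 2
    return 3


print(remedial_problem(1, 2))  # remedial reading dominates: code 1
print(remedial_problem(0, 1))  # e.g. pre-collegiate algebra only: code 2
print(remedial_problem(0, 0))  # no remedial coursework: code 3
```

Note that code 2 combines the two middle rows of table 36, so the variable does not distinguish students with one or two other remedial courses from those with more than two.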

The "remedial problem" variable is treated as a continuing effect, and not a first year performance variable, for a very empirical reason. Among the HS&B/So students who attended a 4-year college at any time and took one or more remedial courses, slightly more than half of those courses (52.2 percent) were taken during the first calendar year of attendance. In fact, by the end of the second calendar year of attendance 68.6 percent of the total 11-year remedial course load had been completed; and by the end of the fourth year, 84.4 percent. These data suggest that to isolate the impact of remediation problems on degree completion, we should look beyond the first year.

Table 37 presents the "extended first year performance" iteration of the regression model in which these three variables are introduced. It is not surprising that first year Credit Ratio is pushed out of the model by DWI and GPA Trend. The DWI index will also reduce the contribution of Low Credits (from .0382 to .0262) because it overlaps Low-Credits (the correlation of .2396 in a matrix with a dozen other variables is strong). The remaining attendance pattern variables, No-Return and Transfer, also decline in influence in the face of the new variables built on grades. In other words, after the first year of attendance, academic performance (with grades as its proxy) becomes more important for degree completion than place-referenced attendance.

#### Table 37.–Degree Completion Model: Extending First Year Performance

Universe: All students who attended a 4-year college at any time, whose transcript records were complete, and who evidenced positive values for all variables in the model. N=4,264. Weighted N=1.029M. Simple s.e.=.723; Taylor series s.e.=1.079; Design effect=1.49.

| Variable | Parameter Estimate | Adj. s.e. | t | p | Contribution to R2 |
|---|---|---|---|---|---|
| INTERCEPT | 1.616038 | .1409 | 7.70 | | |
| Academic Resources | 0.077990 | .0081 | 6.46 | .001 | .1668 |
| NOSTOP | 0.206917 | .0204 | 6.81 | .001 | .1017 |
| DWI Index | -0.254825 | .0249 | 6.87 | .001 | .0642 |
| Low-Credits | -0.178315 | .0256 | 6.97 | .001 | .0225 |
| Freshman GPA | 0.141591 | .0190 | 5.00 | .001 | .0225 |
| GPA Trend | 0.155576 | .0179 | 5.83 | .001 | .0153 |
| Children | -0.229453 | .0405 | 3.80 | .01 | .0119 |
| No-Return | -0.096415 | .0201 | 3.22 | .01 | .0071 |
| Transfer | 0.129201 | .0227 | 3.82 | .01 | .0068 |
| SES Quintile | 0.029912 | .0069 | 2.91 | .02 | .0053 |
| STUWORK | 0.049976 | .0176 | 1.91 | .10 | .0026 |
| Credit Ratio | -0.060221 | .0300 | 1.35 | ---* | .0018 |
| Remedial Problem | -0.019793 | .0139 | 0.96 | ---* | .0006 |

*Dropped from model.

R-Sq. .4291; Adj. R-Sq. .4275

NOTES: (1) Standard errors are adjusted in accordance with design effects of the stratified sample used in High School & Beyond. See technical appendix. (2) Significance level of t (p) determined by a two-tailed test. (3) DWI=Drops, Withdrawals, and Incompletes.
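One way to read the "Contribution to R2" column is as a set of increments that add up to the model's R-Sq. The sketch below simply sums the printed contributions from table 37 and compares the total with the printed R-Sq. of .4291; it is an arithmetic check on the published figures, not a reconstruction of the regression, and the names used are hypothetical.

```python
# Contribution to R2 for each variable, copied from table 37
CONTRIBUTIONS = {
    "Academic Resources": 0.1668,
    "NOSTOP": 0.1017,
    "DWI Index": 0.0642,
    "Low-Credits": 0.0225,
    "Freshman GPA": 0.0225,
    "GPA Trend": 0.0153,
    "Children": 0.0119,
    "No-Return": 0.0071,
    "Transfer": 0.0068,
    "SES Quintile": 0.0053,
    "STUWORK": 0.0026,
    "Credit Ratio": 0.0018,      # dropped from model
    "Remedial Problem": 0.0006,  # dropped from model
}
DROPPED = {"Credit Ratio", "Remedial Problem"}


def total_contribution(contribs: dict, exclude: frozenset = frozenset()) -> float:
    """Sum the printed contributions, optionally leaving some variables out."""
    return sum(v for k, v in contribs.items() if k not in exclude)


print(f"all listed variables: {total_contribution(CONTRIBUTIONS):.4f}")
print(f"retained variables:   {total_contribution(CONTRIBUTIONS, frozenset(DROPPED)):.4f}")
```

The two leading variables, Academic Resources and NOSTOP, account for well over half of the total, which is the sense in which they "keep the fundamental story alive."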

The new Remedial Problem variable fails all tests for entry into the model. Why? In a way, whatever remedial variable might be introduced at this point is doomed a priori by the strength of the secondary school Academic Resources (ACRES) variable. The correlation between ACRES and Remedial Problem, in the matrix with a dozen other variables, is a stunningly high -.4295. Common sense wins again: students entering college with a low degree of academic resources evidence continuing remedial problems dominated by reading and do not earn degrees. ACRES already accounts for this story, as do other variables such as DWI and Low-Credits, which are inclusive in their reflection of sub-par academic work.

More significant is that the influence of SES begins to fall (from .007 in table 34 to .0053 in table 37), and that the contributions of continuous enrollment (NOSTOP) and academic resources (ACRES) are unaffected by the introduction of post-first year performance variables. These two variables keep the fundamental story alive.

Model Iteration, Stage 6: Can We Get Satisfaction?

If only three attendance pattern variables remain in the model (continuous enrollment, No-Return, and transfer), and the model accounts for about 43 percent of the variance in degree completion, what's left to add? How much further can we push the envelope of explanation of bachelor's degree completion for the HS&B/So cohort? An R2 of .4275 in a model such as this is considered extraordinarily persuasive. Some would argue to stop at this point, that the models have reached a plateau of explanation. Others would note that in the more traditional terms of predictive modeling, we reached a plateau of explanation in Stage 4 (first year performance). But there is a substantial body of research concerned with student responses to, and assessments of, their postsecondary experience that suggests these responses might influence persistence (e.g. Tracey and Sedlacek, 1987; Pascarella, Terenzini, and Wolfle, 1986). Too, in trying to replicate Horn's (1998) analysis of the BPS90 students who left postsecondary education during or by the end of their first year, and exploring what distinguished those who returned at a later point in time (the stop-outs) from those who never returned (what Horn calls the "stay-outs"), I found an almost linear relationship between degree of dissatisfaction and permanent "stay-out" status. The group at issue was too small to yield significant findings, but the experience suggested that the construct of satisfaction might be profitably pursued in a multivariate context.

The sixth collection of variables used in the iteration covers four aspects of student satisfaction with their postsecondary careers: academic, environmental, work preparation, and cost. The questions were asked only once, retrospectively, in 1986 (four years after scheduled high school graduation). Of the four categories, only two--academic and environmental--offer enough items to create a separate index(45). For each item, e.g. satisfaction with "my intellectual growth" (an academic category) or "sports and recreation facilities" (an environmental category), students were offered a scale of five responses. I turned each question into a dummy variable (dissatisfied/not dissatisfied), aggregated the responses, and turned each of the aggregates into another dummy variable. A composite "dissatisfaction index" was then built from all four categories with a minimum score of 4 (highly satisfied) to a maximum of 8 (highly dissatisfied). The composite was again dichotomized, with scores from 6 to 8 signifying some degree of overall dissatisfaction.

Only one question about satisfaction with the costs of postsecondary education was asked. To isolate the contribution of financial aid to this dimension of satisfaction, an enhanced dummy variable was created(46). Students were assigned a positive value if they received either a grant or a loan and were dissatisfied with the costs of postsecondary education. The components of this variable were first treated separately, and the difference in dissatisfaction rates between those with grants and those with loans was found to be small (30.8 percent to 33.6 percent) and statistically insignificant.
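The construction of the composite dissatisfaction index and of the combined financial aid/cost variable can be sketched as below. This is an illustration under stated assumptions: each category dummy is assumed to be scored 1 (not dissatisfied) or 2 (dissatisfied), which is the scoring that yields the stated range of 4 to 8; the rule for aggregating individual items within a category is not given in the text and is not modeled; all function names are hypothetical.

```python
def composite_dissatisfaction(academic: bool, environmental: bool,
                              work_prep: bool, cost: bool) -> int:
    """Sum of four category dummies, each assumed scored 1 (not
    dissatisfied) or 2 (dissatisfied), giving a range of 4 to 8."""
    return sum(2 if dissatisfied else 1
               for dissatisfied in (academic, environmental, work_prep, cost))


def overall_dissatisfied(score: int) -> int:
    """Dichotomized composite: scores from 6 to 8 signal some degree
    of overall dissatisfaction."""
    return 1 if 6 <= score <= 8 else 0


def aid_cost_dissatisfied(had_grant: bool, had_loan: bool,
                          dissatisfied_with_cost: bool) -> int:
    """1 if the student received a grant or a loan and was dissatisfied
    with the costs of postsecondary education."""
    return 1 if (had_grant or had_loan) and dissatisfied_with_cost else 0


score = composite_dissatisfaction(True, True, False, False)
print(score, overall_dissatisfied(score))
```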

One reviewer of this study suggested that one should turn the coin, so to speak, on these features, and emphasize satisfaction, not dissatisfaction. Given dichotomous variables, too, there is a statistical argument for choosing the larger group (those who are satisfied) as the reference point. But there is a substantial body of research--Horn's (1998) included--demonstrating that indications of global satisfaction are almost mindless reflex responses of students, and that, if one wishes to isolate a strong attitude in an explanatory context, the negative attitude will be more revealing. People responding to surveys have to go out of their way to tell you that they are unhappy.

We are unfortunately limited by the data base in our grasp of those aspects of student experience that Astin (1977, 1989, 1993) and Tinto (1975, 1993) and others have described in such terms as "academic integration" and "social integration," though, as Cabrera, Nora and Castaneda (1993) demonstrate, academic integration is expressed indirectly through GPA. High School & Beyond never asked (as does the Beginning Postsecondary Students, 1989-94 study) how much contact with faculty students enjoyed outside of class, for example. And with the exception of athletics, we have a very limited sense of their participation in extra-curricular activities in college. These questions may be important, but when students are attending two or three schools, and when the fact of multi-institutional attendance doesn't seem to matter in explaining degree completion, then it is impossible to attach the academic and social experiences elicited by these questions to any one institution, unless the student's career involved only one institution (see Cabrera, Castaneda, Nora, and Hengstler, 1992), or, if more than one, was dominated by an institution at which one began and to which one returned.

Table 38 presents the results of the stage 6 iteration of the model. We lose a few people here as a by-product of non-response to the satisfaction questions, and the design effect drops from 1.49 to 1.46, resulting in slightly higher critical t-values than would have been the case with a larger sample. The input variables include the eleven carried forward from stage 5 (Extending First-Year Performance), three satisfaction indicators, and the combination financial/satisfaction dummy variable. Given the degree of overlap in the satisfaction variables (confirmed by covariance analysis), the selection set a generous inclusion threshold of p<.2 if for no other reason than to demonstrate just how marginal some of these variables would be. Even then, only one of the four satisfaction variables, that indicating dissatisfaction with academic experiences, passed the test--only to be dropped within the dynamics of the regression.

#### Table 38.–Degree Completion Model: Satisfaction Variables

Universe: All students who attended a 4-year college at any time, whose transcript records were complete, and who evidenced positive values for all variables in the model. N=3,807. Weighted N=914k. Simple s.e.=.741; Taylor series s.e.=1.083; Design effect=1.46.

| Variable | Parameter Estimate | Adj. s.e. | t | p | Contribution to R2 |
|---|---|---|---|---|---|
| INTERCEPT | 1.688707 | .1448 | 7.99 | | |
| NOSTOP | 0.210741 | .0213 | 6.78 | .001 | .1581 |
| Academic Resources | 0.079632 | .0086 | 6.34 | .001 | .1011 |
| DWI* Index | -0.272824 | .0259 | 7.22 | .001 | .0714 |
| Low-Credits | -0.192789 | .0252 | 5.24 | .01 | .0207 |
| GPA-Trend | 0.125959 | .0182 | 4.74 | .01 | .0173 |
| Children | -0.230705 | .0431 | 3.67 | .01 | .0136 |
| Freshman GPA | 0.133546 | .0193 | 4.74 | .01 | .0087 |
| No-Return | -0.105192 | .0205 | 3.52 | .01 | .0087 |
| Transfer | 0.120377 | .0228 | 3.62 | .01 | .0057 |
| SES Quintile | 0.027110 | .0070 | 2.65 | .05 | .0045 |
| STUWORK | 0.042087 | .0176 | 1.64 | ---* | .0018 |
| Acad Dissatisfaction | -0.035955 | .0198 | 1.24 | ---* | .0011 |

*Dropped from model.

R-Sq. .4128; Adj. R-Sq. .4109

NOTES: (1) Standard errors are adjusted in accordance with design effects of the stratified sample used in High School & Beyond. See technical appendix. (2) Significance level of t (p) determined by a two-tailed test. (3) *DWI=Drops, Withdrawals and Incompletes.

It is obvious that, after the first true year of higher education, the financial aid and satisfaction factors are peripheral. The overall explanatory power of the model actually decreases with the interactions of the satisfaction measures. They have the effect of reversing the positions of ACRES and NOSTOP again, and slightly reducing the contribution of both those key variables. The satisfaction index proved to be unproductive, as did the attempt to tease out dissatisfaction with the costs of higher education. Very little of this was a result of reducing the universe of students by eliminating those who did not answer any or enough satisfaction questions in 1986 to yield a satisfaction index score. Had we not eliminated those students, there would have been virtually no change in the R2 indicator.

Reintroducing Race, Gender and Aspirations

If there were aspects of the fundamental story reflected in table 37 that differed significantly by race and sex, the reintroduction of those variables at this point in model-construction would alter the relationships we have observed. Following Astin, Tsui and Avalos (1996), I made one more attempt to bring them into the model, starting with the variables that survived Stage 5 (Extending First-Year Performance) and this time with four separate race variables (white v. others; black v. others; Latino v. others; and Asian v. others). None of the race variables was accepted into the equation, and the effect of the attempt basically left the R2 unchanged at .4263.

But the modified aspirations variable (Anticipations) did pass the threshold criterion of statistical significance, reentering the model, and moving ahead of SES Quintile and STUWORK in its contribution to the explanatory power of the model. While carrying a comparatively low t (2.29), and a low degree of significance (.10), its reintroduction slightly boosts the adjusted R2 from .4275 to .4309. In light of the fact that all regression analysis involves some degree of multicollinearity (Schroeder, Sjoquist, and Stephan, 1986), and the Anticipations variable evidences a high degree of correlation with the performance variables in the model, I wouldn't make that much of its reappearance. It can move in and out of a model such as this, but always on the margin. Its minor contribution at this point indicates that, in fact, a plateau of explanation was reached in the first-year performance extension iteration. Given the dependent variable, bachelor's degree completion, anything else would muddle and distort the guidance this story offers.

## Confirming the Guidance: the Logistic Story

The preferred statistical technique for telling this story involves logistic regression. To put the difference between the Ordinary Least Squares linear regression models and the logistic model too simply, the former seeks to minimize the errors in the measurement of an event, while the latter seeks to estimate the maximum likelihood of an event.

Table 39 takes the first five equations from the OLS story--that is, up to the point at which the story reached a plateau--and presents them in a logistic form. By displaying the equations together, we can both observe the changes in the position of the independent variables from step to step, and assess whether the sequence of models provides an increasingly convincing explanation ("goodness-of-fit"). What do we see in Table 39? First, there are five statistics at the bottom of page 81 that help us judge the explanatory power of the sequence of models and that tell us which stages make the greatest difference. To put it simply, everything that is supposed to happen in a series of logistic models such as this (Cabrera, 1994), happens: the G2 declines; the ratio of G2 to the degrees of freedom declines; the Chi-square rises in the face of an increasing number of variables. All of this says that the logistic analysis becomes increasingly efficient. The relative changes of these statistics also tell us that the Financial Aid model does not add that much to the potential guidances in our tool box, and that the Attendance Pattern model adds the most (the same conclusions reached in the OLS sequence).

To appreciate the convergences and differences between logistic and linear models of this story, one must watch the changes in the odds ratios for a given variable and the statistical significance of its parameter estimates. For example, the variable indicating that the first college attended was a 4-year institution exhibits seemingly impressive odds ratios in all three of its appearances, but only in one of those cases is the estimate statistically significant, and even in that case, barely. The linear version did not even allow the "First Was 4-Year" variable into its models. On the other hand, in the logistic version, SES evidences more modest odds ratios, but its estimates are statistically significant in all five appearances.

#### Table 39.–Logistic version of 5 sequential regression models with bachelor's degree attainment by age 30 as the outcome. High School & Beyond/Sophomore Cohort, 1982-1993

| Variable | 1: Background Estimate | Odds Ratio | 2: Financial Aid Estimate | Odds Ratio | 3: Attendance Patterns Estimate | Odds Ratio | 4: 1st Year Performance Estimate | Odds Ratio | 5: Extending Performance Estimate | Odds Ratio |
|---|---|---|---|---|---|---|---|---|---|---|
| Intercept | -3.174 | | -3.248 | | -4.212 | | -3.609 | | -3.548 | |
| ACRES | 0.678* | 1.97 | 0.628* | 1.87 | 0.544* | 1.72 | 0.403* | 1.50 | 0.389* | 1.48 |
| Children | -1.968* | 0.14 | -1.924* | 0.15 | -1.469** | 0.23 | -1.517** | 0.22 | -1.480** | 0.23 |
| SES | 0.202** | 1.22 | 0.253** | 1.29 | 0.194*** | 1.21 | 0.162++ | 1.18 | 0.161++ | 1.18 |
| Anticipations | 0.423*** | 1.53 | 0.386+ | 1.47 | 0.226 | 1.25 | 0.322 | 1.38 | 0.478+ | 1.61 |
| Race | -0.398++ | 0.67 | -0.461++ | 0.63 | -0.461++ | 0.63 | -0.325 | 0.72 | -0.069 | 0.93 |
| Sex | -0.234 | 0.79 | -0.201 | 0.82 | -0.150 | 0.86 | -0.091 | 0.91 | 0.006 | 1.01 |
| Grant-in-Aid | | | 0.370+ | 1.44 | 0.405+ | 1.50 | 0.215 | 1.24 | 0.201 | 1.22 |
| Loan | | | 0.134 | 1.11 | 0.056 | 1.06 | 0.059 | 1.06 | 0.044 | 1.05 |
| STUWORK | | | 0.396++ | 1.49 | 0.259 | 1.30 | 0.200 | 1.22 | 0.228 | 1.26 |
| # of Schools | | | | | -0.009 | 0.99 | 0.015 | 1.02 | -0.077 | 0.93 |
| First Was 4-Yr | | | | | 0.403 | 1.50 | 0.603++ | 1.83 | 0.441 | 1.55 |
| First Was Doctoral | | | | | 0.153 | 1.17 | 0.187 | 1.21 | 0.171 | 1.19 |
| First Was Select | | | | | 0.693+ | 2.00 | 0.600++ | 1.82 | 0.561 | 1.75 |
| OUTSYS | | | | | -0.652 | 0.52 | -0.564 | 0.56 | -0.782 | 0.46 |
| Transfer | | | | | 1.349* | 3.85 | 1.449* | 4.26 | 1.255* | 3.51 |
| No Return | | | | | -0.691** | 0.50 | -0.634** | 0.53 | -0.522+ | 0.59 |
| No Delay | | | | | 0.262 | 1.30 | 0.166 | 1.18 | 0.214 | 1.24 |
| No Stop | | | | | 1.553* | 4.73 | 1.408* | 4.09 | 1.349* | 3.85 |
| 1st Yr Grades | | | | | | | 0.726* | 2.07 | 1.104* | 3.02 |
| 1st Yr Low Creds | | | | | | | -0.887* | 0.41 | -0.927** | 0.40 |
| 1st Yr Cred Ratio | | | | | | | -0.573++ | 0.56 | -0.501 | 0.61 |
| DWI Index# | | | | | | | | | -1.558* | 0.21 |
| Grade Trend | | | | | | | | | 1.167* | 3.21 |
| Remedial Problem | | | | | | | | | -0.137 | 0.87 |
| G2 | 5386.5 | | 5304.6 | | 4106.7 | | 3643.1 | | 3300.3 | |
| df | 4929 | | 4926 | | 4419 | | 4237 | | 4234 | |
| G2/df | 1.0928 | | 1.0768 | | 0.9293 | | 0.8598 | | 0.7795 | |
| X2 (df) | 35.61 (6) | | 36.74 (9) | | 41.15 (18) | | 41.64 (21) | | 45.57 (24) | |
| p | .001 | | .001 | | .01 | | .01 | | .01 | |

NOTES: (1) The universe for each stage in the model is the same as that used for the parallel steps in the Ordinary Least Squares regressions above. (2) Standard errors used in the determination of statistical significance of the Beta estimates are adjusted by the same design effects as in the OLS versions. (3) Keys to significance levels: *=.001 **=.01 ***=.02 +=.05 ++=.10. (4) #DWI Index=Drops, Withdrawals and Incompletes. SOURCE: National Center for Education Statistics: High School & Beyond/Sophomore cohort. NCES CD#98-135.
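The goodness-of-fit pattern described for table 39 can be checked directly from the printed G2 and degrees of freedom. The sketch below recomputes the G2/df ratio for each of the five stages and confirms that it declines at every step; the stage labels and values are copied from the table, while the function name is hypothetical.

```python
# (stage, G2, degrees of freedom), copied from the bottom of table 39
STAGES = [
    ("1: Background", 5386.5, 4929),
    ("2: Financial Aid", 5304.6, 4926),
    ("3: Attendance Patterns", 4106.7, 4419),
    ("4: 1st Year Performance", 3643.1, 4237),
    ("5: Extending Performance", 3300.3, 4234),
]


def g2_per_df(g2: float, df: int) -> float:
    """Ratio of the G2 statistic to its degrees of freedom."""
    return g2 / df


ratios = [g2_per_df(g2, df) for _, g2, df in STAGES]
for (label, _, _), ratio in zip(STAGES, ratios):
    print(f"{label:26s} G2/df = {ratio:.4f}")

# Step-to-step decline; the largest drop identifies the stage that adds most.
drops = [earlier - later for earlier, later in zip(ratios, ratios[1:])]
print("largest decline entering stage", drops.index(max(drops)) + 2)
```

The largest single decline occurs on entering stage 3, which agrees with the observation that the Attendance Pattern model adds the most and the Financial Aid model the least.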

The logistic story, unlike the linear story, truly disentangles the Transfer and No Return variables. There is a dramatic divergence between the two, and Transfer, in particular, turns out to be much stronger in the "maximum likelihood" approach to bachelor's degree completion than it is in the linear model. The Transfer variable is very distinct in this account. It does not mean merely that you attended both a 2-year and a 4-year college, but rather that you started at the 2-year, earned more than 10 credits, then moved to the 4-year and earned more than 10 credits there, too. This definition truly sorts people moving toward a bachelor's degree from those multi-institutional attendees engaged in less direct routes. It also filters out the "early transfers," those who did not wait for the community college to provide a sufficient comfort level in higher education. As noted above (p. 52) those who jump ship early to the 4-year college are much less likely to complete a bachelor's degree. The odds ratios for Transfer so defined are very high: 3.85:1 in the Attendance Pattern model, 4.26:1 when we add the 1st Year Performance variables, and 3.51:1 in the Extended Performance model. The only other variable in the logistic story that exceeds those odds ratios is NOSTOP, that is, continuous enrollment. The transfer focus thus becomes a critical direction in the tool box. If we know that students who meet the transfer sequence criteria succeed as well as they do, then we should guide them into that sequence instead of allowing them to leave the community college too early.
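For readers less familiar with logistic output, an odds ratio is the exponential of the parameter estimate. The sketch below applies that standard relationship to the three Transfer estimates in table 39 and recovers the odds ratios quoted above; the dictionary and function names are hypothetical.

```python
import math

# Transfer parameter estimates from table 39, by model stage
TRANSFER_ESTIMATES = {
    "3: Attendance Patterns": 1.349,
    "4: 1st Year Performance": 1.449,
    "5: Extending Performance": 1.255,
}


def odds_ratio(estimate: float) -> float:
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(estimate)


for stage, beta in TRANSFER_ESTIMATES.items():
    print(f"{stage:26s} estimate={beta:.3f} odds ratio={odds_ratio(beta):.2f}")
```

The same conversion shows why a negative estimate such as the one for the DWI Index yields an odds ratio below 1: the exponential of a negative number is a fraction.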

What else is different when we compare the logistic to the linear account? The Academic Resources variable, while statistically significant in all its appearances in the logistic model, exhibits an odds ratio decline to the point at which its power appears to be less than first year grades, overall GPA trend, Transfer, and continuous enrollment. The power is still considerable, but not as overwhelming as the linear story would lead us to believe. While this is a disappointment to the tenor of the analysis to date, the most significant drop in the odds ratio for ACRES occurs in the 1st Year Performance iteration (Stage 4), when the restricted Transfer variable enters. Otherwise, the two stories--linear and logistic--are the same, and the same 11 variables remain statistically significant in the final step of the model.
