Answers in the Tool Box: Academic Intensity, Attendance Patterns, and Bachelor's Degree Attainment
June 1999
Do the complex patterns of attendance described above have any impact on bachelor's degree completion, and, if so, which variables survive in regression analyses and how much explanatory power do they contribute to a model that also includes precollege characteristics, measures of early postsecondary performance, modes of financial aid, and satisfaction with major aspects of higher education?
We have now arrived at multivariate analysis. In this section of the study, there will be six sequential ordinary least squares (OLS) regression models:
As each configuration of variables is entered, the strong stories rise to the top in terms of contribution to the model and statistical significance, while the weak ones fade. At the end of this process, out of 29 independent variables introduced, only 11 remain. The story told by those 11 variables provides very strong guidance for improving degree completion rates for all populations, and particularly for minority and low-SES students.
A logistic regression version of the first five equations is then presented. Logistic regression is the statistician's method of choice when the outcome is a dichotomous variable such as did/did not earn a bachelor's degree. Logistic regression expresses itself in a different way than does OLS. Its objective is to identify the "maximum likelihood" of a relationship, the "probability of observing the conditions [ital mine] of success" (Cabrera, 1994, p. 227). The principal metric in which it tells its story is an odds ratio, the way in which HIGHMATH was presented above (see p. 17). In some ways, a logistic regression is like an epiphany: its results make a dramatic statement, the parameters of which are sometimes unexpected. OLS, on the other hand, seeks to minimize the difference between predicted and observed probabilities, that is, between the rational and the empirical. The result is hardly analogous to an epiphany: it is slow and unfolding (Pedhazur, 1982), and expresses its fundamental conclusion in a metric accessible to the general reader: the "percent" of variance accounted for by the model (the R^{2}). Logistic regression has an analogous metric of conclusion, the G^{2}, but it is less accessible.
The population to be considered in the multivariate equations consists only of those students in the HS&B/So who attended a 4-year college at any time, even though they might also have attended other types of schools. In a longitudinal study that continues to age 30, the mark of someone who intends to earn a bachelor's degree is actual attendance at a bachelor's degree-granting institution, not a statement. These students may have attended other types of schools, but if the dependent variable is bachelor's degree completion, we distort our understanding of what makes a difference if we include people who never really tried.
But what about those students who expected to earn a bachelor's degree, but never attended a 4-year college by age 30? Aren't they students for whom family income and SES play roles much stronger than ACRES or any of its components? Should they be included in the analysis of degree completion, and, if not, what impact will the exclusion have on both our assessment of the validity of the ACRES variable and the analysis of what makes a difference in degree attainment? The best way to confront these questions is by comparing two groups of students (1) whose referent first institution of attendance was a community college, (2) who earned more than 10 credits from the community college (that is, they were not incidental students), and (3) who expected to earn a bachelor's degree. One group ultimately attended a 4-year college; the other did not. They are of roughly equal size (a weighted N of 84k for those who did not attend a 4-year college; a weighted N of 96k for those who did). What do they look like? Table 27 sets forth some basic parameters.
Percent by SES Quintile 
High  2nd  3rd  4th  Low 
No 4Yr  21  27  23*  22  7* 
4Year  29  36  22*  9  * 
Percent by Academic Resources Quintile 

No 4Yr  11  27  37  19  6* 
4Year  32  32  27  7  3* 
Percent by Total Undergraduate Credits 
11-29  30-59  60-89  90+  
No 4Yr  32  37  22  9  
4Year  3  7  8  82  
Percent by Total Credits from CommColl 

No 4Yr  33  36*  21  10*  
4Year  16  30*  46  8* 
NOTES: (1) All column comparisons are statistically significant at p<.05 except those indicated by asterisks. (2) Bachelor's degree expectation as indicated in 12^{th} grade. (3) Rows may not add to 100% due to rounding. SOURCE: National Center for Education Statistics: High School & Beyond/Sophomore Cohort, NCES CD#98135.
A more dramatic way of illustrating what makes a difference for these two groups is through a logistic regression with attending a 4-year college as the dependent variable.
Variable  (Beta) Estimate  s.e.  t  p  Odds Ratio
Intercept  3.2456  .604  3.61
Family Income  -0.0934  .078  0.80    0.91
SES  0.4361  .120  2.43  .05  1.55
Academic Resources  0.6031  .120  3.37  .02  1.83
NOTES: (1) Weighted N=181k; (2) standard errors adjusted for design effect; (3) Design effect=1.49. SOURCE: National Center for Education Statistics: High School & Beyond/Sophomore cohort, NCES CD #98135.
People, including those who aspire to bachelor's degrees, attend community colleges for many reasons. Within the groups at issue here, family income does not play a role in whether they attend 4-year colleges as well. The odds ratio for family income is very close to 1.0 (which indicates no influence) and the parameter estimate does not meet the criterion for statistical significance at all. SES, which transcends income, is significant; but academic resources seem more significant. What do we make of this?
In both the linear and logistic model series below, SES exerts a modest but declining influence on bachelor's degree attainment as students move into postsecondary education and through their first year. The group of students who started in a community college, expected to earn a bachelor's degree, but never attended a 4-year college by age 30 are much weaker in academic resources than their peers who eventually did attend a 4-year college (table 27). They also exhibit a lower SES distribution, though not dramatically so. If we include them in the multivariate analysis, we can speculate that SES would exert the same modest but declining influence on degree completion but at a slightly higher level. For example, in the first stage of the logistic regression series, SES might carry an odds ratio of 1.26 to 1 instead of 1.22 to 1. The difference is not a compelling reason for including this group in an analysis designed to help us select the tools we will need to use in both maintaining and bringing greater equity to bachelor's degree completion rates.
So, the universe will be limited to those who attended a 4-year college at any time. This sounds simple enough. But in programming the database, there are two potential definitions of the universe of students "who attended a 4-year college at any time." By one definition, we admit only those students for whom we actually received at least one transcript from a 4-year college. By the second definition, we also include cases where one or more transcripts from 4-year colleges were requested but none were received. For the universe created by the second definition, we have all variables that were self-reported in the surveys, for example, all the financial aid modes, and even values for attendance pattern variables derived simply from the number and nature of transcripts requested. However, the small expansion group begins to drop out of the analysis when we reach the Attendance Pattern model because, without transcripts, it is impossible to determine a value for any variable based on dates, for example, continuity of enrollment and delay of entry, let alone those based on credits, grades, and course-taking^{(40)}.
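The relationship between the parameter estimates and the odds ratios in the community college logistic regression above can be checked with a few lines of arithmetic: in a logistic model, the odds ratio is the exponential of the coefficient. The sketch below is illustrative only. The function name is hypothetical, and the negative sign on the family income estimate is an assumption inferred from its odds ratio being below 1.0.

```python
import math

def odds_ratio(beta):
    """Convert a logistic regression coefficient into an odds ratio."""
    return math.exp(beta)

# Estimates as printed in the table; the minus sign on family income
# is assumed, because an odds ratio below 1.0 implies a negative estimate.
estimates = {
    "Family Income": -0.0934,
    "SES": 0.4361,
    "Academic Resources": 0.6031,
}

for name, beta in estimates.items():
    print(f"{name}: odds ratio = {odds_ratio(beta):.2f}")
```

Rounded to two places, the results match the printed odds ratios of 0.91, 1.55, and 1.83.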
The background model (before introduction of attendance, performance and other college experience variables) is offered in table 29. It consists of three standard demographic constructs: SES (in quintiles), RACE (dichotomous Black/Latino/AmerInd v. White/Asian), and SEX (male=1). It also includes a dummy variable, children, marking whether the individual became a parent at any time up to 1986 (age 22/23)^{(41)}, the composite ACRES (academic resources) to indicate the quality of the student's performance (curriculum, class rank, and test scores) in secondary school, and the sharpened construct of educational "anticipations." The dependent variable is bachelor's degree completion by age 30 in 1993.
What does this basic background model say, no matter which universe one uses? In the absence of any other information: (1) the six independent variables in the equation explain between 21 percent (Part A) and 23 percent (Part B) of the variance in long-term bachelor's degree completion among students in the HS&B/So who attended at least one 4-year college at any time up to age 30; (2) of the six independent variables in the equation, that which carries the student's high school background, ACRES, contributes most to the explanation; and (3) of the remaining five independent variables, only the fact of becoming a parent prior to age 22^{(42)} and parents' socioeconomic status contribute anything else of significance to the explanation.
This is the most basic of common sense matters. If the universe had not been limited to students who attended 4year colleges at some time, ACRES would have contributed slightly more to the explanation of variance in degree completion. When the outcome is degree completion, who you are is less important than the amount and quality of the time you invest in activities that move you toward that goal.
While race and sex fall out of the model very early, no matter which universe of 4-year college students we use, the "anticipations" variable stays in the equation, however marginally. It stays in the equation only because it was set up as a dichotomy with the positive value confined to "bachelor's consistent" expectations, as appropriate to students who attended 4-year colleges. It is for this reason, along with the fact that the next set of variables to be introduced (those reflecting financial aid and student employment while enrolled) do not rely on transcript data, that the universe of Part B will be carried forward.
Universe: All students for whom a transcript from a 4-year college was received, who received a high school diploma or equivalent prior to 1988, and who evidenced positive values for all variables in the model. N=4,765. Weighted N=1.131M. Simple s.e.=.702; Taylor series s.e.=1.083; Design effect=1.54.
Variable  Parameter Estimate  Adj. s.e.  t  p  Contribution to R^{2}
INTERCEPT  1.782346  .1159  9.98  
ACRES  0.133301  .0094  9.21  .001  .1651 
Children  -0.340264  .0433  5.10  .01  .0282 
SES Quintile  0.036144  .0083  2.83  .05  .0122 
Anticipations  0.071447  .0218  2.13  .10  .0039 
Race  0.073809  .0300  1.60  *  .0026 
Sex  0.040801  .0193  1.37  *  .0017 
*Dropped from model.
R-Sq.  .2137
Adj. R-Sq.  .2127
Universe: All students for whom the evidence confirms 4-year college attendance at any time (whether transcripts were received or not), who received a high school diploma or equivalent prior to 1988, and who evidenced positive values for all variables in the model. N=4,943. Weighted N=1.179M. Simple s.e.=.697; Taylor series s.e.=1.082; Design effect=1.55.
Variable  Parameter Estimate  Adj. s.e.  t  p  Contribution to R^{2}
INTERCEPT  1.722264  .1112  9.99  
ACRES  0.136088  .0091  9.65  .001  .1829 
Children  -0.321327  .0410  5.06  .01  .0277 
SES Quintile  0.038911  .0081  3.10  .02  .0142 
Anticipations  0.086950  .0216  2.60  .10  .0057 
Race  0.077506  .0299  1.68  *  .0024 
Sex  0.040627  .0193  1.37  *  .0019 
*Dropped from model.
R-Sq.  .2347
Adj. R-Sq.  .2338
NOTES: (1) Standard errors are adjusted in accordance with design effects of the stratified sample used in High School & Beyond. See technical appendix and Skinner, Holt and Smith (1989). (2) Significance level of t (p) based on a two-tailed test.
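The design effects quoted in the universe statements for table 29 can be reproduced from the two standard errors printed alongside them. The sketch below is an observation about the printed figures, not a description of the procedure in the technical appendix. It assumes that the design effect is the ratio of the Taylor series standard error to the simple standard error, and that the printed t values equal the estimate divided by the listed standard error and then by the rounded design effect. Both function names are hypothetical.

```python
def design_effect(taylor_se, simple_se):
    """Ratio of the Taylor series s.e. to the simple random sample s.e."""
    return taylor_se / simple_se

def deflated_t(estimate, listed_se, deff):
    """t statistic after inflating the listed s.e. by the design effect.

    Assumption: this is only a pattern consistent with the printed
    tables, not a documented formula from the study.
    """
    return estimate / (listed_se * deff)

# Part A and Part B universes of table 29
print(round(design_effect(1.083, 0.702), 2))        # Part A design effect
print(round(design_effect(1.082, 0.697), 2))        # Part B design effect

# ACRES rows, using the rounded design effects as printed
print(round(deflated_t(0.133301, 0.0094, 1.54), 2))  # Part A
print(round(deflated_t(0.136088, 0.0091, 1.55), 2))  # Part B
```

Under these assumptions the sketch returns the printed design effects (1.54 and 1.55) and the printed t values for ACRES (9.21 and 9.65).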
It is not surprising that race and sex fall out of the model, no matter how generous a statistical selection criterion was used^{(43)}. If these variables failed to meet statistical selection criteria at a stage of student history prior to college attendance and in the course of constructing ACRES, their chances of playing any role after the student group has been winnowed to 4-year college attendees are dim indeed. Given the nature and intensity of current and continuing concern with the educational attainment of minority students, however, this study keeps race in the models until it fails to register the slightest impact. It is retained in contrast to other features of students' educational histories that are subject to the kind of change (race is not subject to change) that yields degree completion.
If we subjected the specifications for Part B of the Background Model to a logistic regression, would the relationships be the same? Table 30 presents a portrait that, in a few respects, is slightly different from that of the OLS accounting. ACRES is again the strongest of the positive components and becoming a parent by age 22 is the strongest of the negative components. The estimates for ACRES say that for each step up the quintile ladder of that variable, the odds of earning a bachelor's degree increase by nearly 100 percent. The estimates for parenthood say that having children reduces the odds of earning a degree by 86 percent (1 minus the odds ratio of 0.14). Degree anticipations and SES are also positive contributors (as they were in the OLS version), but in the logistic version both the parameter estimate and the odds ratio favor degree anticipations over SES. The explanation is more technical than substantive. Unlike the OLS version, the logistic account keeps race in the model, however tenuously. This result lends modest support to my decision to carry race into Stage 2.
Universe: All students for whom the evidence confirms 4-year college attendance at any time (whether transcripts were received or not), who received a high school diploma or equivalent prior to 1988, and who evidenced positive values for all variables in the model. N=4,943. Weighted N=1.179M. Simple s.e.=.697; Taylor series s.e.=1.082; Design effect=1.55.
Variable  Parameter Estimate  Adj. s.e.  t  p  Odds Ratio
INTERCEPT  3.174  0.691  2.96
Academic Resources  0.678  0.051  8.58  .001  1.97
Children  -1.968  0.283  4.49  .001  0.14
SES  0.202  0.043  3.03  .01  1.22
Anticipations  0.423  0.112  2.44  .02  1.53
Race  -0.398  0.156  1.65  .10  0.67
Sex  -0.234  0.104  1.08    0.79
NOTES: (1) Standard errors are adjusted in accordance with design effects of the stratified sample used in High School & Beyond. See technical appendix and Skinner, Holt and Smith (1989). (2) Significance level of t (p) based on a two-tailed test.
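The verbal readings of table 30 ("nearly 100 percent," "86 percent") are simple transformations of the odds ratios. As a minimal sketch, with hypothetical function names: the percent change in odds is the odds ratio minus one, and, assuming the model treats the quintile scale as linear, a move of several steps multiplies the odds by the ratio raised to that number of steps.

```python
def pct_change_in_odds(odds_ratio):
    """Percent change in the odds associated with a one-unit increase."""
    return (odds_ratio - 1.0) * 100.0

def odds_multiplier(odds_ratio, steps):
    """Multiplier on the odds for a move of `steps` units.

    Assumes the variable enters the logistic model linearly.
    """
    return odds_ratio ** steps

print(round(pct_change_in_odds(1.97)))     # one step up the ACRES quintile ladder
print(round(pct_change_in_odds(0.14)))     # becoming a parent by age 22
print(round(odds_multiplier(1.97, 2), 2))  # two steps up the ACRES ladder
```

The first two values correspond to the 97 percent increase and 86 percent reduction described in the text.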
Model Iteration, Stage 2: Modes of Financial Aid and Student Work
Where do we turn to explore and strengthen the explanation of degree completion? There are four sets of variables to which one's attention is drawn for this task: financial aid and the extent to which a student worked while enrolled, attendance patterns, satisfaction with various aspects of postsecondary education, and first-year performance. Because financial aid has been shown to be critical to initial enrollment decisions (Jackson, 1988) and first-year retention (Stampen and Cabrera, 1986), our story regards it as antecedent to attendance patterns. Financial aid and student work thus enter the model at this point.
The HS&B/So limits our utilization of financial aid variables with confidence. The database relies principally on student self-reports concerning their financing of college education, and students usually don't know where their financial aid is coming from or how much is at issue. For example, the data set includes an unobtrusive Pell Grant file, and I found about 6000 (weighted) students in that file who never claimed, in the surveys, to have received financial aid. Pell Grants, of course, apply to only a fraction of college students; and third-party files for other programs (Stafford Loans, State Supplementary Grants, and others) were not available. A variable was created that simply marked whether a student had ever received a scholarship or grant of any type between 1982 and 1986; and a parallel variable was formulated for loans. With the exception of describing combinations of types of financial aid (including work-study) offered to students (St. John and Noell, 1989), that is the extent of confidence in financial aid analyses for the HS&B/So (and the NLS-72 as well).
The third component of "financial aid" was derived from student responses to a loop of questions concerning the financing of postsecondary education, year-by-year, from 1982 through 1986, with a focus on work. Neither the dollar amount nor the number of hours worked is in question here; rather, the fact, sources, timing, and purpose of work. For each year students indicated that they worked to pay some of the costs of their postsecondary education, a dummy variable was constructed. A positive value was registered if the sources of the student's earnings were Work-Study, Co-Op placements, Teaching/Research Assistantships, and/or "other earnings while in school," i.e. work activities concurrent with enrollment. These activities are more likely to take place on campus than off-campus. Savings from work prior to entering higher education or during periods of stop-out, as well as earnings from summer jobs, were excluded. A composite variable was constructed to indicate whether the student "worked" (as defined by these boundaries) in each of the four academic years in question (1982-1986), and then dichotomized to mark whether, in two or more of the four academic years covered by the loop of financing questions, the student worked concurrently with enrollment and in order to cover costs of education. The variable is called STUWORK.
The literature on this issue (e.g. Cuccaro-Alamin and Choy, 1998; Horn, 1998) is usually based on the number of hours worked, a datum not available in the HS&B/So. The consensus seems to be that a modest amount of work while enrolled enhances retention, that work on campus certainly intensifies student involvement and contributes to completion (Astin, 1993), but that an excess of work (particularly off-campus) is negatively related to persistence. In the Beginning Postsecondary Students longitudinal studies, NCES posed a rather telling question that conditions the way one must judge student reports of hours worked in relation to persistence and completion. Respondents are asked whether they see themselves primarily as students who happen to be working or employees who happen to be going to school. Table 31, produced from the BPS-90 data, confining itself to the first year of enrollment and to a population younger than 24 (to render the data parallel to the HS&B/So universe), provides guidance for interpreting hours worked under that dichotomy. The average hours worked for those who considered themselves students first seems high at 25.9, but it is weighted to the upside by those claiming 40+ hours of work per week and by students attending trade schools.
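The construction of STUWORK described above can be sketched as a two-step rule: flag each academic year in which at least one qualifying earnings source was reported, then set the variable to 1 when two or more of the four years are flagged. The source labels and function names below are hypothetical stand-ins, not the survey's actual item codes.

```python
# Hypothetical labels for the earnings sources that count as work
# concurrent with enrollment; savings and summer jobs are excluded.
QUALIFYING_SOURCES = {
    "work_study",
    "co_op",
    "assistantship",
    "other_earnings_in_school",
}

def worked_in_year(sources):
    """True if any reported source for the year is a qualifying one."""
    return bool(QUALIFYING_SOURCES & set(sources))

def stuwork(sources_by_year):
    """Dichotomy: 1 if the student worked in two or more academic years."""
    years_worked = sum(worked_in_year(sources) for sources in sources_by_year)
    return 1 if years_worked >= 2 else 0

# Four academic years of reported sources for one illustrative student
student = [["work_study"], ["summer_job"], ["co_op"], []]
print(stuwork(student))
```

The illustrative student qualifies in two of the four years, so the function returns 1; a student whose only earnings were summer jobs or prior savings would receive 0.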
  Primarily a Student (76% of all)  Primarily an Employee (24% of all)
Proportion employed during >80 percent of months enrolled  52.7% (1.4)  66.0% (3.9)
Proportion of students by range of work hrs./week while enrolled
1-14  32.4% (1.3)  21.3% (2.0)
15-21  18.0 (1.1)  17.5 (2.2)
22-30  22.8 (1.2)  19.7 (2.1)
31+  26.8 (1.3)  41.5 (2.7)
Total:  100.0%  100.0%
Average work hrs./week while enrolled  25.9 (0.3)  29.9 (0.6)
Note: Standard errors are in parentheses. Source: National Center for Education Statistics: Beginning Postsecondary Students, 1989-94: Data Analysis System.
The contrasts based on primary role would be much greater if the universe had not been confined to traditional-age students (17-24). But even among traditional-age students, one out of four considers himself/herself an employee first, a judgment confirmed by higher average work week hours, proportion employed during most of the months in which they were enrolled, and an obviously higher percentage working full-time (over 30 hours/week).
Because the postsecondary history of the HS&B/So can be as long as 11 years, the role of work in models of bachelor's degree completion by the end of that period may be attenuated. People change status. They become independent; they become employees who happen to be going to school. Combine these changes with growing and swirling patterns of multi-institutional attendance, and with Cuccaro-Alamin and Choy's observation that the average workweek of students does not change over time, and one doubts that work will emerge as a significant factor in explaining bachelor's degree attainment.
Universe: All students for whom the evidence confirms 4-year college attendance at any time, who received a high school diploma or equivalent prior to 1988, and who evidence positive values for the variables in the Background Model. N=4,943; Weighted N=1.179M. Simple s.e.=.697; Taylor series s.e.=1.082; Design effect=1.55.
Variable  Parameter Estimate  Adj. s.e.  t  p  Contribution to R^{2}
INTERCEPT  1.397895  .1139  7.92  
Academic Resources  0.123885  .0093  8.60  .001  .1829 
Children  -0.300230  .0407  4.76  .01  .0277 
SES Quintile  0.046395  .0082  3.65  .01  .0142 
STUWORK  0.075257  .0209  2.32  .05  .0081 
Anticipations  0.079826  .0215  2.40  .05  .0052 
Grant-in-Aid  0.067087  .0212  2.04  .10  .0042 
Race  -0.083451  .0292  1.84  .10  .0030 
Loan  0.019876  .0209  0.61  *  .0003 
*Dropped from model.
R-Sq.  .2456
Adj. R-Sq.  .2443
NOTES: (1) Standard errors are adjusted in accordance with design effects of the stratified sample used in High School & Beyond. See technical appendix and Skinner, Holt and Smith (1989). (2) Significance level of t (p) based on a two-tailed test.
Indeed, in their initial appearance in table 32, it is obvious that the financial aid/work variables do not strengthen the explanatory power of the model appreciably: the adjusted R^{2} increases by barely one percentage point (.2338 to .2443). Loans (at least the way the data base forces us to define loans) are a very weak addition, while STUWORK is the strongest of the three new variables in terms of its statistical properties in the model. Whether these relationships will hold in the next iteration of the model depends on their interactions with the new variables. Remember that all of the financial aid/work data were derived from student responses to questionnaire items in 1984 and 1986, while the censoring date for bachelor's degree completion is 1993. It is very possible that students received grants and worked while enrolled as undergraduates after 1986, but we don't know. When the calendar for independent variables is truncated and one introduces long-term attendance pattern variables, we might witness changes in the strength of Grant-in-Aid and STUWORK.
Stage 2 of the model (table 32) demonstrates a virtue of carrying forward race, even though the variable did not meet the statistical criteria for retention in the OLS version of the background model. Race is obviously a marginal contributor. Its effects on degree completion for the population of 4-year college students, while slightly negative, are nowhere near the negative effects of having a child by age 22. The more variables in the model, though, the greater the degrees of freedom in statistical analyses, hence the lower the threshold t statistic for inclusion.
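The adjusted R^{2} figures compared above can be related to the unadjusted values with the standard adjustment formula. The sketch below is a check under stated assumptions rather than the study's documented computation: it assumes n is the unweighted N given in each universe statement and k is the number of independent variables listed in the table, including those marked as dropped. The function name is hypothetical.

```python
def adjusted_r2(r2, n, k):
    """Standard adjusted R-squared for n cases and k predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# Table 29, Part A and Part B (six independent variables each)
print(round(adjusted_r2(0.2137, 4765, 6), 4))
print(round(adjusted_r2(0.2347, 4943, 6), 4))

# Gain in adjusted R-squared from the background model (Part B)
# to Stage 2, in percentage points
print(round((0.2443 - 0.2338) * 100, 2))
```

Under these assumptions the formula returns .2127 and .2338, the adjusted values printed in table 29. Applied to the rounded Stage 2 R^{2} of .2456 with eight variables, it gives .2444 rather than the printed .2443, a gap within the rounding of the inputs. The Stage 2 gain is about 1.05 percentage points.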
Model Iteration, Stage 3: Attendance Patterns
The third stage of model development is presented in table 33. This is the first iteration that follows the analysis of attendance patterns, and it changes (and shrinks) the universe. Seven variables were carried forward from the previous iteration. Nine others were introduced here. One of the new variables, number of schools attended, did not meet the p<.2 selection criterion at all, even in two different formulations: a dichotomous form (one school only v. more than one), and a trichotomous form (one, two, and more than two). All the data we observed in Part III above suggested that this would happen: in an age of multi-institutional attendance, the number of schools attended simply will not be related to degree completion.
The attendance variables brought into play in this iteration can be parsed in three groupings:
(1) Number and order of institutions attended.
NUMSCHL: a simple dichotomy—one school v. more than one.
TRANSFER: a classic pattern in which the student attended a community college first, earned more than 10 credits from the community college, and subsequently earned more than 10 credits from a 4-year college (the "early transfers" are thus excluded); and
NO RETURN: a pattern in which the student attended more than one school, but did not return to the first institution of attendance.
(2) Characteristics of the "referent" first institution of attendance.
FIRST4: the first institution of attendance was a 4-year college;
DOCTORAL: the first institution was a doctoral degree-granting institution;
SELECTIVITY: the first institution was selective/highly selective v. nonselective/open door.
(3) Other features of attendance.
NODELAY: the first date of attendance at the first institution occurred 10 or fewer months after high school graduation;
OUTSYS: at some time during his/her undergraduate career, the student attended an institution other than a traditional 4-year or 2-year college; and
NOSTOP: the student was continuously enrolled as an undergraduate.
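Several of the definitions listed above are rules that can be written down directly. The sketch below is a simplified illustration, not the study's actual transcript coding. The record structure, field names, and function names are hypothetical; months are counted as whole calendar months; the positive value of NUMSCHL is assumed to mark more than one school; and community college credits are taken from the first institution only.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Spell:
    """One institution on a student's record (hypothetical structure)."""
    kind: str         # "2-year", "4-year", or "other"
    first_date: date  # first date of attendance at this institution
    credits: float    # total credits earned there

def numschl(spells):
    """1 if the student attended more than one school, else 0 (assumed coding)."""
    return 1 if len(spells) > 1 else 0

def transfer(spells):
    """1 for the classic pattern: a community college first with more than
    10 credits, then more than 10 credits from a 4-year college."""
    ordered = sorted(spells, key=lambda s: s.first_date)
    first = ordered[0]
    if first.kind != "2-year" or first.credits <= 10:
        return 0
    later = any(s.kind == "4-year" and s.credits > 10 for s in ordered[1:])
    return 1 if later else 0

def nodelay(hs_graduation, first_attendance):
    """1 if first attendance came 10 or fewer months after graduation."""
    months = ((first_attendance.year - hs_graduation.year) * 12
              + (first_attendance.month - hs_graduation.month))
    return 1 if months <= 10 else 0

record = [Spell("2-year", date(1982, 9, 1), 30),
          Spell("4-year", date(1984, 9, 1), 60)]
print(numschl(record), transfer(record), nodelay(date(1982, 6, 1), date(1982, 9, 1)))
```

The illustrative record attends two schools, fits the classic transfer pattern, and enters within three months of graduation, so all three variables take the value 1.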
There are some very dramatic changes in the model with the introduction of attendance pattern variables. The first is that the explanatory power of the model leaps by a factor of nearly 50 percent, from an R^{2} of .2456 to one of .3623. This significant advance occurs not as a result of the characteristics of the "referent" institution of attendance, rather as a byproduct of student movement among institutions and the temporal dimensions of enrollment.
The most notable changes in this iteration of the model are the dominance of continuous enrollment (NOSTOP), the contracting strength of the precollegiate variables (ACRES and SES Quintile), and the superficially contradictory positions of No Return and Transfer. This apparent contradiction is fairly easy to explain. In the No-Return variable, the student attends more than one institution and does NOT return to the first, and this behavior has a negative relationship to degree completion. As defined, Transfer involves a No-Return-type situation, but with a specific sequence and criteria (the student must earn more than 10 credits in a 2-year college before earning more than 10 credits in a 4-year college). Transfer has a positive relationship to degree completion. Students in a classic transfer pattern are moving toward a bachelor's degree. Students in No-Return positions include reverse transfers who move away from the path toward the bachelor's degree.
On the other hand, selectivity of the referent first institution of attendance has a significant role to play, even though its statistical position is tenuous. Covariance analysis provides some clues as to the lower standing of selectivity in the model: it is entangled with the precollegiate variables ACRES and Anticipations, the former being strong enough to reduce the effects of selectivity. This is another case of common sense: one assumes that most students who start out in selective or highly selective institutions take an academically intense curriculum in high school, perform decently (if not very well) on SATs and ACTs, stand toward the top of their classes, and are committed to earning at least bachelor's degrees. Among other place-related variables, starting in a doctoral degree-granting institution and stepping outside the secular higher education system were so weak as to fall below minimum significance.
Universe: All students who attended a 4-year college at any time, whose transcript records were complete, and who evidenced positive values for all variables in the model. N=4,538. Weighted N=1.079M. Simple s.e.=.714; Taylor series s.e.=1.087; Design effect=1.52.
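The "nearly 50 percent" figure is a relative change in R^{2}, which is distinct from the absolute gain in percentage points of variance explained. A minimal sketch of both calculations, using the two values quoted above and a hypothetical function name:

```python
def relative_change(old, new):
    """Change from old to new, expressed as a fraction of old."""
    return (new - old) / old

r2_stage2 = 0.2456  # value quoted for the Stage 2 model
r2_stage3 = 0.3623  # value quoted for the Stage 3 model

# Relative gain, in percent of the Stage 2 value
print(round(relative_change(r2_stage2, r2_stage3) * 100, 1))

# Absolute gain, in percentage points of variance explained
print(round((r2_stage3 - r2_stage2) * 100, 2))
```

The relative gain is about 47.5 percent, while the absolute gain is about 11.7 percentage points of variance explained.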
Variable  Parameter Estimate  Adj. s.e.  t  p  Contribution to R^{2}
INTERCEPT  0.610047  .1910  2.10  
NOSTOP  0.323020  .0204  10.42  .001  .1994 
Academic Resources  0.091391  .0090  6.68  .001  .0989 
Children  -0.215601  .0413  3.43  .01  .0135 
Transfer  0.231072  .0313  4.86  .001  .0115 
No Return  -0.106825  .0215  3.27  .01  .0106 
SES Quintile  0.030959  .0078  2.61  .02  .0091 
Grant-in-Aid  0.064003  .0185  2.28  .05  .0053 
Anticipations  0.050385  .0206  1.61  *  .0036 
Selectivity  0.082775  .0259  2.10  .05  .0031 
Race  -0.081385  .0276  1.94  .10  .0028 
STUWORK  0.051003  .0191  1.76  .10  .0023 
FIRST4  0.063635  .0319  1.31  *  .0023 
OUTSYS  0.080847  .0530  1.00  *  .0008 
No Delay  0.047080  .0294  1.05  *  .0008 
DOCTORAL  0.025436  .0204  0.82  *  .0005 
*Dropped from model.
R-Sq.  .3644
Adj. R-Sq.  .3623
NOTES: (1) Standard errors are adjusted in accordance with design effects of the stratified sample used in High School & Beyond. See technical appendix. (2) Significance level of t (p) based on a two-tailed test.
Probably more shocking to conventional analysis, however, is the elimination of No-Delay and Anticipations from the model. Remember, though, just how much the definition of these variables departs from those of conventional analysis. No-Delay allows into the model students who did not graduate from high school on time but who enrolled in postsecondary education within 10 months of their graduation date, and not as a false start. These students will attenuate the effects of a traditional "delayed entry" account. While the Anticipations variable appears to contribute more to the explanatory power of the model than other marginal independent variables such as selectivity of first institution, race, and STUWORK, it did not meet the t statistic threshold criterion.
Model Iteration, Stage 4: Performance in the First "True" Year
The next group of variables to enter in this model involves various configurations of academic performance in the referent first year of attendance. The justification for focusing on this group of variables lies in a tradition of the literature emphasizing the critical role of freshman year performance in retention. The literature, though, unfortunately relies on student self-reports of grades, and without reference to credits attempted or earned, and is not very rigorous concerning what it means by the "freshman year" (e.g. Kanoy, Wester and Latta, 1989). There are exceptions (e.g. Smith, 1992), but they are rare.
The "referent first year" performance variables used in this study are:
Table 34 brings these variables into the model. Compared to the population in the previous iteration, we lose a weighted N of about 50,000 students. We lose some of them because as soon as the independent variables are based on grades, credits, and a distinct time period, we can no longer include students with missing transcripts, even if we know a great deal about the institutions of those missing documents. We also lose those students whose entire first "true" year of attendance was consumed with noncredit, no grade remedial courses. Calculations for any of the three first year attendance variables for these obviously weaker students are impossible under those conditions.
The loss of these students inevitably skews the model. Weaker students don't finish degrees. The large group that remains in the equation will thus exhibit less variance in degree completion. The new contracted universe thus slows down the increase in the explanatory power of the model: the adjusted R^{2} moves by less than one percent (from .3623 in the stage 3 iteration to .3696).
The freshman-year performance variables have a significant impact on both the composition of the model and the relative weights of the remaining attendance pattern variables. Selectivity of first institution of attendance, financial aid in the form of scholarships or grants-in-aid, and, finally, race, fall out of the model altogether. Race falls out of the model for the same reason that the contribution of SES declines even more from the Attendance Pattern iteration (stage 3): as one moves across the college access line, across the 4-year college attendance line, and into course-taking and academic performance, demographic variables are less and less important. What you do becomes more and more important than where you came from, though the effects of SES will never wholly be washed away. The last of the characteristics of first institution of attendance to remain in the model, selectivity, also falls away for an analogous reason: what you do is more important than where you are.
But the case of Grant-in-Aid calls for explanation. In the Attendance Pattern iteration (table 33), its contribution to the explanatory factor (the R^{2}) was larger than that of STUWORK. Now the positions are reversed, and the student work variable survives, while the scholarship variable does not. Why? The answer is a byproduct of the way STUWORK was defined as a dichotomous variable: the positive value was assigned only when the student worked while enrolled in college for more than one of the first four years following scheduled high school graduation in 1982, whereas a positive value was assigned to Grant-in-Aid if the student received a grant-in-aid or scholarship in any one year during the same period. Persistence is thus implicit in STUWORK, not in Grant-in-Aid.
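The asymmetry between the two definitions can be made concrete with a short sketch. The function names and the input format (one boolean per year for the first four years after scheduled high school graduation) are hypothetical, chosen only for illustration; the thresholds follow the definitions in the paragraph above.

```python
def stuwork(worked_while_enrolled):
    """Return 1 if the student worked while enrolled in MORE THAN ONE of the
    four years, else 0. Input: four booleans, one per year (assumed format)."""
    return int(sum(bool(y) for y in worked_while_enrolled) > 1)


def grant_in_aid(received_grant):
    """Return 1 if the student received a grant or scholarship in ANY one of
    the four years, else 0. Input: four booleans, one per year (assumed format)."""
    return int(any(received_grant))


# A student present in only one year can be positive on the grant variable
# but can never be positive on the work variable.
one_year_only = [True, False, False, False]
print(stuwork(one_year_only), grant_in_aid(one_year_only))
```

Under these definitions a positive STUWORK value requires at least two years of enrollment, which is why persistence is built into it and not into the grant variable.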
The most significant change between the Attendance Pattern and First Year Performance models, though, is that the "academic resources" (ACRES) variable comes back to the top of the list, changing places with continuous enrollment (NOSTOP). The fact that the academic resources brought forward from secondary school come to play such a robust role in the model only confirms the power of true first-year performance: strong academic resources provide students with momentum into their college years, whenever those years start. Of the other "family" background variables, the contribution of "children" has been unaffected in any of the iterations of this model, indicating that if one starts a family at a young age, one's chances of completing a bachelor's degree by age 30 are negatively affected no matter what one does (see Waite and Moore, 1978).
Universe: All students who attended a 4-year college at any time, whose transcript records were complete, and who evidenced positive values for all variables in the model. N=4,264. Weighted N=1.029M. Simple s.e.=.723; Taylor series s.e.=1.079; Design effect=1.49.
Variable | Parameter Estimate | Adj. s.e. | t | p | Contribution to R^{2}
INTERCEPT | 1.391811 | .1547 | 6.04 | |
Academic Resources | 0.084774 | .0087 | 6.54 | .001 | .1668
NOSTOP | 0.237503 | .0213 | 7.48 | .001 | .1017
Low Credits | 0.183009 | .0268 | 4.58 | .001 | .0351
Freshman GPA | 0.096158 | .0190 | 3.40 | .01 | .0154
Children | 0.250517 | .0425 | 3.96 | .01 | .0152
Transfer | 0.158913 | .0238 | 4.48 | .001 | .0099
NoReturn | 0.109283 | .0211 | 3.48 | .01 | .0096
SES Quintile | 0.033787 | .0075 | 3.02 | .02 | .0076
Credit Ratio | 0.097777 | .0316 | 2.08 | .10 | .0036
STUWORK | 0.047759 | .0184 | 1.74 | .10 | .0026
Selective | 0.062054 | .0251 | 1.66 | * | .0019
Grant-in-Aid | 0.035557 | .0184 | 1.30 | * | .0010
Race | 0.048902 | .0277 | 1.18 | * | .0009
*Dropped from model.
R^{2} = .3715; adjusted R^{2} = .3696
NOTES: (1) Standard errors are adjusted in accordance with design effects of the stratified sample used in High School & Beyond. See technical appendix. (2) Significance level of t (p) determined by a two-tailed test.
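As a check on how the design effect enters the table, the sketch below recomputes two quantities from the printed figures. It rests on two observations about the numbers, not on the author's documented procedure (for which see the technical appendix): the design effect equals the Taylor series s.e. divided by the simple s.e., and the printed t values are consistent with dividing each estimate by the listed standard error multiplied by the design effect. The function names are hypothetical.

```python
def design_effect(taylor_se, simple_se):
    """Ratio of the Taylor-series standard error to the simple one."""
    return taylor_se / simple_se


def deflated_t(estimate, listed_se, deff):
    """t value after inflating the listed standard error by the design effect
    (an assumption inferred from the printed figures)."""
    return estimate / (listed_se * deff)


# Values from the Universe line of the first-year performance table.
DEFF = round(design_effect(1.079, 0.723), 2)

# (parameter estimate, listed s.e.) for three rows of the same table.
ROWS = {
    "Academic Resources": (0.084774, 0.0087),
    "NOSTOP": (0.237503, 0.0213),
    "Low Credits": (0.183009, 0.0268),
}

T_VALUES = {name: round(deflated_t(b, se, DEFF), 2) for name, (b, se) in ROWS.items()}
print(DEFF, T_VALUES)
```

For these three rows the recomputed values match the printed design effect (1.49) and the printed t values (6.54, 7.48, 4.58).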
As suspected, the ratio of credits earned to credits attempted in the first true year of attendance might have been a stronger contributor to the model had I found a way to include remedial students who attempted no additive credits during that period. But the presence of Low Credits, reflecting both part-time students and those who earned less than 90 percent of the credits attempted, is a far better explanation for the marginal impact of the credit ratio. LowCredits is an umbrella for behaviors that hamper momentum toward degrees. As such, it will suppress variables expressing subsets of those behaviors.
If any of the variables describing referent institution of first attendance had been retained in the model through this iteration of true first-year performance, we would be invited to dig further into institutional characteristics looking for significant effects. Roughly 30 percent of the HS&B/So postsecondary students who earned more than 10 credits and attended a 4-year college at some time also attended more than one school and never returned to the first institution of attendance. For this group of students, institutional effects are very difficult to attribute, even if their institutions were campuses in the same system^{(44)}. Another 12 percent attended three or more institutions but returned to the first. These groups constitute such a large proportion of 4-year college students as to give pause to any institutional effects analyses (e.g. Velez, 1985) in national data bases.
Model Iteration, Stage 5: Continuing Effects of the First Year
But other variables that can be derived from the rich archives of NCES longitudinal studies suggest that student experience at the first institution of attendance may have continuing effects. The extensions of three performance variables encourage us to explore this hypothesis. After all, as one moves beyond the first true year of attendance, the model can, and should, be used to guide post-matriculation advisement. To fill the tool box with appropriate instruments and directions, we should transcend the boundaries of traditional prediction models.
One of these variables is the "DWI Index". When students withdraw or leave incomplete a significant percentage of their attempted courses, the behavior is bound to have a negative effect on degree completion. In table 37, DWI is a dummy variable with a cutpoint of 20 percent. That is, the variable is positive if students dropped, withdrew from, left incomplete, or repeated more than 1 out of 5 courses attempted during their undergraduate careers. The correlation of DWI with the first-year variable, LowCredits, is unsurprisingly high in a sample of this size and in a matrix with eleven other variables (.233; t=10.3). If you are withdrawing from a significant number of courses, the chances are reasonably high that your credit count will be low, no matter which year of your undergraduate career is at issue.
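The 20 percent cutpoint can be written as a one-line rule. This is a minimal sketch; the function name and its two count arguments are hypothetical, and the only element taken from the text is the threshold of more than 1 out of 5 courses attempted.

```python
def dwi_flag(dwi_courses, attempted_courses):
    """Return 1 if dropped, withdrawn, incomplete, or repeated courses make up
    MORE THAN 1 in 5 of all courses attempted, else 0."""
    if attempted_courses <= 0:
        raise ValueError("attempted_courses must be positive")
    # Integer comparison (5 * x > n) avoids floating-point trouble at exactly 20%.
    return int(5 * dwi_courses > attempted_courses)


print(dwi_flag(5, 20), dwi_flag(4, 20))
```

A student at exactly 20 percent (4 of 20 courses) is not flagged under this reading, since the text specifies "more than" 1 out of 5.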
The second variable that extends first-year history is the trend of a student's GPA. First true year grade performance, as we have seen, plays a modestly positive role in the model of degree completion. But common sense tells us that, among people who complete degrees, first-year grades will for some be lower than final GPA. Students begin to major, and grades in major courses are inevitably higher than grades in prerequisites or distribution requirements. The variable GPA Trend is a dichotomous version of the ratio of final-year GPA to first-year GPA. If that ratio fell between .95 and 1.05, there was basically no change in performance. Below .95 indicates a falling GPA; above 1.05 reflects a rising GPA. GPATrend places a positive value on the rising GPA, and a negative value on any ratio of .95 or lower.
How did the HS&B/So students who attended 4-year colleges at any time fare in each of these trends in terms of final GPA? Table 35 tells an interesting story in this matter. Among those who earn bachelor's degrees, students whose GPAs don't change that much between first and final undergraduate years sport a higher final GPA than those whose GPAs rise. But the bachelor's degree attainment rate of those with "stable" GPAs is lower than that of those whose grades rise over time. Doing better (with a rising GPA as the indicator) appears to be an indirect proxy for determination.
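The three trend groups can be sketched as a simple classifier on the ratio of final-year to first-year GPA. The function name is hypothetical, and the treatment of a ratio of exactly .95 is an assumption: the text describes both "below .95" and ".95 or lower" as the falling group, so this sketch treats .95 through 1.05 inclusive as stable.

```python
def gpa_trend(first_year_gpa, final_year_gpa):
    """Classify the GPA trend from the ratio final-year GPA / first-year GPA.
    Cutoffs of .95 and 1.05 follow the text; boundary handling is assumed."""
    if first_year_gpa <= 0:
        raise ValueError("first_year_gpa must be positive")
    ratio = final_year_gpa / first_year_gpa
    if ratio > 1.05:
        return "rising"
    if ratio < 0.95:
        return "falling"
    return "stable"


print(gpa_trend(2.5, 3.0), gpa_trend(3.0, 3.0), gpa_trend(3.0, 2.4))
```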
GPA Trend | Did Not Earn Bachelor's: Mean Final GPA | S.D. | s.e. | Earned Bachelor's: Mean Final GPA | S.D. | s.e. | % of Trend Group Who Earned BA | Trend Pct. of All
Rising | 2.39 | 0.651 | .0246 | 2.87 | 0.442 | .0110 | 69.8% | 41.1%
Stable | 2.32 | 0.800 | .0031 | 3.02 | 0.485 | .0014 | 64.1% | 34.1%
Falling | 1.97 | 0.599 | .0023 | 2.70 | 0.429 | .0016 | 52.2% | 24.9%
A third variable seeks to add the effects of remedial problems. The college transcript samples of NCES longitudinal studies teach us that there are different kinds of remedial course work, and that some are more serious than others. If the type of remediation matters, so does the amount. Of the HS&B/So students who were assigned to remedial reading, 74 percent were enrolled in two other remedial courses. Of those whose only remedial mathematics work in college was precollegiate algebra, only 16 percent were enrolled in two or more remedial courses (and of this group, 75 percent were assigned to remedial reading). The first case is a remedial problem case; the second is not. These are further matters of common sense: (1) people with reading deficiencies cannot read mathematics problems, either; (2) people whose only problem on a basic skills placement test stems from a bad Algebra 2 course in high school can proceed toward a degree with minimal disruption.
The "remedial problem" variable used in the regression model is a trichotomy derived from the observations of table 36 on the relationship between remedial coursework and degree completion among those who attended a 4-year college at any time: 1=any remedial reading, 2=other types of remedial work, and 3=no remedial work.
Remedial Coursework | Percent of All Students | Percent Earning Bachelor's Degree
Any remedial reading | 10.2 | 39.3
No remedial reading, but >2 other remedial courses | 18.7 | 46.5
No remedial reading, but 1 or 2 other remedial courses | 20.4 | 59.6
No remedial coursework | 50.7 | 68.9
NOTES: (1) Universe consists of students who attended a 4-year college at any time and for whom transcript data on remedial coursework were available; Weighted N=1.38M; (2) all column pair comparisons are significant at p<.05; (3) For the definition of remedial courses, see footnote #8. SOURCE: National Center for Education Statistics: High School & Beyond/Sophomore cohort, NCES CD#98135.
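The trichotomy described before the table can be sketched as follows. The function name and the two course-count arguments are hypothetical; the coding (1, 2, 3) is the one stated in the text.

```python
def remedial_problem(remedial_reading_courses, other_remedial_courses):
    """Code the three-level remedial variable:
    1 = any remedial reading, 2 = other remedial work only, 3 = none."""
    if remedial_reading_courses > 0:
        return 1
    if other_remedial_courses > 0:
        return 2
    return 3


print(remedial_problem(1, 3), remedial_problem(0, 2), remedial_problem(0, 0))
```

Note that the coding collapses the two middle rows of the table (more than 2, and 1 or 2, other remedial courses) into a single level.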
The "remedial problem" variable is treated as a continuing effect, and not a first-year performance variable, for a very empirical reason. Among the HS&B/So students who attended a 4-year college at any time and took one or more remedial courses, slightly more than half of those courses (52.2 percent) were taken during the first calendar year of attendance. In fact, by the end of the second calendar year of attendance 68.6 percent of the total 11-year remedial course load had been completed; and by the end of the fourth year, 84.4 percent. These data suggest that to isolate the impact of remediation problems on degree completion, we should look beyond the first year.
Table 37 presents the "extended first-year performance" iteration of the regression model in which these three variables are introduced. It is not surprising that first-year Credit Ratio is pushed out of the model by DWI and GPA Trend. The DWI index will also reduce the contribution of Low Credits (from .0382 to .0262) because it overlaps LowCredits (the correlation of .2396 in a matrix with a dozen other variables is strong). The remaining attendance pattern variables, NoReturn and Transfer, also decline in influence in the face of the new variables built on grades. In other words, after the first year of attendance, academic performance (with grades as its proxy) becomes more important for degree completion than place-referenced attendance.
Universe: All students who attended a 4-year college at any time, whose transcript records were complete, and who evidenced positive values for all variables in the model. N=4,264. Weighted N=1.029M. Simple s.e.=.723; Taylor series s.e.=1.079; Design effect=1.49.
Variable | Parameter Estimate | Adj. s.e. | t | p | Contribution to R^{2}
INTERCEPT | 1.616038 | .1409 | 7.70 | |
Academic Resources | 0.077990 | .0081 | 6.46 | .001 | .1668
NOSTOP | 0.206917 | .0204 | 6.81 | .001 | .1017
DWI Index | 0.254825 | .0249 | 6.87 | .001 | .0642
LowCredits | 0.178315 | .0256 | 6.97 | .001 | .0225
Freshman GPA | 0.141591 | .0190 | 5.00 | .001 | .0225
GPA Trend | 0.155576 | .0179 | 5.83 | .001 | .0153
Children | 0.229453 | .0405 | 3.80 | .01 | .0119
NoReturn | 0.096415 | .0201 | 3.22 | .01 | .0071
Transfer | 0.129201 | .0227 | 3.82 | .01 | .0068
SES Quintile | 0.029912 | .0069 | 2.91 | .02 | .0053
STUWORK | 0.049976 | .0176 | 1.91 | .10 | .0026
Credit Ratio | 0.060221 | .0300 | 1.35 | * | .0018
Remedial Problem | 0.019793 | .0139 | 0.96 | * | .0006
*Dropped from model.
R^{2} = .4291; adjusted R^{2} = .4275
The new Remedial Problem variable fails all tests for entry into the model. Why? In a way, whatever remedial variable might be introduced at this point is doomed a priori by the strength of the secondary school Academic Resources (ACRES) variable. The correlation between ACRES and Remedial Problem, in the matrix with a dozen other variables, is a stunningly high .4295. Common sense wins again: students entering college with a low degree of academic resources evidence continuing remedial problems dominated by reading and do not earn degrees. ACRES already accounts for this story, as do other variables such as DWI and LowCredits, which are inclusive in their reflection of subpar academic work.
More significant is that the influence of SES begins to fall (from .007 in table 34 to .0053 in table 37), and that the contributions of continuous enrollment (NOSTOP) and academic resources (ACRES) are unaffected by the introduction of post-first-year performance variables. These two variables keep the fundamental story alive.
Model Iteration, Stage 6: Can We Get Satisfaction?
If only three attendance pattern variables remain in the model (continuous enrollment, NoReturn, and transfer), and the model accounts for about 43 percent of the variance in degree completion, what's left to add? How much further can we push the envelope of explanation of bachelor's degree completion for the HS&B/So cohort? An R^{2} of .4275 in a model such as this is considered extraordinarily persuasive. Some would argue to stop at this point, that the models have reached a plateau of explanation. Others would note that in the more traditional terms of predictive modeling, we reached a plateau of explanation in Stage 4 (first-year performance). But there is a substantial body of research concerned with student responses to, and assessments of, their postsecondary experience that suggests these responses might influence persistence (e.g. Tracey and Sedlacek, 1987; Pascarella, Terenzini, and Wolfle, 1986). Too, in trying to replicate Horn's (1998) analysis of the BPS90 students who left postsecondary education during or by the end of their first year, and exploring what distinguished those who returned at a later point in time (the stopouts) from those who never returned (what Horn calls the "stayouts"), I found an almost linear relationship between degree of dissatisfaction and permanent "stayout" status. The group at issue was too small to yield significant findings, but the experience suggested that the construct of satisfaction might be profitably pursued in a multivariate context.
The fifth collection of variables used in the iteration covers four aspects of student satisfaction with their postsecondary careers: academic, environmental, work preparation, and cost. The questions were asked only once, retrospectively, in 1986 (four years after scheduled high school graduation). Of the four categories, only two (academic and environmental) offer enough items to create a separate index^{(45)}. For each item, e.g. satisfaction with "my intellectual growth" (an academic category) or "sports and recreation facilities" (an environmental category), students were offered a scale of five responses. I turned each question into a dummy variable (dissatisfied/not dissatisfied), aggregated the responses, and turned each of the aggregates into another dummy variable. A composite "dissatisfaction index" was then built from all four categories with a minimum score of 4 (highly satisfied) to a maximum of 8 (highly dissatisfied). The composite was again dichotomized, with scores from 6 to 8 signifying some degree of overall dissatisfaction.
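The construction of the composite can be sketched under one assumption that the text implies but does not state: for the composite to range from 4 to 8 across four categories, each category dummy must score 1 (not dissatisfied) or 2 (dissatisfied). The names below are hypothetical, and the step that aggregates individual items into each category dummy is not reproduced because its rule is not given.

```python
CATEGORIES = ("academic", "environmental", "work_preparation", "cost")


def composite_dissatisfaction(dissatisfied):
    """Sum the four category dummies, scoring 2 for dissatisfied and 1 otherwise
    (assumed scoring, inferred from the stated 4-to-8 range)."""
    return sum(2 if dissatisfied[c] else 1 for c in CATEGORIES)


def overall_dissatisfied(dissatisfied):
    """Dichotomize the composite: scores of 6 through 8 count as dissatisfied."""
    return int(composite_dissatisfaction(dissatisfied) >= 6)


student = {"academic": True, "environmental": False,
           "work_preparation": True, "cost": False}
print(composite_dissatisfaction(student), overall_dissatisfied(student))
```

On this reading, a student must be dissatisfied in at least two of the four categories to be counted as dissatisfied overall.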
Only one question about satisfaction with the costs of postsecondary education was asked. To isolate the contribution of financial aid to this dimension of satisfaction, an enhanced dummy variable was created^{(46)}. Students were assigned a positive value if they received either a grant or a loan and were dissatisfied with the costs of postsecondary education. The components of this variable were first treated separately, and the difference in dissatisfaction rates between those with grants and those with loans was found to be small (30.8 percent to 33.6 percent) and statistically insignificant.
One reviewer of this study suggested that one should turn the coin, so to speak, on these features, and emphasize satisfaction, not dissatisfaction. Given dichotomous variables, too, there is a statistical argument for choosing the larger group (those who are satisfied) as the reference point. But there is a substantial body of research, Horn's (1998) included, demonstrating that indications of global satisfaction are almost mindless reflex responses of students, and that, if one wishes to isolate a strong attitude in an explanatory context, the negative attitude will be more revealing. People responding to surveys have to go out of their way to tell you that they are unhappy.
We are unfortunately limited by the data base in our grasp of those aspects of student experience that Astin (1977, 1989, 1993) and Tinto (1975, 1993) and others have described in such terms as "academic integration" and "social integration," though, as Cabrera, Nora and Castaneda (1993) demonstrate, academic integration is expressed indirectly through GPA. High School & Beyond never asked (as does the Beginning Postsecondary Students, 1989-94 study) how much contact with faculty students enjoyed outside of class, for example. And with the exception of athletics, we have a very limited sense of their participation in extracurricular activities in college. These questions may be important, but when students are attending two or three schools, and when the fact of multi-institutional attendance doesn't seem to matter in explaining degree completion, then it is impossible to attach the academic and social experiences elicited by these questions to any one institution, unless the student's career involved only one institution (see Cabrera, Castaneda, Nora, and Hengstler, 1992), or, if more than one, was dominated by an institution at which one began and to which one returned.
Table 38 presents the results of the stage 6 iteration of the model. We lose a few people here as a byproduct of nonresponse to the satisfaction questions, and the design effect drops from 1.49 to 1.46, resulting in slightly higher critical t-values than would have been the case with a larger sample. The input variables include the eleven carried forward from stage 5 (Extending First-Year Performance), three satisfaction indicators, and the combination financial/satisfaction dummy variable. Given the degree of overlap in the satisfaction variables (confirmed by covariance analysis), the selection set a generous inclusion threshold of p<.2 if for no other reason than to demonstrate just how marginal some of these variables would be. Even then, only one of the four satisfaction variables, that indicating dissatisfaction with academic experiences, passed the test, only to be dropped within the dynamics of the regression.
Universe: All students who attended a 4-year college at any time, whose transcript records were complete, and who evidenced positive values for all variables in the model. N=3,807. Weighted N=914k. Simple s.e.=.741; Taylor series s.e.=1.083; Design effect=1.46.
Variable | Parameter Estimate | Adj. s.e. | t | p | Contribution to R^{2}
INTERCEPT | 1.688707 | .1448 | 7.99 | |
NOSTOP | 0.210741 | .0213 | 6.78 | .001 | .1581
Academic Resources | 0.079632 | .0086 | 6.34 | .001 | .1011
DWI* Index | 0.272824 | .0259 | 7.22 | .001 | .0714
LowCredits | 0.192789 | .0252 | 5.24 | .01 | .0207
GPATrend | 0.125959 | .0182 | 4.74 | .01 | .0173
Children | 0.230705 | .0431 | 3.67 | .01 | .0136
Freshman GPA | 0.133546 | .0193 | 4.74 | .01 | .0087
NoReturn | 0.105192 | .0205 | 3.52 | .01 | .0087
Transfer | 0.120377 | .0228 | 3.62 | .01 | .0057
SES Quintile | 0.027110 | .0070 | 2.65 | .05 | .0045
STUWORK | 0.042087 | .0176 | 1.64 | * | .0018
Acad Dissatisfaction | 0.035955 | .0198 | 1.24 | * | .0011
*Dropped from model.
R^{2} = .4128; adjusted R^{2} = .4109
NOTES: (1) Standard errors are adjusted in accordance with design effects of the stratified sample used in High School & Beyond. See technical appendix. (2) Significance level of t (p) determined by a two-tailed test. (3) *DWI=Drops, Withdrawals and Incompletes.
It is obvious that, after the first true year of higher education, the financial aid and satisfaction factors are peripheral. The overall explanatory power of the model actually decreases with the interactions of the satisfaction measures. They have the effect of reversing the positions of ACRES and NOSTOP again, and slightly reducing the contribution of both those key variables. The satisfaction index proved to be unproductive, as did the attempt to tease out dissatisfaction with the costs of higher education. Very little of this was a result of reducing the universe of students by eliminating those who did not answer any or enough satisfaction questions in 1986 to yield a satisfaction index score. Had we not eliminated those students, there would have been virtually no change in the R^{2} indicator.
Reintroducing Race, Gender and Aspirations
If there were aspects of the fundamental story reflected in table 37 that differed significantly by race and sex, the reintroduction of those variables at this point in model construction would alter the relationships we have observed. Following Astin, Tsui and Avalos (1996), I made one more attempt to bring them into the model, starting with the variables that survived Stage 5 (Extending First-Year Performance) and this time with four separate race variables (white v. others; black v. others; Latino v. others; and Asian v. others). None of the race variables was accepted into the equation, and the effect of the attempt basically left the R^{2} unchanged at .4263.
But the modified aspirations variable (Anticipations) did pass the threshold criterion of statistical significance, reentering the model, and moving ahead of SES Quintile and STUWORK in its contribution to the explanatory power of the model. While carrying a comparatively low t (2.29), and a low degree of significance (.10), its reintroduction slightly boosts the adjusted R^{2} from .4275 to .4309. In light of the fact that all regression analysis involves some degree of multicollinearity (Schroeder, Sjoquist, and Stephan, 1986), and the Anticipations variable evidences a high degree of correlation with the performance variables in the model, I wouldn't make that much of its reappearance. It can move in and out of a model such as this, but always on the margin. Its minor contribution at this point indicates that, in fact, a plateau of explanation was reached in the first-year performance extension iteration. Given the dependent variable, bachelor's degree completion, anything else would muddle and distort the guidance this story offers.
The preferred statistical technique for telling this story involves logistic regression. To put the difference between the Ordinary Least Squares linear regression models and the logistic model too simply, the former seeks to minimize the errors in the measurement of an event, while the latter seeks to estimate the maximum likelihood of an event.
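The contrast can be stated in standard textbook notation. The symbols are assumptions introduced here for illustration, not the author's: y_i is 1 if student i earned a bachelor's degree and 0 otherwise, x_i is that student's vector of independent variables, and sampling weights are ignored for simplicity.

```latex
\begin{align}
\text{OLS:}\quad
  & \hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n} \left( y_i - x_i^{\top}\beta \right)^{2} \\
\text{Logistic:}\quad
  & \hat{\beta} = \arg\max_{\beta} \sum_{i=1}^{n}
    \left[ y_i \ln p_i + (1 - y_i) \ln (1 - p_i) \right],
    \qquad p_i = \frac{1}{1 + e^{-x_i^{\top}\beta}} \\
\text{Odds ratio for variable } j:\quad
  & e^{\hat{\beta}_j}
\end{align}
```

The first line minimizes squared prediction errors; the second maximizes the likelihood of the observed pattern of completions and non-completions; the third is the metric in which the logistic results are reported.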
Table 39 takes the first five equations from the OLS story (that is, up to the point at which the story reached a plateau) and presents them in a logistic form. By displaying the equations together, we can both observe the changes in the position of the independent variables from step to step, and assess whether the sequence of models provides an increasingly convincing explanation ("goodness-of-fit"). What do we see in Table 39? First, there are five statistics at the bottom of the table that help us judge the explanatory power of the sequence of models and that tell us which stages make the greatest difference. To put it simply, everything that is supposed to happen in a series of logistic models such as this (Cabrera, 1994), happens: the G^{2} declines; the ratio of G^{2} to the degrees of freedom declines; the Chi-square rises in the face of an increasing number of variables. All of this says that the logistic analysis becomes increasingly efficient. The relative changes of these statistics also tell us that the Financial Aid model does not add that much to the potential guidances in our tool box, and that the Attendance Pattern model adds the most (the same conclusions reached in the OLS sequence).
To appreciate the convergences and differences between logistic and linear models of this story, one must watch the changes in the odds ratios for a given variable and the statistical significance of its parameter estimates. For example, the variable indicating that the first college attended was a 4-year institution exhibits seemingly impressive odds ratios in all three of its appearances, but only in one of those cases is the estimate statistically significant, and even in that case, barely. The linear version did not even allow the "First Was 4-Year" variable into its models. On the other hand, in the logistic version, SES evidences more modest odds ratios, but its estimates are statistically significant in all five appearances.
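The relative changes described above can be recomputed from the G^{2} and df figures reported in Table 39. This is only an arithmetic sketch; the variable names are hypothetical, and using the step-to-step drop in G^{2}/df as the yardstick is an illustrative choice, not necessarily the author's.

```python
STAGES = ["Background", "Financial Aid", "Attendance Patterns",
          "1st Year Performance", "Extending Performance"]
G2 = [5386.5, 5304.6, 4106.7, 3643.1, 3300.3]   # G-squared by stage
DF = [4929, 4926, 4419, 4237, 4234]             # degrees of freedom by stage

# Ratio of G-squared to degrees of freedom at each stage.
RATIOS = [g / d for g, d in zip(G2, DF)]

# Drop in the ratio attributable to each newly entered block of variables.
DROPS = [RATIOS[i] - RATIOS[i + 1] for i in range(len(RATIOS) - 1)]

largest_drop_stage = STAGES[DROPS.index(max(DROPS)) + 1]
smallest_drop_stage = STAGES[DROPS.index(min(DROPS)) + 1]
print(largest_drop_stage, smallest_drop_stage)
```

By this yardstick the largest improvement comes with the Attendance Patterns block and the smallest with the Financial Aid block, in line with the conclusions stated in the text.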
Variable | 1: Background, Estimate | Odds Ratio | 2: Financial Aid, Estimate | Odds Ratio | 3: Attendance Patterns, Estimate | Odds Ratio | 4: 1st Year Performance, Estimate | Odds Ratio | 5: Extending Performance, Estimate | Odds Ratio
Intercept | 3.174 | | 3.248 | | 4.212 | | 3.609 | | 3.548 |
ACRES | 0.678* | 1.97 | 0.628* | 1.87 | 0.544* | 1.72 | 0.403* | 1.50 | 0.389* | 1.48
Children | -1.968* | 0.14 | -1.924* | 0.15 | -1.469** | 0.23 | -1.517** | 0.22 | -1.480** | 0.23
SES | 0.202** | 1.22 | 0.253** | 1.29 | 0.194*** | 1.21 | 0.162++ | 1.18 | 0.161++ | 1.18
Anticipations | 0.423*** | 1.53 | 0.386+ | 1.47 | 0.226 | 1.25 | 0.322 | 1.38 | 0.478+ | 1.61
Race | -0.398++ | 0.67 | -0.461++ | 0.63 | -0.461++ | 0.63 | -0.325 | 0.72 | -0.069 | 0.93
Sex | -0.234 | 0.79 | -0.201 | 0.82 | -0.150 | 0.86 | -0.091 | 0.91 | 0.006 | 1.01
Grant-in-Aid | | | 0.370+ | 1.44 | 0.405+ | 1.50 | 0.215 | 1.24 | 0.201 | 1.22
Loan | | | 0.134 | 1.11 | 0.056 | 1.06 | 0.059 | 1.06 | 0.044 | 1.05
STUWORK | | | 0.396++ | 1.49 | 0.259 | 1.30 | 0.200 | 1.22 | 0.228 | 1.26
# of Schools | | | | | -0.009 | 0.99 | 0.015 | 1.02 | -0.077 | 0.93
First Was 4-Yr | | | | | 0.403 | 1.50 | 0.603++ | 1.83 | 0.441 | 1.55
First Was Doctoral | | | | | 0.153 | 1.17 | 0.187 | 1.21 | 0.171 | 1.19
First Was Select | | | | | 0.693+ | 2.00 | 0.600++ | 1.82 | 0.561 | 1.75
OUTSYS | | | | | -0.652 | 0.52 | -0.564 | 0.56 | -0.782 | 0.46
Transfer | | | | | 1.349* | 3.85 | 1.449* | 4.26 | 1.255* | 3.51
No Return | | | | | -0.691** | 0.50 | -0.634** | 0.53 | -0.522+ | 0.59
No Delay | | | | | 0.262 | 1.30 | 0.166 | 1.18 | 0.214 | 1.24
No Stop | | | | | 1.553* | 4.73 | 1.408* | 4.09 | 1.349* | 3.85
1^{st} Yr Grades | | | | | | | 0.726* | 2.07 | 1.104* | 3.02
1^{st} Yr Low Creds | | | | | | | -0.887* | 0.41 | -0.927** | 0.40
1^{st} Yr Cred Ratio | | | | | | | -0.573++ | 0.56 | -0.501 | 0.61
DWI Index# | | | | | | | | | -1.558* | 0.21
Grade Trend | | | | | | | | | 1.167* | 3.21
Remedial Problem | | | | | | | | | -0.137 | 0.87
G^{2} | 5386.5 | | 5304.6 | | 4106.7 | | 3643.1 | | 3300.3 |
df | 4929 | | 4926 | | 4419 | | 4237 | | 4234 |
G^{2}/df | 1.0928 | | 1.0768 | | 0.9293 | | 0.8598 | | 0.7795 |
X^{2}(df) | 35.61 (6) | | 36.74 (9) | | 41.15 (18) | | 41.64 (21) | | 45.57 (24) |
p | .001 | | .001 | | .01 | | .01 | | .01 |
NOTES: (1) The universe for each stage in the model is the same as that used for the parallel steps in the Ordinary Least Squares regressions above. (2) Standard errors used in the determination of statistical significance of the Beta estimates are adjusted by the same design effects as in the OLS versions. (3) Keys to significance levels: *=.001 **=.01 ***=.02 +=.05 ++=.10. (4) #DWI Index=Drops, Withdrawals and Incompletes. SOURCE: National Center for Education Statistics: High School & Beyond/Sophomore cohort. NCES CD#98135.
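In a logistic model the odds ratio for a variable is the exponential of its estimate, so an odds ratio below 1 corresponds to a negative estimate. The sketch below checks that relationship for a few pairs from Table 39. The negative signs on the Children and No Return estimates are an assumption inferred from their odds ratios being below 1; the dictionary keys and function name are hypothetical.

```python
import math

# (estimate, printed odds ratio); negative signs are inferred, see lead-in.
PAIRS = {
    "ACRES, stage 1": (0.678, 1.97),
    "Children, stage 1": (-1.968, 0.14),
    "Transfer, stage 3": (1.349, 3.85),
    "No Return, stage 3": (-0.691, 0.50),
    "No Stop, stage 3": (1.553, 4.73),
}


def odds_ratio(estimate):
    """Odds ratio implied by a logistic regression estimate."""
    return math.exp(estimate)


# True where the recomputed odds ratio matches the printed one to two decimals.
CHECKS = {name: round(odds_ratio(est), 2) == printed
          for name, (est, printed) in PAIRS.items()}
print(CHECKS)
```

Read this way, an odds ratio of 3.85 for Transfer means the odds of completing a bachelor's degree are nearly four times as high for students who meet the transfer definition, holding the other variables in that stage constant.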
The logistic story, unlike the linear story, truly disentangles the Transfer and No Return variables. There is a dramatic divergence between the two, and Transfer, in particular, turns out to be much stronger in the "maximum likelihood" approach to bachelor's degree completion than it does in the linear model. The Transfer variable is very distinct in this account. It does not mean merely that you attended both a 2-year and a 4-year college, but rather that you started at the 2-year, earned more than 10 credits, then moved to the 4-year and earned more than 10 credits there, too. This definition truly sorts people moving toward a bachelor's degree from those multi-institutional attendees engaged in less direct routes. It also filters out the "early transfers," those who did not wait for the community college to provide a sufficient comfort level in higher education. As noted above (p. 52) those who jump ship early to the 4-year college are much less likely to complete a bachelor's degree. The odds ratios for Transfer so defined are very high: 3.85:1 in the Attendance Pattern model, 4.26:1 when we add the 1^{st} Year Performance variables, and 3.51:1 in the Extended Performance model. The only other variable in the logistic story that exceeds those odds ratios is NOSTOP, that is, continuous enrollment. The transfer focus thus becomes a critical direction in the tool box. If we know that students who meet the transfer sequence criteria succeed as well as they do, then we should guide them into that sequence instead of allowing them to leave the community college too early.
What else is different when we compare the logistic to the linear account? The Academic Resources variable, while statistically significant in all its appearances in the logistic model, exhibits an odds ratio decline to the point at which its power appears to be less than that of first-year grades, overall GPA trend, Transfer, and continuous enrollment. The power is still considerable, but not as overwhelming as the linear story would lead us to believe. While this is a disappointment to the tenor of the analysis to date, the most significant drop in the odds ratio for ACRES occurs in the 1^{st} Year Performance iteration (Stage 4), when the restricted Transfer variable enters. Otherwise, the two stories, linear and logistic, are the same, and the same 11 variables remain statistically significant in the final step of the model.
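The restricted definition of Transfer can be written as a single predicate. This is a minimal sketch with hypothetical argument names; the two 10-credit thresholds and the required 2-year to 4-year sequence are the only elements taken from the text.

```python
def is_transfer(started_at_2yr, credits_at_2yr, moved_to_4yr, credits_at_4yr):
    """Return 1 only if the student began at a 2-year college, earned more than
    10 credits there, then moved to a 4-year college and earned more than
    10 credits there as well; otherwise 0."""
    return int(bool(started_at_2yr) and credits_at_2yr > 10
               and bool(moved_to_4yr) and credits_at_4yr > 10)


# An "early transfer" who left the community college after 6 credits
# does not count, even with substantial 4-year credits.
print(is_transfer(True, 30, True, 60), is_transfer(True, 6, True, 60))
```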