Predicting student graduation in higher education using data mining models: a comparison

dc.contributor: McLean, James E.
dc.contributor: Conerly, Michael D.
dc.contributor: Gray, J. Brian
dc.contributor: Kuffel, Lorne
dc.contributor.advisor: Schumacker, Randall E.
dc.contributor.author: Raju, Dheeraj A.
dc.contributor.other: University of Alabama Tuscaloosa
dc.date.accessioned: 2017-03-01T16:25:57Z
dc.date.available: 2017-03-01T16:25:57Z
dc.date.issued: 2012
dc.description (en_US): Electronic Thesis or Dissertation
dc.description.abstract (en_US): Predictive modeling with data mining methods for early identification of at-risk students can help improve student graduation rates. Data-driven decision planning using data mining techniques is an innovative methodology that universities can adopt. The goal of this research study was to compare data mining techniques for assessing student graduation rates at The University of Alabama. Data analyses were performed on two datasets: the first included pre-college variables, and the second included pre-college variables along with college (end of first semester) variables. After 10-fold cross-validation, both the pre-college and college datasets indicated no difference in misclassification rates among logistic regression, decision tree, neural network, and random forest models. The misclassification rate indicates the error in predicting the actual number who graduated. The model misclassification rates for the college dataset were around 7% lower than those for the pre-college dataset. The decision tree model was chosen as the best data mining model because of its advantages over the other models in ease of interpretation and handling of missing data. Although pre-college variables provide good information about student graduation, adding first semester information to them improved prediction of student graduation. The decision tree model for the college dataset indicated first semester GPA, status, earned hours, and high school GPA as the most important variables. Of the 22,099 full-time, first-time entering freshmen from 1995 to 2005, 7,293 (33%) did not graduate. Of those 7,293, 2,845 students (39%) had a first semester GPA < 2.25 with fewer than 12 earned hours.
This study found that institutions can use historical pre-college (high school) information and end of first semester data to build decision tree models that identify significant predictors of student graduation. Students at risk can be identified at the end of the first semester instead of waiting until the end of the first year. The results of data mining analyses can be used to develop intervention programs that help students succeed in college and graduate.
dc.format.extent: 207 p.
dc.format.medium: electronic
dc.format.mimetype: application/pdf
dc.identifier.other: u0015_0000001_0000901
dc.identifier.other: Raju_alatus_0004D_11024
dc.identifier.uri: https://ir.ua.edu/handle/123456789/1395
dc.language: English
dc.language.iso: en_US
dc.publisher: University of Alabama Libraries
dc.relation.hasversion: born digital
dc.relation.ispartof: The University of Alabama Electronic Theses and Dissertations
dc.relation.ispartof: The University of Alabama Libraries Digital Collections
dc.rights (en_US): All rights reserved by the author unless otherwise indicated.
dc.subject: Statistics
dc.subject: Education
dc.title (en_US): Predicting student graduation in higher education using data mining models: a comparison
dc.type: thesis
dc.type: text
etdms.degree.department: University of Alabama. Department of Educational Leadership, Policy, and Technology Studies
etdms.degree.discipline: Educational Research
etdms.degree.grantor: The University of Alabama
etdms.degree.level: doctoral
etdms.degree.name: Ph.D.
Files
Original bundle
Name: file_1.pdf
Size: 1.93 MB
Format: Adobe Portable Document Format
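
Illustrative note: the abstract compares models by their misclassification rate under 10-fold cross-validation. The sketch below shows how such a rate might be computed. It is not the thesis's method or data: the records are synthetic, the assumed link between GPA and graduation is invented for illustration, and a single-split threshold rule on first semester GPA stands in for the logistic regression, decision tree, neural network, and random forest models named in the abstract. All function names are hypothetical.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def fit_threshold(rows):
    """Fit a one-split rule: predict 'graduated' when gpa >= cutoff.

    The cutoff is the observed GPA value with the fewest training errors.
    """
    candidates = sorted({gpa for gpa, _ in rows})
    best_cut, best_err = candidates[0], len(rows) + 1
    for cut in candidates:
        err = sum((gpa >= cut) != grad for gpa, grad in rows)
        if err < best_err:
            best_cut, best_err = cut, err
    return best_cut

def cv_misclassification(rows, k=10, seed=0):
    """Pooled k-fold misclassification rate: held-out errors / total rows."""
    errors = 0
    for held_out in k_fold_indices(len(rows), k, seed):
        held = set(held_out)
        train = [r for i, r in enumerate(rows) if i not in held]
        cut = fit_threshold(train)
        errors += sum((rows[i][0] >= cut) != rows[i][1] for i in held_out)
    return errors / len(rows)

def make_synthetic(n=500, seed=1):
    """Synthetic (gpa, graduated) pairs; NOT data from the study."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        gpa = round(rng.uniform(0.0, 4.0), 2)
        # Assumed relationship, for illustration only:
        # a higher GPA makes graduation more likely.
        grad = rng.random() < (0.2 + 0.7 * gpa / 4.0)
        rows.append((gpa, grad))
    return rows

rows = make_synthetic()
rate = cv_misclassification(rows, k=10)
print(f"10-fold CV misclassification rate: {rate:.3f}")
```

Each record is held out exactly once, so the pooled rate is the total number of held-out errors divided by the number of records. Comparing several model types would mean replacing `fit_threshold` with each model's fitting step while keeping the same folds.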