GA-Boost: a genetic algorithm for robust boosting


dc.contributor Gray, J. Brian
dc.contributor Barrett, Bruce E.
dc.contributor Lee, Junsoo
dc.contributor Conerly, Michael D.
dc.contributor Addy, Samuel N.
dc.contributor.advisor Gray, J. Brian
dc.contributor.author Oh, Dong-Yop
dc.date.accessioned 2017-03-01T16:34:07Z
dc.date.available 2017-03-01T16:34:07Z
dc.date.issued 2012
dc.identifier.other u0015_0000001_0000983
dc.identifier.other Oh_alatus_0004D_11157
dc.identifier.uri https://ir.ua.edu/handle/123456789/1470
dc.description Electronic Thesis or Dissertation
dc.description.abstract Many simple and complex methods have been developed to solve the classification problem. Boosting is one of the best-known techniques for improving the prediction accuracy of classification methods, but it is sometimes prone to overfitting, and the final model is difficult to interpret. Some boosting methods, including AdaBoost, are very sensitive to outliers. Many researchers have worked on these problems, but they remain open issues. We introduce a new boosting algorithm, "GA-Boost", which directly optimizes weak learners and their associated weights using a genetic algorithm, along with three extended versions of GA-Boost. The genetic algorithm uses a new penalized fitness function with three parameters (a, b, and p): b limits the number of weak classifiers, a controls the effects of outliers, and the objective is to maximize an appropriately chosen p-th percentile of the margins. We evaluate GA-Boost with a designed experiment and compare it to AdaBoost using several artificial data sets and real-world data sets from the UC Irvine Machine Learning Repository. In these experiments, GA-Boost was more resistant to outliers and produced simpler predictive models than AdaBoost. GA-Boost can be applied to data sets with three different weak classifier options. The three extended versions of GA-Boost performed very well on two simulated data sets and three real-world data sets.
dc.format.extent 146 p.
dc.format.medium electronic
dc.format.mimetype application/pdf
dc.language English
dc.language.iso en_US
dc.publisher University of Alabama Libraries
dc.relation.ispartof The University of Alabama Electronic Theses and Dissertations
dc.relation.ispartof The University of Alabama Libraries Digital Collections
dc.relation.hasversion born digital
dc.rights All rights reserved by the author unless otherwise indicated.
dc.subject.other Statistics
dc.subject.other Computer science
dc.title GA-Boost: a genetic algorithm for robust boosting
dc.type thesis
dc.type text
etdms.degree.department University of Alabama. Dept. of Information Systems, Statistics, and Management Science
etdms.degree.discipline Applied Statistics
etdms.degree.grantor The University of Alabama
etdms.degree.level doctoral
etdms.degree.name Ph.D.
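
The abstract describes a penalized fitness function built around a p-th percentile of margins and a limit on the number of weak classifiers. The record does not give the formula, so the following is only a minimal sketch of that general idea and not the dissertation's actual fitness function. Everything in it is an assumption made for illustration: the function names, the matrix H of weak-classifier outputs, the labels coded as -1/+1, the normalized-margin definition, and the way b enters as a per-classifier penalty. The parameter a is left out because the abstract does not say how it is applied.

```python
import numpy as np

def margins(H, y, w):
    """Normalized margins y_i * sum_j w_j * h_j(x_i) / sum_j |w_j|.

    H is an (n_samples, n_classifiers) array of weak-classifier outputs
    in {-1, +1}; y holds labels in {-1, +1}; w holds classifier weights.
    (Assumed setup, not taken from the dissertation.)
    """
    w = np.asarray(w, dtype=float)
    total = np.abs(w).sum()
    if total == 0:
        return np.zeros(len(y))
    return y * (H @ w) / total

def penalized_fitness(H, y, w, b=0.01, p=10):
    """Hypothetical fitness: p-th percentile of the margins minus
    b times the number of weak classifiers with nonzero weight."""
    m = margins(H, y, w)
    n_active = np.count_nonzero(w)
    return np.percentile(m, p) - b * n_active

# Toy data: 4 samples, 2 candidate weak classifiers.
H = np.array([[1, 1], [1, -1], [-1, 1], [1, 1]])
y = np.array([1, 1, -1, 1])

single = penalized_fitness(H, y, [1, 0])  # only the first classifier
both = penalized_fitness(H, y, [1, 1])    # both classifiers, equal weight
```

In this toy case the first weak classifier alone labels every sample correctly, so the single-classifier candidate scores higher than the two-classifier one. A genetic algorithm would search over candidate weight vectors such as w and keep those with higher fitness.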

