Semiparametric Approaches for Dimension Reduction Through Gradient Descent on Manifold

University of Alabama Libraries

High-dimensional data arise at an unprecedented speed across various fields. Statistical models can fail on high-dimensional data because of the "curse of dimensionality". Sufficient dimension reduction (SDR) aims to extract the core information through a low-dimensional mapping, so that efficient statistical models can be built while the regression information in the high-dimensional data is preserved. We develop several SDR methods through manifold parameterization. First, we propose an SDR method, gemDR, based on local kernel regression, which loses no information about the conditional mean E[Y|X]; gemDR therefore targets the central mean subspace (CMS). We then extend gemDR to CS-gemDR, which targets the central subspace (CS), through the empirical cumulative distribution function. CS-OPG, a modified outer product gradient (OPG) method for the CS, is developed as an initial estimator for CS-gemDR. The basis B of the CMS or CS is estimated by a gradient descent algorithm, with an update scheme on a Grassmann manifold that preserves the orthogonality constraint on the parameters. The kernel bandwidth and the basis are estimated simultaneously. To determine the dimension of the CMS and CS, two consistent cross-validation criteria are developed. Our methods perform better when features are highly correlated. We also develop ER-OPG and ER-MAVE to identify the basis of the CS on a manifold. The entire conditional distribution of a response given the predictors is estimated in a heterogeneous regression setting through composite expectile regression, and the computational algorithm is developed through an orthogonal updating scheme on a manifold. The proposed methods adapt to the structure of the random errors and do not require the restrictive probabilistic assumptions that inverse methods do. As first-order methods, they are computationally efficient compared with second-order methods. Their efficacy is demonstrated through numerical simulations and real data applications, in which the proposed methods show better performance in estimating the basis and its dimension.
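The orthogonality-preserving update mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's algorithm: the function name `grassmann_step`, the QR-based retraction, the fixed step size, and the toy objective (maximising trace(BᵀAB) for a random symmetric matrix A) are all assumptions chosen only to show how a gradient step can keep the columns of a basis B orthonormal.

```python
import numpy as np

def grassmann_step(B, euclid_grad, step_size):
    """One gradient step that keeps B (p x d) orthonormal.

    Hypothetical helper for illustration; the thesis may use a
    different update scheme on the manifold.
    """
    # Project the Euclidean gradient onto the tangent space at B.
    riem_grad = euclid_grad - B @ (B.T @ euclid_grad)
    # Take the step, then retract back to orthonormal columns via QR.
    Q, R = np.linalg.qr(B - step_size * riem_grad)
    # Flip column signs so diag(R) is non-negative (makes Q deterministic).
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1.0
    return Q * signs

# Toy objective (an assumption, not from the thesis):
# minimise f(B) = -trace(B^T A B) over orthonormal B.
rng = np.random.default_rng(0)
p, d = 6, 2
M = rng.standard_normal((p, p))
A = M @ M.T                                   # symmetric positive semidefinite
B, _ = np.linalg.qr(rng.standard_normal((p, d)))  # orthonormal starting basis

for _ in range(500):
    grad = -2.0 * A @ B                       # Euclidean gradient of f at B
    B = grassmann_step(B, grad, step_size=0.01)

# Largest deviation of B^T B from the identity; should be near zero.
orth_error = np.abs(B.T @ B - np.eye(d)).max()
```

In a sketch like this, the projection removes the part of the gradient that only rotates B within its own span, and the QR retraction restores orthonormality after each step, so no separate constraint handling is needed.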

Electronic Thesis or Dissertation
Central mean subspace, Central subspace, Gradient descent, Manifold, Sufficient dimension reduction