Abstract: Variable selection and model estimation are active research topics for high-dimensional data, where the problem of dimensionality is increasingly prominent. Traditional statistical analysis methods are no longer applicable because the resulting models are unstable. This paper reviews variable selection methods based on regularized regression for high-dimensional data, covering their underlying principles, applicable data types, advantages and disadvantages, and the selection of tuning parameters.
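The regularized-regression methods surveyed here (lasso and its relatives) are typically fitted by cyclic coordinate descent, where each coefficient is updated in turn by a soft-thresholding step and the tuning parameter λ controls how many coefficients are shrunk exactly to zero. The following numpy-only sketch is not from the paper itself; it illustrates the technique on simulated data, with all function names and the choice of λ being illustrative assumptions.

```python
import numpy as np

def soft_threshold(z, gamma):
    # Soft-thresholding operator: the closed-form solution of the
    # one-dimensional lasso subproblem, and the source of exact zeros.
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    # Cyclic coordinate descent for the lasso objective
    #   (1/2)||y - X beta||^2 + n * lam * ||beta||_1.
    # Each pass updates one coefficient at a time holding the rest fixed.
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual: remove every predictor's contribution
            # except the j-th one.
            r_j = y - X @ beta + X[:, j] * beta[j]
            beta[j] = soft_threshold(X[:, j] @ r_j, n * lam) / col_sq[j]
    return beta

# Simulated sparse regression: only the first 3 of 10 predictors matter.
rng = np.random.default_rng(0)
n, p = 50, 10
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [3.0, -2.0, 1.5]
y = X @ beta_true + 0.1 * rng.standard_normal(n)

# lam = 0.1 is an illustrative value; in practice the tuning parameter
# is chosen by K-fold cross-validation over a grid of candidates.
beta_hat = lasso_cd(X, y, lam=0.1)
selected = np.flatnonzero(np.abs(beta_hat) > 0.5)
```

Because the penalty zeroes out coefficients whose partial correlation with the residual falls below the threshold, `selected` recovers the truly relevant predictors while discarding the noise variables, which is exactly the variable-selection behavior the review discusses.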
RONG Wen-wen, ZHANG Qi, LIU Yan. Application of variable selection method based on regularized regression to high dimensional data[J]. Practical Preventive Medicine, 2018, 25(6): 645-648.