Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/50837
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 陳宏(Hung Chen) | |
dc.contributor.author | Chi-Shian Dai | en |
dc.contributor.author | 戴齊賢 | zh_TW |
dc.date.accessioned | 2021-06-15T13:01:22Z | - |
dc.date.available | 2016-07-25 | |
dc.date.copyright | 2016-07-25 | |
dc.date.issued | 2016 | |
dc.date.submitted | 2016-07-11 | |
dc.identifier.citation | References:
Candes, E. and Tao, T. (2005). Decoding by linear programming. IEEE Transactions on Information Theory, 51(12):4203–4215.
Davidson, K. R. and Szarek, S. J. (2001). Local operator theory, random matrices and Banach spaces. Handbook of the Geometry of Banach Spaces, 1:317–366.
Efron, B., Hastie, T., Johnstone, I., and Tibshirani, R. (2004). Least angle regression. Annals of Statistics, 32(2):407–499.
Johnstone, I. M. (2001). Chi-square oracle inequalities. State of the Art in Probability and Statistics, 36:399–418.
Laurent, B. and Massart, P. (2000). Adaptive estimation of a quadratic functional by model selection. Annals of Statistics, 28(5):1302–1338.
Lockhart, R., Taylor, J., Tibshirani, R. J., and Tibshirani, R. (2014). A significance test for the lasso. Annals of Statistics, 42(2):413–468.
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society B, 58(1):267–288.
Tibshirani, R. J. (2013). The lasso problem and uniqueness. Electronic Journal of Statistics, 7(1):1456–1490.
Wainwright, M. J. (2009). Sharp thresholds for high-dimensional and noisy sparsity recovery using $\ell_1$-constrained quadratic programming (Lasso). IEEE Transactions on Information Theory, 55(5):2183–2202.
Zhao, P. and Yu, B. (2006). On model selection consistency of Lasso. The Journal of Machine Learning Research, 7:2541–2563. | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/50837 | - |
dc.description.abstract | In the high-dimensional sparse linear model, the Lasso solution path is rather complicated. When using the Lasso, studying its solution path is essential for choosing the tuning parameter $\lambda$. In this thesis, we give a sufficient condition under which, as the tuning parameter $\lambda$ decreases from infinity, the support of the Lasso solution increases until it covers the sparsity pattern. With this property, a variance decomposition can be used to decide how large the tuning parameter $\lambda$ should be.
Lockhart et al. (2014) proposed the \textit{covariance test statistic} for choosing the tuning parameter $\lambda$. However, they did not address whether this sequential hypothesis test keeps rejecting the null hypothesis until the support of the Lasso solution covers the sparsity pattern. To answer this question, we show that the Lasso solution path has the following property: as the tuning parameter $\lambda$ decreases from infinity, variables enter the Lasso support in order of signal strength, so the importance of a signal can be judged by how early it enters the support. With this property, we further show that the \textit{covariance test statistic} does not underfit. | zh_TW |
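For concreteness, the Lasso estimator discussed in the abstracts is the usual $\ell_1$-penalized least-squares problem; the display below is the standard formulation, added here for reference (the notation is ours, not quoted from the thesis):

$$\hat{\beta}(\lambda) = \operatorname*{arg\,min}_{\beta \in \mathbb{R}^p} \tfrac{1}{2}\lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1 .$$

In this notation, the increasing property states that, under the thesis's sufficient condition, $\lambda_1 > \lambda_2$ implies $\operatorname{supp}\bigl(\hat{\beta}(\lambda_1)\bigr) \subseteq \operatorname{supp}\bigl(\hat{\beta}(\lambda_2)\bigr)$ as $\lambda$ decreases from infinity, until the support covers the sparsity pattern.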
dc.description.abstract | In the high-dimensional sparse linear regression setting, the Lasso solution path can be rather complicated, and analyzing it is crucial for choosing the tuning parameter $\lambda$ when the Lasso is used. In this thesis, we provide a sufficient condition under which, as the tuning parameter $\lambda$ decreases from infinity, the support of the Lasso solution increases until it recovers the sparsity pattern. With this property, a variance decomposition can be used to determine which value of $\lambda$ to choose.
Lockhart et al. (2014) proposed the covariance test statistic for the Lasso to choose the parameter $\lambda$. However, they paid little attention to whether the sequential hypothesis test keeps rejecting the null hypothesis until the sparsity pattern is recovered. To address this issue, we show that the Lasso solution path has an ordering property by which we can assess the importance of regressors, and this assessment agrees with the oracle importance. With this property, the covariance test statistic does not underfit. | en |
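For reference, the covariance test statistic of Lockhart et al. (2014) cited above takes the following form, where $\lambda_{k+1}$ is the $(k{+}1)$-st knot of the Lasso path, $A$ is the active set just before $\lambda_{k+1}$, and $\tilde{\beta}_A$ is the Lasso solution fit on $A$ alone; under the null hypothesis that $A$ already contains all signal variables, it is asymptotically standard exponential:

$$T_k = \frac{\bigl\langle y,\, X\hat{\beta}(\lambda_{k+1}) \bigr\rangle - \bigl\langle y,\, X_A \tilde{\beta}_A(\lambda_{k+1}) \bigr\rangle}{\sigma^2} \;\xrightarrow{d}\; \operatorname{Exp}(1).$$

The "does not underfit" claim is that, along this sequence of tests, the null keeps being rejected until the sparsity pattern is recovered.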
dc.description.provenance | Made available in DSpace on 2021-06-15T13:01:22Z (GMT). No. of bitstreams: 1 ntu-105-R03246006-1.pdf: 521643 bytes, checksum: dfcf4b5319cfd84ff85f55462fba046a (MD5) Previous issue date: 2016 | en |
dc.description.tableofcontents | Contents
1 Introduction ... 1
1.1 The Assumptions ... 3
1.2 Properties of Lasso ... 5
2 Goal 1: The Increasing Property of the Lasso Path ... 7
2.1 Step 1 ... 8
2.2 Step 2 ... 9
2.3 Step 3 ... 10
2.4 Step 4 ... 10
2.5 Cross-validation with Dividing Observed Samples ... 11
3 Goal 2: The Ordering Property of the Lasso Path ... 13
3.1 Check (a): Inequalities ... 16
3.2 Check (b): Lower Bound of Knots $(\lambda_1, \ldots, \lambda_q)$ ... 18
3.3 The Ordering Property of the Lasso Path ... 19
3.4 Necessary Ratio Gaps for the Ordering Property ... 20
4 Goal 3: Stopping Rules ... 21
4.1 Covariance Test Statistic ... 22
4.2 Cp-Lasso ... 23
5 Discussion ... 24
6 Proofs and Lemmas ... 25
6.1 Proof of Theorem 2 ... 25
6.2 Proof of Theorem 3 ... 30
6.3 Proof of Proposition 3 ... 32
6.4 Proof of Theorem 5 ... 38
6.5 Lemma 6.9: Bound on the maximum of iid normal random variables ... 43
6.6 Lemma 6.10: Bound on eigenvalues of the Wishart distribution ... 43
6.7 Lemma 6.11: Tail bounds of chi-square ... 44
6.8 Lemma 6.12: Tail probability bound of sum of folded normal distributions ... 44
References ... 47 | |
dc.language.iso | en | |
dc.title | Importance Assessment of Signals in High-Dimensional Linear Models via the Lasso | zh_TW |
dc.title | Lasso and Its Oracle Properties in Importance Assessment of Regressors | en |
dc.type | Thesis | |
dc.date.schoolyear | 104-2 | |
dc.description.degree | Master's | |
dc.contributor.oralexamcommittee | 銀慶剛 (Ching-Kang Ing), 江金倉 (Chin-Tsang Chiang) | |
dc.subject.keyword | High-dimensional linear models, compressed sensing, model selection, L1 regularization, least angle regression, importance assessment, covariance test statistic | zh_TW |
dc.subject.keyword | High-dimensional linear regression, model selection, L1 constraints, compressed sensing, Lasso path, least angle regression, importance assessment | en |
dc.relation.page | 47 | |
dc.identifier.doi | 10.6342/NTU201600735 | |
dc.rights.note | Paid authorization | |
dc.date.accepted | 2016-07-11 | |
dc.contributor.author-college | College of Science | zh_TW |
dc.contributor.author-dept | Institute of Applied Mathematical Sciences | zh_TW |
Appears in collections: | Institute of Applied Mathematical Sciences |
Files in this item:
File | Size | Format |
---|---|---|
ntu-105-1.pdf (currently not authorized for public access) | 509.42 kB | Adobe PDF |
Items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.