Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/73276
Full metadata record
DC field: value [language]
dc.contributor.advisor: 洪弘 (Hung Hung)
dc.contributor.author: Hsun-Chen Chang [en]
dc.contributor.author: 張訓楨 [zh_TW]
dc.date.accessioned: 2021-06-17T07:25:58Z
dc.date.available: 2019-08-26
dc.date.copyright: 2019-08-26
dc.date.issued: 2019
dc.date.submitted: 2019-06-26
dc.identifier.citation:
Alladi, S. M., P, S. S., Ravi, V., & Murthy, U. S. (2008). Colon cancer prediction with genetic profiles using intelligent techniques. Bioinformation, 3(3), 130-133.
Beyer, K., Goldstein, J., Ramakrishnan, R., & Shaft, U. (1999). When Is “Nearest Neighbor” Meaningful? Paper presented at the Database Theory — ICDT’99, pp. 217-235, Berlin, Heidelberg.
Cevikalp, H., Verbeek, J., Jurie, F., & Klaser, A. (2008). Semi-supervised dimensionality reduction using pairwise equivalence constraints. In: Proc. VISAPP 2008, pp. 489–496.
Donoho, D. L., & Grimes, C. (2003). Hessian eigenmaps: Locally linear embedding techniques for high-dimensional data. Proceedings of the National Academy of Sciences, 100(10), 5591-5596.
Golub, T. R., Slonim, D. K., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, J. P., et al. (1999). Molecular Classification of Cancer: Class Discovery and Class Prediction by Gene Expression Monitoring. Science, 286(5439), 531-537.
Hung, H., Jou, Z. Y., & Huang, S. Y. (2018). Robust mislabel logistic regression without modeling mislabel probabilities. Biometrics, 74(1), 145–154.
Shipp, M. A., Ross, K. N., Tamayo, P., Weng, A. P., Kutok, J. L., Aguiar, R. C. T., Golub, T. R. (2002). Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning. Nature Medicine, 8(1), 68-74.
Mollah, M. N. H., Eguchi, S., & Minami, M. (2007). Robust Prewhitening for ICA by Minimizing β-Divergence and Its Application to FastICA. Neural Processing Letters, 25(2), 91-110.
Peduzzi, P., Concato, J., Kemper, E., Holford, T. R., & Feinstein, A. R. (1996). A simulation study of the number of events per variable in logistic regression analysis. Journal of Clinical Epidemiology, 49(12), 1373-1379.
Petricoin, E. F., Ardekani, A. M., Hitt, B. A., Levine, P. J., Fusaro, V. A., Steinberg, S. M., Liotta, L. A. (2002). Use of proteomic patterns in serum to identify ovarian cancer. The Lancet, 359(9306), 572-577.
Pomeroy, S. L., Tamayo, P., Gaasenbeek, M., Sturla, L. M., Angelo, M., McLaughlin, M. E., Golub, T. R. (2002). Prediction of central nervous system embryonal tumour outcome based on gene expression. Nature, 415, 436.
Shmueli, G. (2010). To Explain or to Predict? Statistical Science, 25(3), 289-310.
Singh, D., Febbo, P. G., Ross, K., Jackson, D. G., Manola, J., Ladd, C., Sellers, W. R. (2002). Gene expression correlates of clinical prostate cancer behavior. Cancer Cell, 1(2), 203-209.
Tenenbaum, J. B., de Silva, V., & Langford, J. C. (2000). A Global Geometric Framework for Nonlinear Dimensionality Reduction. Science, 290(5500), 2319-2323.
Ho, T. K. (1998). The random subspace method for constructing decision forests. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(8), 832-844.
Wang, S., Lu, J., Gu, X., Du, H., & Yang, J. (2016). Semi-supervised linear discriminant analysis for dimension reduction and classification. Pattern Recognition, 57, 179-189.
Cai, X., Wei, J., Wen, G., & Yu, Z. (2014). Local and Global Preserving Semisupervised Dimensionality Reduction Based on Random Subspace for Cancer Classification. IEEE Journal of Biomedical and Health Informatics, 18(2), 500-507.
Yu, G., Zhang, G., Domeniconi, C., Yu, Z., & You, J. (2011). Semi-supervised classification based on random subspace dimensionality reduction. Pattern Recognition, 45, 1119-1135.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/73276
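dc.description.abstract: Many methods for precise cancer prediction have been developed in recent years. Because the data are typically high-dimensional, the dimensionality must be reduced before analysis. A local and global preserving dimensionality reduction method based on random subspaces uses random subspaces to build a semi-supervised model. However, it relies on a manually set parameter to add the supervised and unsupervised information together directly, and then constructs a Laplacian matrix that represents the relationships between data points. There is no criterion for choosing this parameter; it can only be set from experience with different analyses, so the setting strongly affects accuracy across different types of data.
In this study, to address the fact that the parameter cannot be fixed, we develop a robust random subspace-based supervised dimension reduction method, RRS-SDR. It uses γ-logistic regression to estimate directly the probability that each data point belongs to a given class, then computes the probability that two data points belong to the same class and substitutes it into the Laplacian matrix. This replaces the semi-supervised learning algorithm that requires a mixing-proportion parameter. In addition, RRS-SDR achieves better classification performance on datasets that contain mislabeled samples.
[translated from zh_TW]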
dc.description.abstract: Various methods for precise cancer classification have been developed in recent years. Because the data are typically high-dimensional, dimensionality reduction is an essential preprocessing step. The local and global preserving semi-supervised dimensionality reduction based on random subspace algorithm (RSLGSSDR) uses random subspaces for semi-supervised dimensionality reduction. It relies on a tuning parameter to combine the supervised and unsupervised information when constructing the Laplacian matrix, which encodes the relationships between data points. However, there is no principled way to select this tuning parameter, and the characteristics of datasets can be diverse, so the choice strongly influences classification accuracy.
In this thesis, to address the instability caused by the tuning parameter, we develop the Robust Random Subspace-based Supervised Dimension Reduction method (RRS-SDR). We use γ-logistic regression to estimate the label probability of each data point and then calculate the probability that two data points belong to the same class. Substituting this probability into the Laplacian matrix replaces the semi-supervised learning step and its mixing parameter. We show that RRS-SDR achieves superior classification performance on datasets with mislabeled samples.
[en]
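The step described in the abstract above, turning estimated label probabilities into pairwise same-class probabilities and placing them in a Laplacian matrix, can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes a binary label, assumes the two labels are independent given the estimated probabilities, and uses the unnormalized Laplacian L = D - W. The function names `same_class_probability` and `graph_laplacian` are hypothetical, and the probabilities are made-up inputs standing in for the output of a fitted γ-logistic regression.

```python
def same_class_probability(p):
    """Pairwise weights W[i][j] = P(y_i == y_j) for binary labels.

    p[i] is an estimated P(y_i = 1). Independence between points is an
    assumption of this sketch, so
    P(y_i == y_j) = p_i * p_j + (1 - p_i) * (1 - p_j).
    The diagonal is set to 0 (no self-loops).
    """
    n = len(p)
    return [
        [0.0 if i == j else p[i] * p[j] + (1 - p[i]) * (1 - p[j])
         for j in range(n)]
        for i in range(n)
    ]


def graph_laplacian(w):
    """Unnormalized graph Laplacian L = D - W, where D[i][i] = sum_j W[i][j]."""
    n = len(w)
    return [
        [(sum(w[i]) if i == j else 0.0) - w[i][j] for j in range(n)]
        for i in range(n)
    ]


# Made-up label probabilities for three samples (illustrative only).
probs = [0.9, 0.8, 0.1]
W = same_class_probability(probs)
lap = graph_laplacian(W)
```

In this sketch, samples 0 and 1 (both likely in class 1) receive a large weight, while sample 2 is only weakly connected to either of them, so no separate mixing parameter is needed to build the graph.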
dc.description.provenance: Made available in DSpace on 2021-06-17T07:25:58Z (GMT). No. of bitstreams: 1
ntu-108-R05849015-1.pdf: 2150371 bytes, checksum: 813742413cf00c998e2b2bd629f1e616 (MD5)
Previous issue date: 2019
[en]
dc.description.tableofcontents:
Acknowledgements i
Abstract ii
List of Figures iv
Chapter 1 Introduction 1
1.1 Motivation 1
1.2 Introduction of RSLGSSDR 2
1.3 Drawbacks of RSLGSSDR 9
Chapter 2 The Proposed Methods 11
2.1 The construction of RRS-SDR 11
Chapter 3 Experimental Results 16
3.1 Datasets 16
3.2 Settings 19
3.2.1 Preparing the datasets 19
3.2.2 Mislabeling 20
3.2.3 Method comparison 20
3.2.4 Classification process 21
3.3 The number of subset size of random-partition 22
3.4 The number of random-partition 26
3.5 Results on different target dimensionalities 28
3.6 Results on different mislabel rates 33
3.7 ROC curve on different mislabel rates 35
Chapter 4 Conclusion and Future Work 39
Bibliography 43
Appendix 45
A. Robust γ–logistic Regression 45
B. Selection of γ 46
dc.language.iso: en
dc.subject: 癌症分類 (cancer classification) [zh_TW]
dc.subject: 降維度 (dimensionality reduction) [zh_TW]
dc.subject: 拉普拉斯矩陣 (Laplacian matrix) [zh_TW]
dc.subject: 伽馬邏輯斯回歸 (γ-logistic regression) [zh_TW]
dc.subject: 隨機子集演算法 (random subspace algorithm) [zh_TW]
dc.subject: γ-logistic regression [en]
dc.subject: dimensionality reduction [en]
dc.subject: Laplacian matrix [en]
dc.subject: random subspace method [en]
dc.subject: Cancer classification [en]
dc.title: 基於隨機子集之穩健監督式降維法 [zh_TW]
dc.title: A Robust Random Subspace-based Supervised Dimension Reduction Method [en]
dc.type: Thesis
dc.date.schoolyear: 107-2
dc.description.degree: Master's (碩士)
dc.contributor.oralexamcommittee: 蕭朱杏 (Chu-Hsing Hsiao), 盧子彬 (Tzu-Pin Lu)
dc.subject.keyword: 癌症分類, 降維度, 拉普拉斯矩陣, 伽馬邏輯斯回歸, 隨機子集演算法 (cancer classification, dimensionality reduction, Laplacian matrix, γ-logistic regression, random subspace algorithm) [zh_TW]
dc.subject.keyword: Cancer classification, dimensionality reduction, Laplacian matrix, γ-logistic regression, random subspace method [en]
dc.relation.page: 47
dc.identifier.doi: 10.6342/NTU201901081
dc.rights.note: Paid license (有償授權)
dc.date.accepted: 2019-06-27
dc.contributor.author-college: College of Public Health (公共衛生學院) [zh_TW]
dc.contributor.author-dept: Institute of Epidemiology and Preventive Medicine (流行病學與預防醫學研究所) [zh_TW]
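Appears in collections: Institute of Epidemiology and Preventive Medicine (流行病學與預防醫學研究所)

Files in this item:
File: ntu-108-1.pdf (not authorized for public access)
Size: 2.1 MB
Format: Adobe PDF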
Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.