Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93888
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 陳中平 | zh_TW
dc.contributor.advisor | Chung-Ping Chen | en
dc.contributor.author | 陳宥任 | zh_TW
dc.contributor.author | YU-JEN CHEN | en
dc.date.accessioned | 2024-08-09T16:12:24Z | -
dc.date.available | 2024-08-10 | -
dc.date.copyright | 2024-08-09 | -
dc.date.issued | 2024 | -
dc.date.submitted | 2024-07-29 | -
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93888 | -
dc.description.abstract | 核糖核酸定序 (RNA-Seq)是研究癌症最直接的途徑之一,可深入了解癌症的分子機轉,有助於開發標靶治療藥物,並提高癌症診斷和評估預後的準確性。儘管機器學習的興起及定序技術的普及,在探索癌症RNA-Seq上有很大的進展,但始終缺乏生物可解釋性。因此本研究使用美國癌症基因體圖譜計畫 (The Cancer Genome Atlas Program, TCGA) 及基因型組織表現計畫 (The Genotype-Tissue Expression, GTEx) 資料庫,針對影響世界人口罹患、死亡的前五大癌症:肺癌、大腸直腸癌、肝癌、胃癌、乳房癌、攝護腺癌,以及具侵襲性且預後不佳的胰臟癌RNA-Seq進行探討。透過生存分析 (Survival Analysis)、Kaplan-Meier Method (KM method)、套索演算法 (LASSO)及轉錄因子分析,篩選影響病人生存時間的基因。運用這些關鍵的基因,機器學習與深度學習模型可以正確預測健康檢體及癌症檢體,準確性達到97%以上。此外,我們更進一步在GTEx健康檢體以及TCGA上31種癌症進行分類,透過套索演算法找到的152個轉錄因子,我們設計的集成一維卷積網路 recall 和 F1-score 可以達到95%以上。我們也使用SHapley Additive exPlanations (SHAP),進一步分析模型的判斷依據。
隨著數位病理全玻片影像的普及,全玻片影像可以作為影像訓練資料,在臨床上輔助醫師進行診斷。過去病理影像的模型訓練需耗費大量人力標記病變區域或是關注區域 (region of interest),本研究使用美國癌症基因體圖譜計畫 (The Cancer Genome Atlas Program, TCGA)及Prostate cANcer graDe Assessment (PANDA)資料庫無標註的攝護腺全玻片影像作為訓練資料,經過補丁抽取、顏色統一化、人為標記移除後,透過多實例學習 (multiple instance learning),模型區分檢體的良惡性準確度達到95%,格里森分級 (ISUP Gleason grade group)達到F1-score 83.2%,注意力熱圖也顯示模型判斷的依據與病理醫師認知相同。
總結來說,本研究的機器學習模型除了能以核糖核酸定序準確預測癌症組織及正常組織之外,也可以用於多癌症的分類任務上,透過特徵篩選得到的轉錄因子可作為未來尋找潛在癌症生成路徑的依據。以多實例學習訓練無標註的全玻片影像,也能正確預測攝護腺組織的良惡性及格里森分級。
zh_TW
dc.description.abstract | RNA sequencing (RNA-Seq) is an efficient tool in cancer research: it offers insight into the molecular mechanisms of cancer, aids the development of targeted therapies, and improves the accuracy of cancer diagnosis and prognosis. Despite the significant contributions of machine learning and deep learning to the study of cancer RNA-Seq, a lack of biological interpretability persists. This study therefore uses data from The Cancer Genome Atlas Program (TCGA) and The Genotype-Tissue Expression (GTEx) project to investigate RNA-Seq data for the most prevalent and deadly cancers worldwide (lung, colorectal, liver, stomach, breast, and prostate cancers) as well as pancreatic cancer, which is notably aggressive and has a poor prognosis. We identified biologically meaningful genes through survival analysis, the Kaplan-Meier method, LASSO regression, and transcription factor analysis. Using these genes, machine learning and deep learning models achieved over 97% accuracy in distinguishing healthy individuals from cancer patients. Furthermore, we classified healthy samples from GTEx and 31 types of cancer from TCGA. Using 152 transcription factors identified by the LASSO algorithm, we achieved a recall and F1-score of over 95%. We also applied an explainable artificial intelligence technique, SHapley Additive exPlanations (SHAP), to analyze the contribution of each feature to the model's decisions.
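The Kaplan-Meier step named in the abstract can be illustrated with a minimal sketch. This is not the thesis's code: the function name `kaplan_meier`, the toy follow-up times, and the split into `high` and `low` expression groups are assumptions made purely for illustration, and only the Python standard library is used. At each observed death time the estimator multiplies the running survival probability by the fraction of at-risk patients who survive that time.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct death time.

    durations: follow-up time for each patient
    events:    1 if death was observed at that time, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # every patient leaves the risk set at their own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Fraction of patients still at risk who survive past time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Toy cohort (made-up numbers), split by a hypothetical gene's expression level.
high = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
low = kaplan_meier([4, 6, 9, 10, 12], [0, 1, 0, 1, 0])

for label, curve in (("high", high), ("low", low)):
    print(label, [(t, round(s, 3)) for t, s in curve])
```

A gene whose high- and low-expression groups produce clearly separated curves would be a candidate for survival-based screening of the kind the abstract describes; a significance test such as the log-rank test is commonly used to quantify that separation.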
With the increasing prevalence of whole slide images (WSIs) in digital pathology, WSIs can serve as valuable training data to assist clinicians in diagnosis. Traditionally, training models on pathological images required substantial manual effort to annotate lesion areas or regions of interest (ROIs). This study used annotation-free prostate WSIs from TCGA and the Prostate cANcer graDe Assessment (PANDA) dataset as training data. Following patch generation, color normalization, removal of blurred patches, and feature extraction, we employed multiple instance learning (MIL) to train the model. Our model achieved an accuracy of 95% in distinguishing benign from malignant specimens and an F1-score of 83.2% in predicting ISUP Gleason grade group. The attention heatmaps generated by the model also showed that the basis of the model’s judgments aligned with the pathologists’ decisions.
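How a slide-level prediction and an attention heatmap can both come out of unannotated patches is easiest to see in a small sketch of attention-based MIL pooling. The abstract does not specify the architecture, so everything here is an assumption for illustration: the function name `attention_pool`, the projection matrix `v`, the scoring vector `w` (both would normally be learned), and the toy two-dimensional patch features.

```python
import math


def attention_pool(patch_features, v, w):
    """Aggregate patch embeddings into one slide-level embedding.

    patch_features: list of K feature vectors (one per patch), each of length D
    v: H x D projection matrix (hypothetical; learned in practice)
    w: length-H scoring vector (hypothetical; learned in practice)
    Returns (slide_embedding, attention_weights).
    """
    scores = []
    for h in patch_features:
        hidden = [math.tanh(sum(vij * hj for vij, hj in zip(row, h))) for row in v]
        scores.append(sum(wi * hi for wi, hi in zip(w, hidden)))
    # Softmax over patches; subtracting the max keeps exp() numerically stable.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # The slide embedding is the attention-weighted average of patch features.
    dim = len(patch_features[0])
    slide = [sum(a * h[j] for a, h in zip(weights, patch_features)) for j in range(dim)]
    return slide, weights


# Three toy patches; the second has the strongest (made-up) feature response.
patches = [[0.1, 0.0], [0.9, 0.8], [0.2, 0.1]]
v = [[1.0, 1.0], [0.5, -0.5]]
w = [2.0, 0.0]
slide_embedding, attention = attention_pool(patches, v, w)
print([round(a, 3) for a in attention])
```

Only the slide-level label is needed to train such a model, because the loss is computed on the pooled embedding. The per-patch weights can then be mapped back to patch coordinates to draw a heatmap of where the model looked.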
In summary, this study demonstrates that machine learning models can accurately distinguish cancerous from normal tissue using RNA-Seq data and can be applied to pan-cancer classification tasks. The transcription factors identified through feature selection can serve as potential markers for discovering cancer pathways in future research. Moreover, models trained on annotation-free whole slide images with multiple instance learning can accurately predict whether prostate tissue is benign or malignant and its Gleason grade group.
en
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-08-09T16:12:23Z. No. of bitstreams: 0 | en
dc.description.provenance | Made available in DSpace on 2024-08-09T16:12:24Z (GMT). No. of bitstreams: 0 | en
dc.description.tableofcontents | Acknowledgements
摘要 (Chinese abstract)
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 RNA-Seq
1.2 Gleason Scoring System
1.3 Artificial Intelligence (AI)
1.3.1 Machine Learning (ML)
1.3.2 Deep Learning (DL)
1.4 Digital Pathology
1.5 Motivation
1.6 Organization
Chapter 2 Previous Works
2.1 Background on RNA-Seq
2.2 ML in Cancer Classification
2.3 Deep Learning in WSI
2.4 Multiple Instance Learning (MIL)
2.5 Applications in Prostate Cancer
Chapter 3 Datasets
3.1 RNA-Seq
3.1.1 TCGA
3.1.2 UCSC Xena
3.1.3 ICGC
3.2 Whole Slide Image
3.2.1 TCGA
3.2.2 Prostate cANcer graDe Assessment (PANDA) challenge
Chapter 4 Material and Methods
4.1 RNA-Seq
4.1.1 Analytic Flow
4.1.2 Preprocessing
4.1.3 Feature Selection
4.1.3.1 Survival Analysis
4.1.3.2 LASSO
4.1.3.3 Transcription factor
4.1.4 Synthetic Minority Over-sampling Technique (SMOTE)
4.1.5 Loss function
4.1.6 Normalization
4.1.7 Voting Ensembles
4.1.7.1 Hard Voting
4.1.7.2 Soft Voting
4.1.8 SHapley Additive exPlanations (SHAP)
4.1.9 Machine Learning Model
4.1.9.1 Support Vector Machine (SVM)
4.1.9.2 Random Forest (RF)
4.1.9.3 Extreme Gradient Boosting (XGBoost)
4.1.9.4 k-Nearest Neighbors (KNN)
4.1.9.5 Logistic Regression (LR)
4.1.9.6 Gaussian Naïve Bayes classifier
4.1.10 Deep Learning Model
4.1.10.1 Convolution Neural Network (CNN)
4.1.10.2 Transformer Encoder
4.1.10.3 Long Short-Term Memory (LSTM) Model
4.1.10.4 Artificial Neural Network (ANN)
4.1.10.5 ResNet
4.1.11 ICGC Dataset
4.2 Whole Slide Image Gleason Grading
4.2.1 Analytic Flow
4.2.2 Preprocessing
4.2.2.1 Mask and Filter
4.2.2.2 Tiles Extraction
4.2.2.3 Stain Normalization
4.2.2.4 MIL
4.2.2.5 Feature Extractor
4.2.2.6 Model Development
Chapter 5 Experimental Results
5.1 Evaluation
5.1.1 Confusion Matrix
5.1.2 Accuracy
5.1.3 Recall
5.1.4 Precision
5.1.5 F1-score
5.1.6 Cohen’s Kappa
5.1.7 Matthews correlation coefficient
5.1.8 Quadratic Weighted Kappa
5.1.9 Area Under Curve (AUC)
5.2 Experimental Results
5.2.1 Cancer/Healthy classification
5.2.2 ISUP Gleason Grade Group Classification using RNA-Seq
5.2.3 Pan-cancer Classification
5.2.4 WSI Gleason Grade Grouping
Chapter 6 Conclusion and Future Work
6.1 Conclusion
6.2 Future Work
References
Appendix A: Detailed Gene Selection and Analysis Plots
-
dc.language.iso | en | -
dc.subject | 全玻片影像 | zh_TW
dc.subject | 機器學習 | zh_TW
dc.subject | 生存評估 | zh_TW
dc.subject | 套索演算法 | zh_TW
dc.subject | 轉錄因子 | zh_TW
dc.subject | 弱監督學習 | zh_TW
dc.subject | 核糖核酸定序 | zh_TW
dc.subject | LASSO | en
dc.subject | transcription factor | en
dc.subject | RNA-Seq | en
dc.subject | machine learning | en
dc.subject | survival analysis | en
dc.title | 機器學習應用於核糖核酸定序及全玻片影像預測癌症種類 | zh_TW
dc.title | Machine Learning Approaches for Cancer Classification Using RNA-Sequencing and Whole Slide Images | en
dc.type | Thesis | -
dc.date.schoolyear | 112-2 | -
dc.description.degree | Master's | -
dc.contributor.coadvisor | 魏安祺 | zh_TW
dc.contributor.coadvisor | An-Chi Wei | en
dc.contributor.oralexamcommittee | 洪宗宏; 曾啓新 | zh_TW
dc.contributor.oralexamcommittee | Chung-Hung Hong; Chi-Shin Tseng | en
dc.subject.keyword | 核糖核酸定序, 機器學習, 生存評估, 套索演算法, 轉錄因子, 弱監督學習, 全玻片影像 | zh_TW
dc.subject.keyword | RNA-Seq, machine learning, survival analysis, LASSO, transcription factor | en
dc.relation.page | 93 | -
dc.identifier.doi | 10.6342/NTU202400900 | -
dc.rights.note | Not authorized | -
dc.date.accepted | 2024-08-01 | -
dc.contributor.author-college | College of Electrical Engineering and Computer Science | -
dc.contributor.author-dept | Graduate Institute of Biomedical Electronics and Bioinformatics | -
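Appears in collections: Graduate Institute of Biomedical Electronics and Bioinformatics

Files in this item:
File | Size | Format
ntu-112-2.pdf (not authorized for public access) | 20.11 MB | Adobe PDF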

Unless their copyright terms are specifically indicated, all items in the system are protected by copyright, with all rights reserved.
