Exploring potential sub-healthy eyes in retina OCT images using self-supervised learning and semi-supervised clustering

張翔淯; Hsiang-Yu Chang

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98288

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	黃升龍	zh_TW
dc.contributor.advisor	Sheng-Lung Huang	en
dc.contributor.author	張翔淯	zh_TW
dc.contributor.author	Hsiang-Yu Chang	en
dc.date.accessioned	2025-08-01T16:05:06Z	-
dc.date.available	2025-08-02	-
dc.date.copyright	2025-08-01	-
dc.date.issued	2025	-
dc.date.submitted	2025-07-27	-
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98288	-
dc.description.abstract	Optical coherence tomography (OCT) is a high-resolution, non-invasive imaging technique that is widely used in ophthalmic clinical diagnosis because of its rapid image acquisition. The high-quality images it produces help physicians identify lesions and also make it easier for convolutional neural networks (CNNs) to learn image features, which improves model performance. Previous studies have successfully applied CNNs to disease classification on retinal OCT images, demonstrating that healthy and pathological images can be distinguished. To further strengthen feature representation, this study pretrains the model with SimCLR, a self-supervised learning (SSL) algorithm. Unlike conventional supervised learning, SimCLR requires no manual annotation and can learn from large unlabeled datasets, which reduces annotation cost. It also allows pretraining on medical images from the same domain as the downstream task, reducing the domain gap that arises when pretraining on natural images such as ImageNet. In this study, 19,464 OCT images were used for SimCLR pretraining, whereas the ImageNet dataset contains approximately 12.8 million natural images. Experimental results show that although the SimCLR-pretrained model completes every classification task successfully, its overall performance is often slightly below that of the ImageNet-pretrained model, presumably because of the large difference in pretraining dataset size. In addition, validation accuracy exceeded training accuracy early in training, and the relationship reversed after convergence. This occurred because the training data were randomly flipped horizontally to improve generalization, whereas the validation data were not augmented; as training progressed, the model fit the training data increasingly well, so training accuracy rose and eventually exceeded validation accuracy. In the healthy versus sub-healthy classification task, however, both pretraining strategies yielded an average accuracy of approximately 66%. A likely cause is that some sub-healthy samples were mislabeled as healthy because the contralateral eye had not yet been clinically diagnosed with epiretinal membrane (ERM), which affected model learning. To investigate the distribution of potential sub-healthy samples in the dataset, this study applies both unsupervised clustering (K-means) and semi-supervised clustering (Semi-supervised Deep Embedded Clustering, SDEC). In SDEC, only the labels of sub-healthy samples are provided, guiding the model to learn their latent features. Although both clustering methods produce clearly separated groups in visualizations, the labeled images show no regular pattern in how they are distributed. Nevertheless, statistics from repeated experiments show that the SDEC model is more consistent in repeatedly selecting the same potential sub-healthy samples. Discussions with ophthalmologists confirmed that some of the selected samples have a flatter foveal contour, consistent with the sub-healthy characteristics described in previous research. In summary, this study aims to use SDEC clustering to help identify potential sub-healthy samples among the healthy-labeled data and thereby improve CNN performance on healthy versus sub-healthy classification. If validated against clinical follow-up data in the future, the approach may support early identification of ERM risk and contribute to preventive and precision medicine.	en
dc.description.provenance	Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-08-01T16:05:06Z No. of bitstreams: 0	en
dc.description.provenance	Made available in DSpace on 2025-08-01T16:05:06Z (GMT). No. of bitstreams: 0	en
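dc.description.tableofcontents	Acknowledgements I; Chinese Abstract II; Abstract III; Table of Contents V; List of Figures VII; List of Tables IX; Chapter 1 Introduction 1; Chapter 2 Fourier-Domain Optical Coherence Tomography and the Retina 3; 2.1 Optical Coherence Tomography 3; 2.1.1 Principles of Optical Coherence Tomography 3; 2.1.2 Fourier-Domain Optical Coherence Tomography 4; 2.1.3 Heidelberg Optical Coherence Tomography Microscope 6; 2.2 Introduction to the Retina 6; 2.2.1 Structure of the Retina 6; 2.2.2 Functions of the Macula and Fovea 9; 2.3 Retinal Diseases 10; 2.3.1 Epiretinal Membrane 10; 2.3.2 Definition of Sub-health for Epiretinal Membrane 10; Chapter 3 Methods and Dataset 12; 3.1 Basic Convolutional Neural Network Architecture and Residual Network Architecture 12; 3.2 Self-Supervised Learning 15; 3.2.1 Introduction to and Comparison of Self-Supervised and Supervised Learning 15; 3.2.2 SimCLR Workflow and Applications 16; 3.3 Clustering Methods 17; 3.3.1 Unsupervised Clustering 18; 3.3.2 Semi-Supervised Clustering 18; 3.4 Dataset and Preprocessing 20; 3.4.1 Dataset Sources and Size 20; 3.4.2 Image Preprocessing 20; Chapter 4 Classification of Healthy, Sub-healthy, and Unhealthy Eyes 22; 4.1 Experimental Design 22; 4.1.1 Model Architecture and Pretraining Methods 22; 4.1.2 Training Settings and Evaluation Metrics 24; 4.2 Classification Task 1: Healthy vs. Unhealthy 25; 4.3 Classification Task 2: Epiretinal Membrane vs. Other Diseases 30; 4.4 Classification Task 3: Healthy vs. Epiretinal Membrane 35; 4.5 Classification Task 4: Healthy vs. Sub-healthy 40; 4.6 Analysis and Discussion of the Classification Tasks 44; Chapter 5 Clustering Analysis and Investigation of Potential Sub-healthy Samples 46; 5.1 Clustering Analysis Design and Methods 46; 5.2 Unsupervised Clustering Analysis 47; 5.3 Semi-Supervised Clustering Analysis 52; 5.4 Analysis and Discussion of Clustering Results 56; Chapter 6 Results and Future Outlook 61; 6.1 Conclusions 61; 6.2 Future Work 62; Reference 64; Appendix 1 Potential Sub-healthy Samples Selected in K-means Clustering by Three Different Pretrained Models 67; ImageNet Pre-train Model 67; SimCLR Pre-train Model 68; OCT Pre-train Model 70; Appendix 2 Potential Sub-healthy Samples Selected in SDEC Clustering by Three Different Pretrained Models 72; ImageNet Pre-train Model 73; SimCLR Pre-train Model 75; OCT Pre-train Model 77	-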
dc.language.iso	zh_TW	-
dc.subject	Self-Supervised Learning	en
dc.subject	Convolutional Neural Network	en
dc.subject	Optical Coherence Tomography	en
dc.subject	Sub-healthy Samples of Epiretinal Membrane	en
dc.subject	Semi-Supervised Clustering	en
dc.title	Exploring potential sub-healthy eyes in retina OCT images using self-supervised learning and semi-supervised clustering	en
dc.type	Thesis	-
dc.date.schoolyear	113-2	-
dc.description.degree	Master's	-
dc.contributor.oralexamcommittee	王一中;謝易庭	zh_TW
dc.contributor.oralexamcommittee	I-Jong Wang;Yi-Ting Hsieh	en
dc.subject.keyword	Optical Coherence Tomography, Convolutional Neural Network, Self-Supervised Learning, Semi-Supervised Clustering, Sub-healthy Samples of Epiretinal Membrane	en
dc.relation.page	78	-
dc.identifier.doi	10.6342/NTU202502484	-
dc.rights.note	Authorization granted (open access worldwide)	-
dc.date.accepted	2025-07-29	-
dc.contributor.author-college	College of Electrical Engineering and Computer Science	-
dc.contributor.author-dept	Graduate Institute of Photonics and Optoelectronics	-
dc.date.embargo-lift	2025-08-02	-
Appears in Collections:	Graduate Institute of Photonics and Optoelectronics
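
The abstract in this record names SimCLR as the self-supervised pretraining method but does not state its loss or hyperparameters. As an illustration only, the sketch below computes the NT-Xent contrastive loss described in the SimCLR paper, using plain NumPy. The function name `nt_xent_loss`, the temperature of 0.5, and the random toy embeddings are assumptions made for this example; none of them comes from the thesis.

```python
import numpy as np


def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent loss for N positive pairs (z1[i], z2[i]).

    z1, z2: arrays of shape (N, d) holding non-zero embedding vectors.
    The temperature value is an assumption for this sketch.
    """
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0).astype(np.float64)  # (2N, d)
    z /= np.linalg.norm(z, axis=1, keepdims=True)  # unit length, so dot = cosine

    sim = (z @ z.T) / temperature                   # (2N, 2N) scaled similarities
    np.fill_diagonal(sim, -np.inf)                  # a sample is never its own negative

    # Row i (from z1) is paired with row i + N (from z2), and vice versa.
    positive = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])

    # Numerically stable log-softmax over each row.
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))

    return float(-log_prob[np.arange(2 * n), positive].mean())


# Toy data: random vectors stand in for projection-head outputs.
rng = np.random.default_rng(0)
anchor = rng.normal(size=(8, 16))
noisy = anchor + 0.01 * rng.normal(size=(8, 16))   # nearly identical second view
aligned = nt_xent_loss(anchor, noisy)
unrelated = nt_xent_loss(anchor, rng.normal(size=(8, 16)))
print("aligned pairs give the lower loss:", aligned < unrelated)
```

In an actual SimCLR setup, `z1` and `z2` would be the projection-head outputs for two random augmentations of the same OCT image. The loss falls as the two views of each image move closer together than views of different images.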

Files in This Item:

File	Size	Format
ntu-113-2.pdf	6.27 MB	Adobe PDF

Items in this system are protected by copyright, with all rights reserved, unless otherwise indicated.