Improved Accuracy of Imbalanced Multi-label Classification in Medical Fundus Images (醫學眼底影像中不平衡多標籤分類的準確性提升)

林鼎鈞; Ting-Chun Lin

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94216

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	周承復	zh_TW
dc.contributor.advisor	Cheng-Fu Chou	en
dc.contributor.author	林鼎鈞	zh_TW
dc.contributor.author	Ting-Chun Lin	en
dc.date.accessioned	2024-08-15T16:16:17Z	-
dc.date.available	2024-08-16	-
dc.date.copyright	2024-08-15	-
dc.date.issued	2024	-
dc.date.submitted	2024-08-01	-
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/94216	-
dc.description.abstract	近年來，深度學習技術在醫學影像研究領域中的應用不斷深入，目的是提升對各種病症的診斷準確度。當前面臨的主要挑戰是，醫學影像分類任務大多僅聚焦於單一標籤，缺少對影像的全面診斷能力，尤其是在處理數據不平衡的多標籤分類問題時。這些挑戰涉及到不同病變類型之間的資料量懸殊以及潛在的病變相關性，這些因素都可能影響模型性能。為應對這些問題，本篇論文開發了一款多標籤深度學習模型，對當前先進的影像識別模型進行改進，使其適用於多標籤醫學影像分類任務。在資料前處理階段，本研究設計了一套針對類別不平衡問題的數據增強演算法，利用公開醫學眼底影像進行多標籤分類。本研究著重於解決眼底病變類型數量不平衡的問題，以提升模型的準確性和泛化能力，從而在自動識別和分類眼底疾病上達到最佳性能，實現對多種病變的準確診斷。最後，通過公開醫學眼底資料集上進行訓練與效能評估，與現有的多標籤模型進行比較，實驗結果顯示，我們的模型及其數據增強方法顯著提高了準確率，且性能超越了現有的先進模型。	zh_TW
dc.description.abstract	In recent years, the application of deep learning techniques in the field of medical imaging research has continued to deepen, with the goal of enhancing the diagnostic accuracy for various diseases. A major challenge currently faced is that most medical image classification tasks focus only on single labels, lacking comprehensive diagnostic capabilities for images, especially when dealing with imbalanced data in multi-label classification problems. These challenges involve disparities in data volumes among different lesion types and potential interrelations among lesions, which can impact model performance. To address these issues, this paper develops a multi-label deep learning model that improves upon current advanced image recognition models, making them suitable for multi-label medical image classification tasks. In the data preprocessing stage, this study designs a set of data augmentation algorithms specifically for class imbalance issues, using public medical fundus images for multi-label classification. This research focuses on resolving the issue of imbalanced numbers of fundus lesion types to enhance the accuracy and generalizability of the model, thereby achieving optimal performance in the automatic identification and classification of fundus diseases and realizing accurate diagnosis of multiple lesions. Finally, by training and conducting performance evaluations on the public medical fundus datasets, and comparing with existing multi-label models, the experimental results demonstrate that our model and its data augmentation methods significantly improve accuracy and outperform existing advanced models.	en
dc.description.provenance	Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-08-15T16:16:17Z No. of bitstreams: 0	en
dc.description.provenance	Made available in DSpace on 2024-08-15T16:16:17Z (GMT). No. of bitstreams: 0	en
dc.description.tableofcontents	Verification Letter from the Oral Examination Committee - i Acknowledgements - iii Abstract (in Chinese) - v Abstract - vii Contents - ix List of Figures - xi List of Tables - xiii Chapter 1 Introduction - 1 Chapter 2 Related Work - 7 2.1 Image classification - 7 2.1.1 Convolution-based methods - 7 2.1.2 Transformer-based methods - 11 2.1.3 Hybrid methods - 13 2.2 Multi-label Models - 14 2.3 Class imbalance - 16 2.3.1 Resampling methods - 16 2.3.2 Cost Sensitive Methods - 18 Chapter 3 Dataset - 21 Chapter 4 Method - 25 4.1 Data augmentation for imbalance data - 25 4.2 Model Architecture - 26 4.2.1 ConvNeXt - 28 4.2.2 Transformer Encoder - 29 4.2.3 Feature Projection - 30 Chapter 5 Experiments - 33 5.1 Metrics - 33 5.2 Determination of optimal method configuration - 36 5.2.1 Transformer Encoder layers Selection - 37 5.2.2 Class imbalance method - 39 5.2.3 Comparison with other models - 40 5.2.4 Loss functions - 42 5.3 Heatmaps for Our Model - 43 Chapter 6 Conclusion - 49 References - 53 Appendix A: Experiments Details - 59 A.1 Training Hyperparameters - 59 A.2 Training Preprocess - 60 Appendix B: More Result Details - 61 B.1 Each class results - 61	-
dc.language.iso	en	-
dc.subject	疾病分類	zh_TW
dc.subject	醫學眼底影像	zh_TW
dc.subject	不平衡多標籤數據集	zh_TW
dc.subject	深度學習	zh_TW
dc.subject	多標籤模型	zh_TW
dc.subject	Deep Learning	en
dc.subject	Multi-label Model	en
dc.subject	Medical Fundus Images	en
dc.subject	Imbalanced Multilabel Datasets	en
dc.subject	Disease Classification	en
dc.title	醫學眼底影像中不平衡多標籤分類的準確性提升	zh_TW
dc.title	Improved Accuracy of Imbalanced Multi-label Classification in Medical Fundus Images	en
dc.type	Thesis	-
dc.date.schoolyear	112-2	-
dc.description.degree	Master's	-
dc.contributor.oralexamcommittee	吳曉光;黃志煒;陳駿丞;吳振吉	zh_TW
dc.contributor.oralexamcommittee	Hsiao-kuang Wu;Chih-Wei Huang;Jun-Cheng Chen;Chen-Chi Wu	en
dc.subject.keyword	深度學習,多標籤模型,醫學眼底影像,不平衡多標籤數據集,疾病分類	zh_TW
dc.subject.keyword	Deep Learning,Multi-label Model,Medical Fundus Images,Imbalanced Multilabel Datasets,Disease Classification	en
dc.relation.page	63	-
dc.identifier.doi	10.6342/NTU202401103	-
dc.rights.note	Authorization granted (access restricted to campus)	-
dc.date.accepted	2024-08-04	-
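dc.contributor.author-college	College of Electrical Engineering and Computer Science	-
dc.contributor.author-dept	Department of Computer Science and Information Engineering	-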
dc.date.embargo-lift	2029-07-30	-
Appears in Collections:	Department of Computer Science and Information Engineering

Files in this item:

File	Size	Format
ntu-112-2.pdf (not authorized for public access)	16.89 MB	Adobe PDF


Items in this repository are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.