Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/8315

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 徐宏民(Winston Hsu) | |
| dc.contributor.author | Yu-Cheng Chang | en |
| dc.contributor.author | 張友誠 | zh_TW |
| dc.date.accessioned | 2021-05-20T00:51:57Z | - |
| dc.date.available | 2020-08-24 | |
| dc.date.available | 2021-05-20T00:51:57Z | - |
| dc.date.copyright | 2020-08-24 | |
| dc.date.issued | 2020 | |
| dc.date.submitted | 2020-08-18 | |
| dc.identifier.citation | Data from LIDC-IDRI. LiTS - Liver Tumor Segmentation Challenge. H. Aerts, E. Rios Velazquez, R. T. Leijenaar, C. Parmar, P. Grossmann, S. Carvalho, and P. Lambin. Data from NSCLC-Radiomics. The Cancer Imaging Archive, 2015. H. J. Aerts, E. R. Velazquez, R. T. Leijenaar, C. Parmar, P. Grossmann, S. Carvalho, J. Bussink, R. Monshouwer, B. Haibe-Kains, D. Rietveld, et al. Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach. Nature Communications, 2014. S. Bakas, H. Akbari, A. Sotiras, M. Bilello, M. Rozycki, J. S. Kirby, J. B. Freymann, K. Farahani, and C. Davatzikos. Advancing The Cancer Genome Atlas glioma MRI collections with expert segmentation labels and radiomic features. Scientific Data, 4:170117, 2017. S. Bakas, M. Reyes, A. Jakab, S. Bauer, M. Rempfler, A. Crimi, R. T. Shinohara, C. Berger, S. M. Ha, M. Rozycki, et al. Identifying the best machine learning algorithms for brain tumor segmentation, progression assessment, and overall survival prediction in the BraTS challenge. arXiv preprint arXiv:1811.02629, 2018. O. Bernard, A. Lalande, C. Zotti, F. Cervenansky, X. Yang, P.-A. Heng, I. Cetin, K. Lekadir, O. Camara, M. A. G. Ballester, et al. Deep learning techniques for automatic MRI cardiac multi-structures segmentation and diagnosis: Is the problem solved? IEEE Transactions on Medical Imaging, 37(11):2514–2525, 2018. N. Bjorck, C. P. Gomes, B. Selman, and K. Q. Weinberger. Understanding batch normalization. In Advances in Neural Information Processing Systems, pages 7694–7705, 2018. T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, et al. Language models are few-shot learners. arXiv preprint arXiv:2005.14165, 2020. J. Carreira and A. Zisserman. Quo vadis, action recognition? A new model and the Kinetics dataset. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 6299–6308, 2017. S. Chen, K. Ma, and Y. Zheng. Med3D: Transfer learning for 3D medical image analysis. arXiv preprint arXiv:1904.00625, 2019. T. Chen, S. Kornblith, M. Norouzi, and G. Hinton. A simple framework for contrastive learning of visual representations. arXiv preprint arXiv:2002.05709, 2020. Ö. Çiçek, A. Abdulkadir, S. S. Lienkamp, T. Brox, and O. Ronneberger. 3D U-Net: Learning dense volumetric segmentation from sparse annotation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 424–432. Springer, 2016. Z. Dai, Z. Yang, Y. Yang, J. Carbonell, Q. V. Le, and R. Salakhutdinov. Transformer-XL: Attentive language models beyond a fixed-length context. arXiv preprint arXiv:1901.02860, 2019. J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248–255. IEEE, 2009. J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018. C. Doersch, A. Gupta, and A. A. Efros. Unsupervised visual representation learning by context prediction. In Proceedings of the IEEE International Conference on Computer Vision, pages 1422–1430, 2015. M. Eichenlaub, K. Astheimer, J. Minners, T. Blum, C. Restle, C. Maring, S. Schweitzer, U. Thiel, F.-J. Neumann, T. Arentz, et al. Evaluation of a new ultra-low-dose radiation protocol for electrophysiological device implantation: A near-zero fluoroscopy approach for device implantation. Heart Rhythm, 17(1):90–97, 2020. J. Frankle, D. J. Schwab, and A. S. Morcos. Training BatchNorm and only BatchNorm: On the expressive power of random features in CNNs. arXiv preprint arXiv:2003.00152, 2020. R. Girshick. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, pages 1440–1448, 2015. R. Girshick, J. Donahue, T. Darrell, and J. Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 580–587, 2014. Y. Goltsev, N. Samusik, J. Kennedy-Darling, S. Bhate, M. Hale, G. Vazquez, S. Black, and G. P. Nolan. Deep profiling of mouse splenic architecture with CODEX multiplexed imaging. Cell, 174(4):968–981, 2018. K. He, G. Gkioxari, P. Dollár, and R. Girshick. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, pages 2961–2969, 2017. K. He, X. Zhang, S. Ren, and J. Sun. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. In Proceedings of the IEEE International Conference on Computer Vision, pages 1026–1034, 2015. S. Ioffe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167, 2015. D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014. J. Long, E. Shelhamer, and T. Darrell. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3431–3440, 2015. P. Luo, X. Wang, W. Shao, and Z. Peng. Towards understanding regularization in batch normalization. arXiv preprint arXiv:1809.00846, 2018. B. Menze, A. Jakab, S. Bauer, J. Kalpathy-Cramer, K. Farahani, J. Kirby, et al. The multimodal brain tumor image segmentation benchmark (BraTS). IEEE Transactions on Medical Imaging, pages 1–32, 2014. M. Noroozi and P. Favaro. Unsupervised learning of visual representations by solving jigsaw puzzles. In European Conference on Computer Vision, pages 69–84. Springer, 2016. A. v. d. Oord, Y. Li, and O. Vinyals. Representation learning with contrastive predictive coding. arXiv preprint arXiv:1807.03748, 2018. A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Kopf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, and S. Chintala. PyTorch: An imperative style, high-performance deep learning library. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, editors, Advances in Neural Information Processing Systems 32, pages 8024–8035. Curran Associates, Inc., 2019. A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, and I. Sutskever. Language models are unsupervised multitask learners. OpenAI Blog, 1(8):9, 2019. S. Santurkar, D. Tsipras, A. Ilyas, and A. Madry. How does batch normalization help optimization? In Advances in Neural Information Processing Systems, pages 2483–2493, 2018. H.-C. Shin, H. R. Roth, M. Gao, L. Lu, Z. Xu, I. Nogues, J. Yao, D. Mollura, and R. M. Summers. Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning. IEEE Transactions on Medical Imaging, 35(5):1285–1298, 2016. K. Simonyan and A. Zisserman. Two-stream convolutional networks for action recognition in videos. In Advances in Neural Information Processing Systems, pages 568–576, 2014. N. Tajbakhsh, J. Y. Shin, S. R. Gurudu, R. T. Hurst, C. B. Kendall, M. B. Gotway, and J. Liang. Convolutional neural networks for medical image analysis: Full training or fine tuning? IEEE Transactions on Medical Imaging, 35(5):1299–1312, 2016. R. Werner, T. Sentker, F. Madesta, T. Gauer, and C. Hofmann. Intelligent 4D CT sequence scanning (i4DCT): Concept and performance evaluation. Medical Physics, 46(8):3462–3474, 2019. G. Yang, J. Pennington, V. Rao, J. Sohl-Dickstein, and S. S. Schoenholz. A mean field theory of batch normalization. arXiv preprint arXiv:1902.08129, 2019. Z. Yang, Z. Dai, Y. Yang, J. Carbonell, R. R. Salakhutdinov, and Q. V. Le. XLNet: Generalized autoregressive pretraining for language understanding. In Advances in Neural Information Processing Systems, pages 5753–5763, 2019. J. Zbontar, F. Knoll, A. Sriram, M. J. Muckley, M. Bruno, A. Defazio, M. Parente, K. J. Geras, J. Katsnelson, H. Chandarana, et al. fastMRI: An open dataset and benchmarks for accelerated MRI. arXiv preprint arXiv:1811.08839, 2018. R. Zhang, P. Isola, and A. A. Efros. Colorful image colorization. In European Conference on Computer Vision, pages 649–666. Springer, 2016. Z. Zhou, V. Sodha, M. M. R. Siddiquee, R. Feng, N. Tajbakhsh, M. B. Gotway, and J. Liang. Models Genesis: Generic autodidactic models for 3D medical image analysis. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 384–393. Springer, 2019. | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/8315 | - |
| dc.description.abstract | 模型預訓練 (Pre-training) 在電腦視覺及自然語言處理的領域中被廣泛地使用,透過巨量的訓練資料,模型可以習得泛用的影像或文字的特徵提取能力,進一步幫助網路模型有更好的表現。然而,鮮少的文獻深入探討是否有哪些因素會影響模型預訓練在醫療影像領域的效果,因此本篇研究著重探討預訓練的本質,透過大量的分析闡明預訓練對於醫療影像處理的限制,且進一步提出新穎的解決方法。 透過實驗分析驗證,問題複雜度與模型的預訓練資料型態 (Modality) 皆對於預訓練的效果有著明顯的影響。我們也分析了批量標準化 (Batch normalization) 當中的縮放參數 (Scaling term γ),並且建立一套高效率的模型能力評估方法。除此之外,我們更進一步提出神經網路煉金術 (Network Alchemy),有系統性的激發模型的潛能,以妥善利用模型所有的可學習參數。大量的實驗結果說明在各式各樣的實驗設定中,我們的方法均能夠提升模型的表現,展示其泛化性與穩定性。 | zh_TW |
| dc.description.abstract | Pre-training is a well-developed technique for extracting general feature representations from abundant data. However, the factors affecting how pre-training works in medical imaging are rarely studied. In this work, we fully explore the essence of pre-training in medical imaging and provide a comprehensive analysis. We conclude that both the target task complexity and the pre-trained data modality have a considerable impact on the effectiveness of pre-training in medical imaging. In addition, we analyze the trainable parameter γ in batch normalization (BatchNorm) and establish an original standard to efficiently assess the effectiveness of pre-trained weights. We further propose Network Alchemy to stimulate the considerable potential of the network and fully utilize model parameters in the fine-tuning stage. Extensive experimental results exhibit the robustness and generalization ability of our proposed methodology in various experimental scenarios. | en |
| dc.description.provenance | Made available in DSpace on 2021-05-20T00:51:57Z (GMT). No. of bitstreams: 1 U0001-0508202017013100.pdf: 3001399 bytes, checksum: 86f21e3b2442b9cf9321ab19c669404a (MD5) Previous issue date: 2020 | en |
| dc.description.tableofcontents | 誌謝 iii 摘要 iv Abstract v 1 Introduction 1 2 Related work 4 2.1 Pre-training 4 2.2 Pre-training in medical imaging 5 2.3 Batch normalization 5 3 Methodology 7 3.1 Pilot study 7 3.1.1 Is pre-training always useful in medical imaging? 7 3.1.2 Does the modality of pre-trained datasets matter? 8 3.2 Proposed approach 9 3.2.1 Analysis on γ in batch normalization 9 3.2.2 Network Alchemy 10 4 Experiments 14 4.1 Implementation details 14 4.2 Quantitative evaluation 15 4.3 Ablation study 15 4.4 Pre-trained network evaluation standard 17 5 Conclusion 18 Bibliography 19 | |
| dc.language.iso | en | |
| dc.title | 預訓練對於醫療影像的探討 | zh_TW |
| dc.title | Rethinking Pre-training in Medical Imaging | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 108-2 | |
| dc.description.degree | Master | |
| dc.contributor.oralexamcommittee | 陳文進(Chen-Wen Jin),葉梅珍(Mei-Zhen Ye),李志國(Zhi-Guo Li),蘇東弘(Dong-Hong Su) | |
| dc.subject.keyword | 深度學習,醫療影像,預訓練, | zh_TW |
| dc.subject.keyword | Deep Learning,Medical Imaging,Pre-training, | en |
| dc.relation.page | 23 | |
| dc.identifier.doi | 10.6342/NTU202002487 | |
| dc.rights.note | Authorized (open access worldwide) | |
| dc.date.accepted | 2020-08-19 | |
| dc.contributor.author-college | 電機資訊學院 | zh_TW |
| dc.contributor.author-dept | 資訊工程學研究所 | zh_TW |
| Appears in Collections: | Department of Computer Science and Information Engineering | |
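The abstract above centers on the trainable scaling term γ in batch normalization. As a minimal NumPy sketch of the standard BatchNorm formulation only (not the thesis's actual analysis code; the sample inputs are made up for illustration):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    # Normalize each feature over the batch dimension, then scale by
    # gamma and shift by beta. gamma is the trainable "scaling term"
    # the abstract analyzes; with gamma=1, beta=0 the output is just
    # the standardized activations.
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

# Toy batch of 2 samples with 2 features each (illustrative values).
x = np.array([[1.0, 2.0],
              [3.0, 6.0]])
gamma = np.array([2.0, 0.5])
beta = np.array([0.0, 1.0])
y = batch_norm(x, gamma, beta)
```

In a fine-tuning setting like the one the thesis studies, one could read out the learned per-channel γ values of a pre-trained network (e.g. the `weight` attribute of PyTorch's BatchNorm modules) to gauge how strongly each normalized channel is scaled.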
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| U0001-0508202017013100.pdf | 2.93 MB | Adobe PDF | View/Open |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated.
