Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/599
Full metadata record
DC field: value [language]
dc.contributor.advisor: 陳縕儂 (Yun-Nung Chen)
dc.contributor.author: Shang-Chi Tsai [en]
dc.contributor.author: 蔡尚錡 [zh_TW]
dc.date.accessioned: 2021-05-11T04:40:25Z
dc.date.available: 2019-08-20
dc.date.available: 2021-05-11T04:40:25Z
dc.date.copyright: 2019-08-20
dc.date.issued: 2019
dc.date.submitted: 2019-08-20
dc.identifier.citation:
[1] E. Choi, M. T. Bahadori, A. Schuetz, W. F. Stewart, and J. Sun, “Doctor ai: Predicting clinical events via recurrent neural networks,” in Machine Learning for Healthcare Conference, pp. 301–318, 2016.
[2] L. R. de Lima, A. H. Laender, and B. A. Ribeiro-Neto, “A hierarchical approach to the automatic categorization of medical documents,” in Proceedings of the seventh international conference on Information and knowledge management, pp. 132–139, ACM, 1998.
[3] J. Mullenbach, S. Wiegreffe, J. Duke, J. Sun, and J. Eisenstein, “Explainable prediction of medical codes from clinical text,” in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, vol. 1, pp. 1101–1111, 2018.
[4] W. H. Organization et al., “International statistical classification of diseases and related health problems: tenth revision-version for 2007,” http://apps.who.int/classifications/apps/icd/icd10online/, 2007.
[5] H. Shi, P. Xie, Z. Hu, M. Zhang, and E. P. Xing, “Towards automated icd coding using deep learning,” arXiv preprint arXiv:1711.04075, 2017.
[6] A. E. Johnson, T. J. Pollard, L. Shen, H. L. Li-wei, M. Feng, M. Ghassemi, B. Moody, P. Szolovits, L. A. Celi, and R. G. Mark, “Mimic-iii, a freely accessible critical care database,” Scientific data, vol. 3, p. 160035, 2016.
[7] G. Singh, J. Thomas, I. Marshall, J. Shawe-Taylor, and B. C. Wallace, “Structured multi-label biomedical text tagging via attentive neural tree decoding,” in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pp. 2837–2842, Association for Computational Linguistics, 2018.
[8] Y. Kim, “Convolutional neural networks for sentence classification,” in Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1746–1751, 2014.
[9] A. Nie, A. Zehnder, R. L. Page, A. L. Pineda, M. A. Rivas, C. D. Bustamante, and J. Zou, “Deeptag: inferring all-cause diagnoses from clinical notes in under-resourced medical domain,” arXiv preprint arXiv:1806.10722, 2018.
[10] T. Mikolov, I. Sutskever, K. Chen, G. S. Corrado, and J. Dean, “Distributed representations of words and phrases and their compositionality,” in Advances in neural information processing systems, pp. 3111–3119, 201
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/handle/123456789/599
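dc.description.abstract: Electronic health records contain textual data such as descriptions of a patient's symptoms and the history of their visits. Each record has corresponding diagnostic codes, which represent the diagnosis made by the physician at that visit, the treatment plan, and related information. This thesis aims to predict diagnostic codes by combining this kind of hospital text with medical expert knowledge. To let the model make effective use of additional expert knowledge such as the hierarchical relationships among diagnostic codes, we propose several ways of computing the loss function of a convolutional neural network so as to capture the semantic information shared by diagnoses in the same category. This information not only gives the model additional medical knowledge as a learning signal but also helps address the imbalance in the number of samples in the training data. Results on MIMIC3, an internationally used dataset, show that the proposed methods can indeed make effective use of hierarchical category knowledge and provide meaningful information that helps improve on the current best prediction results. This discussion and study also show that incorporating additional expert knowledge into machine learning models has clear benefits and importance, and can inspire more future research directions. [zh_TW]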
dc.description.abstract: Clinical notes are essential medical documents that record each patient's symptoms. Each record is typically annotated with medical diagnostic codes, which denote the diagnosis and treatment. This paper focuses on predicting diagnostic codes from the description of the present illness in electronic health records by leveraging domain knowledge. We investigate various losses in a convolutional model that utilize hierarchical category knowledge of diagnostic codes, allowing the model to share semantics across different labels under the same category. The proposed model not only considers external domain knowledge but also addresses the issue of data imbalance. Experiments on the MIMIC3 benchmark show that the proposed methods can effectively utilize category knowledge and provide informative cues that improve performance on the top-ranked diagnostic codes beyond the prior state of the art. The investigation and discussion demonstrate the potential of integrating domain knowledge into current machine learning based models and may guide future research directions. [en]
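The idea in the abstract of sharing supervision across labels under the same category can be illustrated with a small sketch. This is not the thesis's formulation: this record does not describe the actual losses, so everything here is an assumption made for illustration. In particular, the parent category of an ICD-9 code is assumed to be the text before the dot, a category is assumed to be predicted with the maximum probability of its child codes, and the names icd9_category, joint_loss, and weight are hypothetical.

```python
import math
from collections import defaultdict


def icd9_category(code: str) -> str:
    """Return the parent category of an ICD-9 code (assumption: text before the dot)."""
    return code.split(".")[0]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def bce(probs, targets, eps=1e-12):
    """Mean binary cross-entropy over a list of independent labels."""
    total = 0.0
    for p, t in zip(probs, targets):
        total -= t * math.log(p + eps) + (1 - t) * math.log(1.0 - p + eps)
    return total / len(probs)


def joint_loss(codes, logits, targets, weight=0.5):
    """Code-level loss plus a weighted category-level loss (illustrative only)."""
    probs = [sigmoid(z) for z in logits]
    cat_prob = defaultdict(float)
    cat_target = defaultdict(int)
    for code, p, t in zip(codes, probs, targets):
        cat = icd9_category(code)
        # Assumption: a category is active if any child code is active,
        # and its predicted probability is the maximum over its children.
        cat_prob[cat] = max(cat_prob[cat], p)
        cat_target[cat] = max(cat_target[cat], t)
    cats = sorted(cat_prob)
    category_term = bce([cat_prob[c] for c in cats], [cat_target[c] for c in cats])
    return bce(probs, targets) + weight * category_term


# Toy example: two codes share the category "428", one belongs to "250".
codes = ["428.0", "428.1", "250.00"]
logits = [2.0, -1.0, 0.5]
targets = [1, 0, 1]
print(joint_loss(codes, logits, targets))
```

With weight set to 0 the function reduces to the ordinary code-level loss, so the category term acts purely as an extra signal in which rare codes borrow supervision from more frequent siblings in the same category.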
dc.description.provenance: Made available in DSpace on 2021-05-11T04:40:25Z (GMT). No. of bitstreams: 1
ntu-108-R06946004-1.pdf: 384276 bytes, checksum: e411392666cbefea28f9da406fdb89bd (MD5)
Previous issue date: 2019 [en]
dc.description.tableofcontents:
Acknowledgements iii
Abstract (Chinese) v
Abstract vii
1 Introduction 1
1.1 Motivation and Problem Description 1
1.2 Main Contribution 2
1.3 Thesis Structure 2
2 Proposed Approach 3
2.1 Convolutional Models 3
2.1.1 TextCNN 3
2.1.2 Convolutional Attention Model (CAML) 4
2.2 Knowledge Integration Mechanism 4
2.2.1 Cluster Penalty 4
2.2.2 Multi-Task Learning 5
2.2.3 Hierarchical Learning 5
2.3 Training 6
3 Evaluation 7
3.1 Experimental Setup 7
3.2 Results 8
3.3 Qualitative Analysis 9
4 Conclusion 11
Bibliography 13
dc.language.iso: en
dc.subject: multi-label prediction [zh_TW]
dc.subject: diagnostic text [zh_TW]
dc.subject: medical diagnosis [zh_TW]
dc.subject: ICD prediction [en]
dc.subject: multi-label classification [en]
dc.subject: clinical notes [en]
dc.title: 整合階層種類知識於多標籤診斷文字理解 [zh_TW]
dc.title: Leveraging Hierarchical Category Knowledge for Multi-Label Diagnostic Text Understanding [en]
dc.date.schoolyear: 107-2
dc.description.degree: Master's
dc.contributor.coadvisor: 古倫維 (Lun-Wei Ku)
dc.contributor.oralexamcommittee: 曹昱, 賴飛羆, 黃挺豪
dc.subject.keyword: diagnostic text, medical diagnosis, multi-label prediction [zh_TW]
dc.subject.keyword: clinical notes, multi-label classification, ICD prediction [en]
dc.relation.page: 14
dc.identifier.doi: 10.6342/NTU201903870
dc.rights.note: Authorization granted (open access worldwide)
dc.date.accepted: 2019-08-20
dc.contributor.author-college: College of Electrical Engineering and Computer Science [zh_TW]
dc.contributor.author-dept: Data Science Degree Program [zh_TW]
Appears in collections: Data Science Degree Program

Files in this item:
File: ntu-108-1.pdf; Size: 375.27 kB; Format: Adobe PDF


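Unless their copyright terms are specifically stated otherwise, all items in the system are protected by copyright, with all rights reserved.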
