Medical Record Retrieval and Extraction for Professional Information Access (電子病歷資訊檢索與擷取技術研究)

Chia-Chun Lee; 李佳純

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/61368

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	陳信希
dc.contributor.author	Chia-Chun Lee	en
dc.contributor.author	李佳純	zh_TW
dc.date.accessioned	2021-06-16T13:01:43Z	-
dc.date.available	2013-08-14
dc.date.copyright	2013-08-14
dc.date.issued	2013
dc.date.submitted	2013-08-07
dc.identifier.citation	[1] Department of Health, Executive Yuan, Electronic Medical Record Promotion Website (行政院衛生署電子病歷推動專區). http://emr.doh.gov.tw/introduction.aspx [2] G. Goth, 'Analyzing medical data'. Communications of the ACM, 2012. 55(6): p. 13-15. [3] H.-H. Huang, C.-C. Lee, and H.-H. Chen, 'Outpatient Department Recommendation Based on Medical Summaries', Information Retrieval Technology. 2012, Springer. p. 518-527. [4] D. A. Hanauer. 'EMERSE: the electronic medical record search engine'. AMIA Annual Symposium Proceedings. 2006. [5] L. Yang, Q. Mei, K. Zheng, and D. A. Hanauer. 'Query log analysis of an electronic health record search engine'. AMIA Annual Symposium Proceedings. 2011. [6] K. Zheng, Q. Mei, and D. A. Hanauer, 'Collaborative search in electronic health records'. Journal of the American Medical Informatics Association, 2011. 18(3): p. 282-291. [7] TREC Medical Track. http://trec.nist.gov/data/medical.html [8] The Lemur Project. http://www.lemurproject.org/ [9] S. Kullback and R. A. Leibler, 'On information and sufficiency'. The Annals of Mathematical Statistics, 1951. 22(1): p. 79-86. [10] S. E. Robertson and K. S. Jones, 'Relevance weighting of search terms'. Journal of the American Society for Information Science, 1976. 27(3): p. 129-146. [11] C. E. Shannon, 'Prediction and entropy of printed English'. Bell System Technical Journal, 1951. 30(1): p. 50-64. [12] M. C. Grignetti, 'A note on the entropy of words in printed English'. Information and Control, 1964. 7(3): p. 304-306. [13] P. Sondhi, V. V. Vydiswaran, and C. Zhai, 'Reliability prediction of webpages in the medical domain', Advances in Information Retrieval. 2012, Springer. p. 219-231. [14] D. Mondal, A. Gangopadhyay, and W. Russell. 'Medical decision making using vector space model'. Proceedings of the 1st ACM International Health Informatics Symposium. 2010. [15] B. Koopman, P. Bruza, L. Sitbon, and M. Lawley. 'Evaluating medical information retrieval'. Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval. 2011. [16] D. Zhu and B. Carterette. 'Improving health records search using multiple query expansion collections'. Bioinformatics and Biomedicine (BIBM), 2012 IEEE International Conference on. 2012. [17] T. Mitchell. 'Machine Learning'. McGraw Hill. ISBN 0070428077, p.2. [18] T. Strohman, D. Metzler, H. Turtle, and W. B. Croft. 'Indri: A language model-based search engine for complex queries'. Proceedings of the International Conference on Intelligent Analysis. 2005. [19] Indri Retrieval Model Overview. http://ciir.cs.umass.edu/~metzler/indriretmodel.html [20] Lemur toolkit: http://www.lemurproject.org/ [21] SVM rank: http://www.cs.cornell.edu/people/tj/svm_light/svm_rank.html [22] NCBI stopword: http://www.ncbi.nlm.nih.gov/books/NBK3827/table/pubmedhelp.T43/ [23] H.-B. Chen, H.-H. Huang, C.-T. Tan, J. Tjiu, and H.-H. Chen. 'A statistical medical summary translation system'. Proceedings of the 2nd ACM SIGHIT International Health Informatics Symposium. 2012.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/61368	-
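dc.description.abstract	This study uses medical records from National Taiwan University Hospital as its dataset and aims to design a medical record retrieval system that helps physicians retrieve relevant records or offers treatment options for their reference. We first analyze the linguistic phenomena of the records, including average record length, vocabulary size, and entropy. Based on the department in which the patient registered, the records fall into 14 classes, and the linguistic phenomena of each department are reported separately. The first-stage experiments use five retrieval models and six indexing strategies; the second-stage experiments add learning-to-rank techniques and three indexing strategies. Performance is evaluated at two levels: medical record retrieval and treatment retrieval. At the record retrieval level, the chief complaint serves as the query; at the treatment retrieval level, the chief complaint and brief history together serve as the query. In the first-stage record retrieval evaluation, the Okapi model performs best. Departments with lower entropy also achieve better performance. Departments that deal with multiple organs or body systems, such as oncology and neurology, perform below average. In the treatment retrieval evaluation, no single model stands out. In the second-stage record retrieval evaluation, the tf-idf model performs best. Combining the retrieval scores of multiple retrieval models actually lowers performance. Applying learning to rank significantly outperforms the first-stage results, with improvement in most departments. In the treatment retrieval evaluation, the five models differ little.	zh_TW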
dc.description.provenance	Made available in DSpace on 2021-06-16T13:01:43Z (GMT). No. of bitstreams: 1 ntu-102-R00922081-1.pdf: 934500 bytes, checksum: 2178590476dbb1b207fb5dc4afc10b98 (MD5) Previous issue date: 2013	en
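dc.description.tableofcontents	Abstract (Chinese); Abstract; List of Figures; List of Tables; Chapter 1 Introduction; 1.1 Research Background; 1.2 Research Motivation; 1.3 Research Objectives; 1.4 Thesis Organization; Chapter 2 Related Work; 2.1 Medical Record Information Retrieval; 2.2 Test Collections for Medical Record Information Retrieval; 2.3 Traditional Information Retrieval Models; 2.3.1 Indri; 2.3.2 Kullback–Leibler divergence; 2.3.3 tf-idf; 2.3.4 Okapi; 2.3.5 Vector Space Model; 2.4 Learning to Rank; Chapter 3 Experimental Dataset; 3.1 Medical Record Format; 3.2 Medical Record Statistics; Chapter 4 Experimental Methods; 4.1 Experimental Design and Performance Evaluation; 4.2 Data Preprocessing; 4.3 Medical Term Extraction; 4.4 First-Stage Experiments: Indexing and Retrieval Models; 4.5 Second-Stage Experiments: Learning to Rank; Chapter 5 Experimental Results; 5.1 Retrieval Model Results; 5.2 Learning-to-Rank Results; Chapter 6 Conclusion and Future Work; 6.1 Conclusion; 6.2 Future Research Directions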
dc.language.iso	zh-TW
dc.subject	病歷資訊檢索	zh_TW
dc.subject	病歷資訊擷取	zh_TW
dc.subject	排序學習	zh_TW
dc.subject	Medical Record Information Retrieval	en
dc.subject	Medical Record Information Extraction	en
dc.subject	Learning to Rank	en
dc.title	電子病歷資訊檢索與擷取技術研究	zh_TW
dc.title	Medical Record Retrieval and Extraction for Professional Information Access	en
dc.type	Thesis
dc.date.schoolyear	101-2
dc.description.degree	Master's
dc.contributor.oralexamcommittee	鄭卜壬,蔡銘峰,郭俊桔
dc.subject.keyword	病歷資訊檢索,病歷資訊擷取,排序學習,	zh_TW
dc.subject.keyword	Medical Record Information Retrieval,Medical Record Information Extraction,Learning to Rank,	en
dc.relation.page	52
dc.rights.note	Paid authorization
dc.date.accepted	2013-08-07
dc.contributor.author-college	電機資訊學院	zh_TW
dc.contributor.author-dept	資訊工程學研究所	zh_TW
Appears in collections:	資訊工程學系 (Department of Computer Science and Information Engineering)

Files in this item:

File	Size	Format
ntu-102-1.pdf (not authorized for public access)	912.6 kB	Adobe PDF
