Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/6445
Title: | 應用機器學習演算法識別罹患帶狀疱疹之高危險群 Applying Machine Learning Algorithms to Identify Patients with High Risk of Developing Herpes Zoster |
Authors: | Ding-Dar Lee 李定達 |
Advisor: | 歐陽彥正 |
Keyword: | herpes zoster, risk factor, National Health Insurance Research Database, machine learning algorithm, decision tree, linear component analysis |
Publication Year: | 2013 |
Degree: | Doctoral |
Abstract: | Herpes zoster is a relatively common disorder in Taiwan, and many patients suffer from its most notorious complication, postherpetic neuralgia. Many risk factors for herpes zoster have been reported, e.g., systemic lupus erythematosus, but most reports examined a single disorder. In recent years, machine learning algorithms have been widely applied to the analysis of large medical databases and have yielded valuable information. In this study, rather than focusing on one specific risk factor, we comprehensively analyzed the joint effects of multiple risk factors using the National Health Insurance Research Database in Taiwan. We employed two machine learning algorithms with distinct characteristics: decision tree and linear component analysis. We first performed feature selection by linear discriminant analysis and identified the eight comorbidities most relevant to herpes zoster: arrhythmia, deficiency anemia, renal failure, rheumatoid arthritis, coronary heart disease, malignancy, chronic pulmonary disease, and rheumatologic disease. All eight were used in linear component analysis, while the last four were used to build the decision tree. All but one of the leaf nodes of the decision tree, and all of the inequalities from linear component analysis, corresponded to a statistically significant odds ratio with its 95% confidence interval, indicating that both methods could identify some subpopulations of patients at higher risk of developing herpes zoster. Two leaf nodes of the decision tree and four inequalities from linear component analysis yielded odds ratios greater than 3; these patients had more visits for malignancy, rheumatologic disorder, coronary heart disease, or renal failure. One subgroup of patients, especially those with rheumatologic disorder or coronary heart disease, was identified by both algorithms, whereas linear component analysis identified a group of renal failure patients that the decision tree could not recognize. Although the herpes zoster cases identified by the two algorithms overlapped, some cases were identified by only one of them. This implies that decision tree and linear component analysis recognize different populations of herpes zoster cases in a complementary way, so taking the union of their results could possibly increase sensitivity. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/6445 |
Fulltext Rights: | Authorization granted (open access worldwide) |
Appears in Collections: | Department of Computer Science and Information Engineering |
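
The abstract above evaluates each subgroup by its odds ratio with a 95% confidence interval, and suggests that taking the union of the cases flagged by the two algorithms may raise sensitivity. The following is a minimal sketch of those two calculations, not the thesis's own code: the function names, the Wald (log-scale) interval, and all counts and patient IDs are illustrative assumptions.

```python
import math


def odds_ratio_ci(exposed_cases, exposed_controls,
                  unexposed_cases, unexposed_controls, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    'Exposed' here means "falls in the subgroup" (e.g. a leaf node);
    z=1.96 gives an approximate 95% interval. The Wald method is an
    assumption; the abstract does not say which interval was used.
    """
    a, b, c, d = (exposed_cases, exposed_controls,
                  unexposed_cases, unexposed_controls)
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


def union_sensitivity(true_cases, flagged_a, flagged_b):
    """Sensitivity of method A, method B, and the union of both."""
    if not true_cases:
        raise ValueError("true_cases must not be empty")

    def sensitivity(flagged):
        return len(true_cases & flagged) / len(true_cases)

    return (sensitivity(flagged_a),
            sensitivity(flagged_b),
            sensitivity(flagged_a | flagged_b))


if __name__ == "__main__":
    # Hypothetical counts: 30 cases and 70 controls inside a subgroup,
    # 100 cases and 800 controls outside it.
    print(odds_ratio_ci(30, 70, 100, 800))

    # Hypothetical patient IDs: ten true cases, two partly overlapping
    # sets of patients flagged by the two methods.
    cases = set(range(1, 11))
    print(union_sensitivity(cases, {1, 2, 3, 11}, {3, 4, 5, 12}))
```

An odds ratio is conventionally read as significant at the 5% level when its 95% interval excludes 1. The union can never have lower sensitivity than either method alone, though it may also flag more non-cases, which this sketch does not measure.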
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-102-1.pdf | 3.89 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.