Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/90239
Title: | 區辨未曾吸菸者之良性肺病灶與早期肺癌:以術前結構化問卷為方法 Distinguishing Benign Pulmonary Lesions Misidentified as Early-Stage Lung Cancer in Never-Smokers via a Structured Questionnaire Before Resection |
Authors: | 黃序立 Hsu-Li Huang |
Advisor: | 陳保中 Pau-Chung Chen |
Keyword: | Lung cancer, Benign pulmonary lesions, Never-smoker, Multiple logistic regression, Machine learning models |
Publication Year: | 2023 |
Degree: | Master's |
Abstract: | Objective: This study aimed to characterize pulmonary lesions that were initially suspected to be lung cancer but were confirmed as benign after surgical resection, and to compare the performance of multiple logistic regression and machine learning models in distinguishing benign from malignant lesions.
Methods: Data from 473 participants enrolled in the Exposome-Wide Approach Study of Lung Adenocarcinoma in Never-Smokers (ExWAS-LANS), conducted at National Taiwan University Hospital between November 2021 and October 2022, were analyzed. Structured questionnaires were used to collect information on demographics, diet, lifestyle, living environment, and personal and family medical history. Multiple logistic regression and four machine learning models (classification and regression tree, random forest, support vector machine, and artificial neural network) were used to identify factors associated with the misclassification of benign lesions as lung cancer.
Results: Several risk factors were significantly associated with the misidentification of benign pulmonary lesions as lung cancer: a history of diabetes mellitus, higher household income, living in a newer house, living in a house that had never been painted, frequently noticing unusual odors near the residence, and higher three-year cumulative PM2.5 exposure. The multiple logistic regression model and the machine learning models had comparable F1-scores for identifying benign cases misclassified as cancer. On closer analysis, the machine learning models achieved markedly higher precision for the benign class, whereas the multiple logistic regression model achieved higher recall.
Conclusion: This study identified factors associated with the misidentification of benign pulmonary lesions as lung cancer in never-smokers and compared the effectiveness of multiple logistic regression and machine learning models. Further research is warranted to improve understanding of these factors, to develop more accurate predictive models for clinical implementation, and to validate these models in real-world clinical settings. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/90239 |
DOI: | 10.6342/NTU202301117 |
Fulltext Rights: | Authorization granted (open access worldwide) |
Appears in Collections: | Institute of Environmental and Occupational Health Sciences |
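Note on the metrics named in the abstract: the abstract reports that the models had comparable F1-scores for the benign class while differing in precision and recall. The minimal Python sketch below shows how that can happen. The function name `benign_class_metrics` and all counts are hypothetical and chosen only for illustration; they are not results from the thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassMetrics:
    precision: float
    recall: float
    f1: float


def benign_class_metrics(tp: int, fp: int, fn: int) -> ClassMetrics:
    """Precision, recall, and F1 for the benign class.

    tp: benign lesions correctly predicted as benign
    fp: malignant lesions wrongly predicted as benign
    fn: benign lesions wrongly predicted as malignant
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return ClassMetrics(precision, recall, f1)


# Hypothetical counts (NOT from the thesis): 60 truly benign lesions in both cases.
high_precision = benign_class_metrics(tp=30, fp=10, fn=30)  # fewer benign calls, mostly right
high_recall = benign_class_metrics(tp=45, fp=45, fn=15)     # more benign calls, more false alarms

print(high_precision)
print(high_recall)
```

With these illustrative counts the two classifiers swap precision and recall yet arrive at the same F1, which is why the abstract reports precision and recall separately in addition to the F1-score.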
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-111-2.pdf | 2.17 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.