Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98577
| Title: | 結合全基因組關聯性研究、多基因風險評分與機器學習方法揭示日光角化症之遺傳結構 A genome-wide association study with polygenic risk scoring and machine learning approaches reveals novel genetic architecture of actinic keratosis |
| Author: | 李育明 Yu-Ming Lee |
| Advisor: | 莊曜宇 Eric Y. Chuang |
| Co-advisor: | 安莉達 Amrita Chattopadhyay |
| Keywords: | Actinic keratosis, genome-wide association study (GWAS), polygenic risk score (PRS), skin cancer, machine learning |
| Publication year: | 2025 |
| Degree: | Master's |
| Abstract: | Actinic keratosis (AK) is a common epidermal skin lesion that is considered a precancerous precursor of skin cancer and has a high prevalence worldwide. Understanding its genetic architecture and developing effective risk prediction tools are crucial for early intervention. This study leveraged genomic and phenotypic data from the UK Biobank to investigate genetic susceptibility to AK. We identified individuals with AK (cases) and matched them 1:10 with controls on age and sex using propensity score matching. Following stringent quality control, a genome-wide association study (GWAS) was performed on the case-control cohort, adjusting for age, sex, and the first 20 principal components. The quantile-quantile (QQ) plot showed a lambda value of 1.036, indicating well-controlled confounding. Our GWAS identified 48 independent genome-wide significant single nucleotide polymorphisms (SNPs) (P < 5×10⁻⁸), with 13 lead SNPs mapping to four novel risk loci on chromosomes 6, 11, 16, and 20. Approximately 70% of these significant SNPs were intronic, while others were intergenic. Functional annotation of these SNPs revealed 65 significant genes, predominantly associated with UV exposure, skin pigmentation, and immune regulation. Furthermore, expression quantitative trait locus (eQTL) and MAGMA gene-set enrichment analyses highlighted connections to skin pigmentation regulation, melanoma, and other skin carcinomas.
Using these GWAS findings, we developed a polygenic risk score (PRS) with the clumping and thresholding (C+T) method (PRSice-2), which on its own achieved an area under the curve (AUC) of 0.58 for predicting AK. Integrating the PRS with lifestyle and phenotypic factors, including skin and hair color, smoking and alcohol frequency, and sun exposure habits (summer outdoor time, sunscreen use frequency), significantly improved our predictive models. Among the machine learning models evaluated (Logistic Regression, Random Forest, XGBoost), the Logistic Regression model performed best, achieving an AUC of 0.626. Importantly, the established PRS was also significantly associated with an increased risk of melanoma and non-melanoma skin cancer. These findings provide new insight into the genetic basis of AK and highlight the potential of combined genetic and lifestyle risk assessment to improve the prediction of AK and, potentially, other skin cancers. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/98577 |
| DOI: | 10.6342/NTU202503272 |
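| Full-text authorization: | Authorized (open access on campus only) |
| Electronic full-text release date: | 2025-08-26 |
| Appears in collections: | Graduate Institute of Biomedical Electronics and Bioinformatics |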
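
The abstract reports a QQ-plot lambda of 1.036 as evidence that confounding was well controlled. As a minimal sketch, assuming this lambda is the usual genomic inflation factor (the median observed 1-df chi-square statistic divided by its expected median under the null), it could be computed from a list of GWAS p-values as follows. The function name `genomic_inflation` and the input format are hypothetical and are not taken from the thesis.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom
# (the square of the 75th percentile of the standard normal).
CHI2_1DF_MEDIAN = 0.4549364231195724


def genomic_inflation(p_values):
    """Return lambda_GC = median(observed chi-square) / expected median.

    Assumes two-sided p-values from 1-df association tests
    (hypothetical input format, not the thesis pipeline).
    """
    std_normal = NormalDist()
    chi2_stats = []
    for p in p_values:
        if not 0.0 < p <= 1.0:
            raise ValueError(f"p-value out of range: {p}")
        # Two-sided p-value -> z-score (lower tail) -> 1-df chi-square.
        z = std_normal.inv_cdf(p / 2.0)
        chi2_stats.append(z * z)
    return median(chi2_stats) / CHI2_1DF_MEDIAN


if __name__ == "__main__":
    # Evenly spaced p-values mimic a null (uniform) distribution.
    n = 1001
    null_p = [(i - 0.5) / n for i in range(1, n + 1)]
    print(round(genomic_inflation(null_p), 3))
```

Under the null, p-values are uniform and the ratio is close to 1. Values well above 1 suggest inflation from population stratification or other confounding, so a value such as 1.036 is generally read as modest.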
Files in this item:
| File | Size | Format | |
|---|---|---|---|
| ntu-113-2.pdf Access restricted to NTU campus IP addresses (off campus, please use the VPN remote-access service) | 11.6 MB | Adobe PDF |
Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically stated otherwise.
