Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96801

| Title: | Integrated Approaches Utilizing Single Nucleotide Polymorphism and Gene Expression for Exploring Biomarkers and Potential Drug Repurposing in Taiwanese Breast Cancers |
| Author: | 徐于晴 Yu-Ching Hsu |
| Advisor: | 盧子彬 Tzu-Pin Lu |
| Keywords: | breast cancer, genome-wide association study, polygenic risk score, deconvolution, deep learning, drug repurposing |
| Year of publication: | 2024 |
| Degree: | Doctoral |
| Abstract: | Precision medicine aims to provide tailored treatment for individuals by drawing on several types of patient information, including genetic, lifestyle, and environmental factors. Realizing this concept requires addressing two major objectives: disease subtyping and individualized treatment. 
Identifying molecular biomarkers helps advance disease subtyping, and the accumulating evidence in pharmacogenomics can support more precise treatment recommendations for patients. In this study, we propose integrated approaches that use single nucleotide polymorphism and gene expression data to identify biomarkers and potential drug repurposing candidates for Taiwanese breast cancers. Genome-wide association studies (GWASs) examine hundreds of thousands of genetic variants simultaneously to identify potential disease-associated biomarkers, and polygenic risk score (PRS) analyses use GWAS results to build prediction models for disease risk. Previous breast cancer GWASs were mostly conducted in Caucasian populations, so their results may not apply to other populations. We therefore conducted a large GWAS in the Taiwanese population. A total of 12,480 participants from China Medical University Hospital (CMUH), comprising 2,496 cases and 9,984 controls, were included in the GWAS and PRS analyses. We identified 113 single-nucleotide polymorphisms (SNPs) associated with breast cancer, 50 of which are novel. PRS analyses showed statistically significant positive trends for all breast cancers and for the luminal A subtype; the odds ratios (95% confidence intervals) for the highest-PRS groups were 5.33 (3.79–7.66) for all breast cancers and 3.55 (2.13–6.14) for the luminal A subtype. Recently, large-scale drug response data have been profiled across collections of cancer cell lines. Translating this pharmacogenomic knowledge from in vitro to in vivo settings is crucial for advancing drug repurposing. We aimed to address this by deconvoluting tumors into cancer cell lines and developing a corresponding drug response prediction algorithm based on the deconvolution results. 
We adopted the architecture of a previously developed deep-learning model for analyzing cell composition to train new models for tumor deconvolution, which we call the Scaden-CA models, covering 18 cancer types, including breast cancer, with one model per cancer type. The models were trained and tested on simulated data generated from single-cell RNA-seq data of cancer cell lines and then validated on bulk RNA-seq data from the Cancer Cell Line Encyclopedia (CCLE). The Scaden-CA models performed well in model testing (concordance correlation coefficient > 0.9 across all cancers) and model validation (correct deconvolution rate > 70% across most cancers). We further applied the models to real tumor RNA-seq data from The Cancer Genome Atlas (TCGA) and developed a drug response prediction algorithm. TCGA mutation and gene expression data were also used to investigate drug repurposing and its underlying mechanisms. To investigate gene expression differences between the Taiwanese population and other populations, and to infer drug repurposing candidates for Taiwanese breast cancers, we first imputed gene expression from the genotype profiles of CMUH breast cancer patients using PrediXcan. In the information provided by PredictAP, about 78.4% of genes showed differences between East Asian and non-Finnish European populations. Certain genes, including breast cancer-associated genes, showed expression disparities in the Taiwanese population, while most genes showed expression patterns similar to those of the East Asian population. These findings may help explain differences in the clinical traits of breast cancers. To infer drug repurposing candidates for Taiwanese breast cancers, we used PredictAP to filter out imputed genes that may be biased by population differences. We then used the remaining genes to retrieve the corresponding breast cancer-gene-drug combinations from the TCGA drug repurposing results. 
In this way, we can not only identify novel and existing breast cancer-gene-drug combinations but also explore the associations between SNPs, gene expression, and drug repurposing. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96801 |
| DOI: | 10.6342/NTU202404805 |
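| Full-text authorization: | Not authorized |
| Electronic full-text release date: | N/A |
| Appears in collections: | 生物資訊學國際研究生博士學位學程 (International Graduate Doctoral Degree Program in Bioinformatics) |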
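
The abstract reports three standard quantities: a polygenic risk score, an odds ratio with a 95% confidence interval, and a concordance correlation coefficient. The sketch below shows one common textbook way to compute each. It is illustrative only and is not taken from the thesis: the function names and example numbers are hypothetical, the PRS is assumed to be a simple weighted sum of risk-allele dosages, and the interval is assumed to be a Wald interval from a 2x2 table, whereas the reported intervals may well come from a regression model.

```python
import math
from typing import Sequence, Tuple


def polygenic_risk_score(dosages: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of risk-allele dosages (0, 1, or 2 per SNP).

    `weights` stand in for per-SNP GWAS effect sizes (log odds ratios);
    this simple additive form is an assumption, not the thesis's method.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def odds_ratio_wald_ci(a: int, b: int, c: int, d: int,
                       z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: cases in the highest-PRS group,  b: controls in that group
    c: cases in the reference group,    d: controls in the reference group
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def concordance_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Lin's concordance correlation coefficient (population-variance form)."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((xi - mean_x) ** 2 for xi in x) / n
    var_y = sum((yi - mean_y) ** 2 for yi in y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    print(polygenic_risk_score([0, 1, 2], [0.1, 0.2, 0.3]))
    print(odds_ratio_wald_ci(20, 10, 10, 20))
    print(concordance_correlation([0.2, 0.5, 0.3], [0.25, 0.45, 0.30]))
```

Unlike Pearson correlation, the concordance coefficient penalizes systematic offsets between predicted and true values, which is presumably why it suits checking predicted cell-line fractions against simulated ground truth.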
Files in this item:
| File | Size | Format | |
|---|---|---|---|
| ntu-113-1.pdf Not authorized for public access | 5.21 MB | Adobe PDF |
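Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.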
