Please use this identifier to cite or link to this item:
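http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/48426
| Title: | Concurrent Analysis between Copy Number Variation and Gene Expression of Female Non-Smoking Lung Cancer in Taiwan (Chinese title, translated: Developing an Integrated Analysis Method for Copy Number Variation and Gene Expression: The Case of Non-Smoking Female Lung Cancer in Taiwan) |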
| Authors: | Jung-Chih Chang 張榕芝 |
| Advisor: | 莊曜宇 (Eric Y. Chuang) |
| Keyword: | copy number, gene expression, concurrent analysis, gene set analysis |
| Publication Year: | 2011 |
| Degree: | Master's |
| Abstract: | To identify genes with genomic alterations and/or transcriptional dysregulation, and to explore their mechanisms, a concurrent analysis method was developed to integrate copy number (CN) and gene expression (GE) profiles at the DNA and RNA levels. The study has three major parts: measuring the correlation between CN and GE, performing pathway analysis with Gene Set Enrichment Analysis (GSEA), and using a scoring method to summarize the pathways enriched by CN, by GE, and by the correlation between CN and GE. Two datasets were analyzed to evaluate the performance of the method. The first came from 44 non-smoking female lung cancer patients and consisted of paired normal and tumor tissues. The second was retrieved from the Gene Expression Omnibus: the GSE19539 ovarian cancer samples, which comprise two major subtypes, endometrioid and serous. In both datasets, CN and GE microarray data were available from the same individual. Copy number was measured with the Affymetrix SNP 6.0 array in both datasets. Gene expression was measured with the Affymetrix U133plus 2.0 array in the first dataset and the Affymetrix 1.0 ST array in the second. To further explore the identified pathways, a Support Vector Machine (SVM) was used for classification. The classification results showed higher prediction sensitivity and specificity than traditional analysis methods. In addition, integrating data from both the DNA and RNA levels is more biologically meaningful and may reveal more about disease-causing genes and their regulatory mechanisms. In summary, the results indicate that concurrent analysis may help identify potential biomarkers with lower false positive rates, which can benefit clinical diagnosis and basic research. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/48426 |
| Fulltext Rights: | Paid license |
| Appears in Collections: | Graduate Institute of Biomedical Electronics and Bioinformatics |
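
The first step named in the abstract, measuring the correlation between CN and GE for each gene, might look roughly like the sketch below. It is an illustration only: the abstract does not say which correlation statistic the thesis used, so Pearson correlation is an assumption, and the function names (`pearson`, `cn_ge_correlation`), gene labels, and toy values are hypothetical and not taken from the thesis data.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # Correlation is undefined when one profile is constant.
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def cn_ge_correlation(cn, ge):
    """Per-gene CN/GE correlation for genes present in both inputs.

    cn, ge: dict mapping gene name -> list of values, one per sample,
    with samples in the same order in both dicts (assumed layout).
    """
    return {gene: pearson(cn[gene], ge[gene]) for gene in cn.keys() & ge.keys()}


# Toy values for four samples (hypothetical, for illustration only).
cn = {"GENE_A": [2.0, 3.0, 4.0, 5.0], "GENE_B": [2.0, 2.0, 2.0, 2.0]}
ge = {"GENE_A": [1.0, 2.0, 3.0, 4.0], "GENE_B": [5.0, 6.0, 7.0, 8.0]}
corr = cn_ge_correlation(cn, ge)
```

In a pipeline of the kind the abstract describes, per-gene values like these could then serve as one of the rankings passed to the pathway analysis, alongside the CN and GE rankings themselves; how the thesis actually combines them is not specified in this record.
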
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-100-1.pdf (Restricted Access) | 2.44 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
