Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/48426
| Title: | 建構拷貝數變異與基因表現之整合性分析方法--以台灣非抽菸女性肺癌為例 Concurrent Analysis between Copy Number Variation and Gene Expression of Female Non-Smoking Lung Cancer in Taiwan |
| Author: | Jung-Chih Chang 張榕芝 |
| Advisor: | 莊曜宇 (Eric Y. Chuang) |
| Keywords: | copy number, gene expression, concurrent analysis, gene set analysis |
| Publication year: | 2011 |
| Degree: | Master's |
| Abstract: | To identify genes with genomic alterations and/or transcriptional dysregulation, and to explore their mechanisms, this study developed a concurrent analysis method that integrates copy number (CN) and gene expression (GE) profiles at the DNA and RNA levels. The method has three major parts: measuring the correlation between CN and GE, performing pathway analysis with Gene Set Enrichment Analysis (GSEA), and using a scoring method to summarize the pathways enriched by CN, by GE, and by the correlation between the two. Two datasets were analyzed to evaluate the method's performance. The first came from 44 non-smoking female lung cancer patients and consists of paired normal and tumor tissues. The second was retrieved from the Gene Expression Omnibus: the GSE19539 ovarian cancer samples, which fall into two major subtypes, endometrioid and serous. Both datasets have CN and GE microarray data from the same individuals. Copy number was analyzed with the Affymetrix SNP 6.0 array in both datasets. Gene expression was profiled with the Affymetrix U133plus 2.0 array in the lung cancer dataset and the Affymetrix 1.0 ST array in the GSE19539 ovarian cancer dataset. To further explore the identified pathways, a Support Vector Machine (SVM) was used for classification; the classification results showed higher prediction sensitivity and specificity than traditional analysis methods. In addition, integrating data from the DNA and RNA levels gives a deeper understanding of the biological regulatory mechanisms of disease and of disease-causing genes, and gives the selected biomarkers more biological meaning and statistical confidence. In summary, the results indicate that concurrent analysis may help identify potential biomarkers with a lower false positive rate, which could benefit clinical diagnosis and basic research. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/48426 |
| Full-text license: | Paid license |
| Appears in collections: | Graduate Institute of Biomedical Electronics and Bioinformatics |
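
The first step named in the abstract, measuring the correlation between CN and GE, can be illustrated with a minimal sketch. The record does not say which correlation measure the thesis uses, so Pearson correlation is an assumption here, and the function name `per_gene_correlation` and the toy matrices are hypothetical, not data or code from the thesis.

```python
import numpy as np

def per_gene_correlation(cn, ge):
    """Pearson correlation between CN and GE, computed separately for each gene.

    cn, ge: arrays of shape (n_genes, n_samples) whose rows (genes) and
    columns (samples) are already matched. Genes with zero variance in
    either profile get NaN, since the correlation is undefined there.
    """
    cn = np.asarray(cn, dtype=float)
    ge = np.asarray(ge, dtype=float)
    if cn.shape != ge.shape:
        raise ValueError("cn and ge must have the same shape")

    # Center each gene's profile across samples.
    cn_c = cn - cn.mean(axis=1, keepdims=True)
    ge_c = ge - ge.mean(axis=1, keepdims=True)

    # Row-wise covariance numerator and normalizing denominator.
    num = (cn_c * ge_c).sum(axis=1)
    den = np.sqrt((cn_c ** 2).sum(axis=1) * (ge_c ** 2).sum(axis=1))

    with np.errstate(divide="ignore", invalid="ignore"):
        r = num / den
    r[den == 0] = np.nan
    return r

# Toy, made-up values: 3 genes measured in 4 samples.
cn = np.array([[2.0, 3.0, 4.0, 5.0],   # CN rises with GE
               [2.0, 2.0, 2.0, 2.0],   # no CN variation
               [1.0, 2.0, 3.0, 4.0]])  # CN rises while GE falls
ge = np.array([[1.0, 2.0, 3.0, 4.0],
               [5.0, 6.0, 7.0, 8.0],
               [8.0, 6.0, 4.0, 2.0]])

r = per_gene_correlation(cn, ge)
print(r)
```

Genes whose expression tracks their copy number get a coefficient near 1, which is the kind of per-gene score that a later pathway-level step could take as input. A rank-based measure such as Spearman correlation would be an equally plausible choice.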
Files in this item:
| File | Size | Format | |
|---|---|---|---|
| ntu-100-1.pdf (not authorized for public access) | 2.44 MB | Adobe PDF | |
Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise specified.
