Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/59941
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 莊曜宇(Eric Y. Chuang) | |
dc.contributor.author | Yao-Wei Jheng | en |
dc.contributor.author | 鄭耀瑋 | zh_TW |
dc.date.accessioned | 2021-06-16T09:46:20Z | - |
dc.date.available | 2023-12-31 | |
dc.date.copyright | 2017-02-16 | |
dc.date.issued | 2017 | |
dc.date.submitted | 2017-01-24 | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/59941 | - |
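dc.description.abstract | Allelic imbalance refers to a significant difference in expression level between the two alleles of a gene, and it has recently earned a place of research value in cancer biology. However, the actual mechanisms and effects of allelic imbalance are not yet fully understood, so extracting allelic expression information from existing data is important. With advances in next-generation sequencing, genes showing allelic imbalance can be identified through bioinformatics methods. We therefore propose a method that identifies genes with allelic imbalance from RNA and DNA next-generation sequencing data. The method uses heterozygous single-nucleotide variant sites to locate the expression values of the two alleles, computes the degree of imbalance from the difference between the allele ratios in the RNA and DNA sequencing data, and applies a mathematical model to find genes with significant differences. We applied this method to two datasets to study allelic expression.
In the results, we used the GSE48216 dataset to examine allelic imbalance profiles and their relationship to tissue subtype grouping. Across 16 ER-positive cell lines, each cell line had on average 119 genes with allelic imbalance. Most of these genes were on the X chromosome, reflecting the inactivation of one X chromosome in female cells, and they served as the positive control for our method. Although most of these genes were expressed from the major allele, a few genes expressed from the minor allele carried potentially risky non-synonymous single-nucleotide variants related to ER and cancer pathways, such as HSD17B7, IRAK, and XIAP. We therefore consider that some genes with allelic imbalance may be related to carcinogenesis. In addition, we observed different allelic imbalance profiles between tissue subtypes, indicating that subtype classification may affect allelic analysis. We also examined genes showing allelic imbalance only in tumor tissue in an intrahepatic cholangiocarcinoma dataset (GSE63420). Among an average of 573 genes per patient, we found hotspots of allelic imbalance and genes with risky non-synonymous single-nucleotide variants, such as IDH1, which has previously been reported in connection with this disease and which persistently expressed the sequence carrying the risk-associated missense mutation at R132. Taken together, our results also imply that some genes with allelic imbalance may influence carcinogenesis through other mechanisms. Here we established a method for identifying genes with allelic imbalance. Our results also show that allelic imbalance may affect dysregulation in cancer, and this method and analysis can provide more information for future basic research on alleles and for studies of genomic variants. | zh_TW |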
dc.description.abstract | Allelic imbalance, a difference in expression abundance between the two alleles of a gene, has recently become a notable topic in cancer biology. However, the mechanisms and effects of allelic imbalance are not yet fully understood. With advances in next-generation sequencing, allelic imbalance can be identified and quantified through bioinformatics algorithms. Here, we developed a mathematical method to identify genes with allelic imbalance by jointly analyzing RNA-Seq and DNA-Seq data. First, we called heterozygous SNVs from both RNA-Seq and Exome-Seq data. After integrating SNVs to the gene level, we computed the allele ratio as the ratio of read depth between the two alleles. Next, we identified allelic imbalance genes from the difference between the RNA and DNA allele ratios. We applied our method to two existing datasets to study allelic imbalance.
We used GSE48216 to examine the profiles and consistency of allelic imbalance across breast cancer subtypes. In 16 ER-positive breast cancer cell lines, we identified on average 119 genes with significant allelic imbalance. Some of these genes were located on the X chromosome and served as positive controls, because their imbalance results from X-inactivation in female cells. Although the transcripts of allelic imbalance genes were largely from major alleles, some non-synonymous SNVs in these genes were associated with ER and cancer pathways, for example in HSD17B7, IRAK, and XIAP. We suggest that these variants pose a potential risk for tumorigenesis. In addition, we found different allelic imbalance profiles and consistency between ER+ and ER- breast cancer cell lines, which suggests that subtype is an important factor in allele-expression studies. We also screened NGS data from 7 patients with intrahepatic cholangiocarcinoma (GSE63420) to study the profiles of tumor-specific allelic imbalance genes. Among an average of 573 genes per patient, we found hotspots of allelic imbalance and genes with potentially risky non-synonymous SNVs, such as IDH1, in which the transcript carrying the missense mutation at R132 was dominant. Taken together, the results indicate that some allelic imbalance genes might influence tumorigenesis through another, as yet unknown, mechanism. In summary, we proposed a method to call allelic imbalance genes. Our results also point to a role for allelic imbalance in tumor dysregulation. Moreover, our method and analysis can be applied to uncover more of the basic mechanisms of allelic imbalance and to provide more information for variant studies. | en |
dc.description.provenance | Made available in DSpace on 2021-06-16T09:46:20Z (GMT). No. of bitstreams: 1 ntu-106-R03945019-1.pdf: 2792508 bytes, checksum: 9ab1de3b806e05b0c31012758d496713 (MD5) Previous issue date: 2017 | en |
dc.description.tableofcontents | Acknowledgements; Chinese Abstract; Abstract; Chapter 1 Introduction; 1.1 Single nucleotide variants and allele-specific expression; 1.2 Allelic imbalance; 1.3 Allele-specific expression by high-throughput analysis; 1.4 Bioinformatics and current algorithms on allelic imbalance; 1.5 Motivation and aims; Chapter 2 Materials and Methods; 2.1 Quality check of NGS data; 2.2 Data processing for next generation sequencing; 2.2.1 Alignment; 2.2.2 Data pre-processing; 2.2.3 Variant calling; 2.2.4 Annotation and data extraction; 2.3 Allele-specific expression analysis; 2.4 Dataset collection; 2.4.1 Breast cancer cell line; 2.4.2 Intrahepatic cholangiocarcinoma; Chapter 3 Results; 3.1 Overview of application of studying allelic imbalance genes; 3.1.1 Application: the consistency of profiles of ER positive breast cell lines; 3.1.2 Application: exploring the allelic imbalance profiles in intrahepatic cholangiocarcinoma and normal tissues; Chapter 4 Discussion; 4.1 Model performance; 4.2 Model limitation; 4.3 The role of allelic imbalance and biological meaning; Chapter 5 Conclusion; References; Appendix | |
dc.language.iso | en | |
dc.title | 利用次世代定序資料探討等位基因表現不平衡之分析方法 (An Analysis Method for Investigating Allelic Imbalance Using Next-Generation Sequencing Data) | zh_TW |
dc.title | A Method for Characterizing Allelic Imbalance by Next Generation Sequencing | en |
dc.type | Thesis | |
dc.date.schoolyear | 105-1 | |
dc.description.degree | Master's | |
dc.contributor.coadvisor | 蕭自宏(Tzu-Hung Hsiao) | |
dc.contributor.oralexamcommittee | 賴亮全(Liang-Chuan Lai),蔡孟勳(Mon-Hsun Tsai),盧子彬(Tzu-Pin Lu) | |
dc.subject.keyword | Allelic imbalance, Genetic variation, Next generation sequencing, Single nucleotide variant | zh_TW |
dc.subject.keyword | Allelic imbalance, Genetic variation, Next generation sequencing, SNV | en |
dc.relation.page | 53 | |
dc.identifier.doi | 10.6342/NTU201700207 | |
dc.rights.note | Paid license | |
dc.date.accepted | 2017-01-24 | |
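dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW |
dc.contributor.author-dept | Graduate Institute of Biomedical Electronics and Bioinformatics | zh_TW |
Appears in Collections: | Graduate Institute of Biomedical Electronics and Bioinformatics |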
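
The abstract describes the method only in outline: heterozygous SNVs are pooled to the gene level, an allele ratio is computed from read depth, and genes are called from the difference between the RNA and DNA ratios. A minimal Python sketch of that outline follows. It is not the thesis's implementation: the function name, the input record layout, the unphased pooling of reference and alternate reads, and the fixed `min_depth` and `min_diff` cutoffs (which stand in for the unspecified mathematical model) are all assumptions made for illustration.

```python
from collections import defaultdict


def allelic_imbalance_candidates(snvs, min_depth=20, min_diff=0.3):
    """Flag genes whose RNA allele ratio departs from their DNA allele ratio.

    `snvs` is an iterable of dicts describing heterozygous SNVs with the
    hypothetical keys: gene, rna_ref, rna_alt, dna_ref, dna_alt (read depths).
    Returns {gene: (rna_ratio, dna_ratio, rna_ratio - dna_ratio)}.
    """
    # Pool read depths per gene. Assumption: reference and alternate reads are
    # summed without phasing, which is a simplification.
    totals = defaultdict(lambda: [0, 0, 0, 0])
    for snv in snvs:
        pooled = totals[snv["gene"]]
        pooled[0] += snv["rna_ref"]
        pooled[1] += snv["rna_alt"]
        pooled[2] += snv["dna_ref"]
        pooled[3] += snv["dna_alt"]

    candidates = {}
    for gene, (rna_ref, rna_alt, dna_ref, dna_alt) in totals.items():
        rna_depth = rna_ref + rna_alt
        dna_depth = dna_ref + dna_alt
        # Skip genes with too little coverage to estimate a ratio.
        if rna_depth < min_depth or dna_depth < min_depth:
            continue
        rna_ratio = rna_alt / rna_depth  # alternate-allele fraction in RNA
        dna_ratio = dna_alt / dna_depth  # alternate-allele fraction in DNA
        diff = rna_ratio - dna_ratio
        # Assumption: a fixed cutoff replaces the thesis's statistical model.
        if abs(diff) >= min_diff:
            candidates[gene] = (rna_ratio, dna_ratio, diff)
    return candidates


# Made-up read depths for two hypothetical genes, for illustration only.
example = [
    {"gene": "GENE_A", "rna_ref": 90, "rna_alt": 10, "dna_ref": 50, "dna_alt": 50},
    {"gene": "GENE_B", "rna_ref": 48, "rna_alt": 52, "dna_ref": 50, "dna_alt": 50},
]
print(allelic_imbalance_candidates(example))
```

Using the DNA ratio as the baseline, rather than assuming a 50:50 split, means a gene with a skewed DNA allele ratio (for example from a copy number change) is not called imbalanced merely because its RNA ratio is skewed to the same degree.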
Files in This Item:
File | Size | Format | |
---|---|---|---|
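ntu-106-1.pdf (not currently authorized for public access) | 2.73 MB | Adobe PDF |
Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.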