NTU Theses and Dissertations Repository
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/90340
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 許書睿 | zh_TW
dc.contributor.advisor | Shu-Jui Hsu | en
dc.contributor.author | 陳品瑄 | zh_TW
dc.contributor.author | Pin-Xuan Chen | en
dc.date.accessioned | 2023-09-26T16:20:33Z | -
dc.date.available | 2023-11-09 | -
dc.date.copyright | 2023-09-26 | -
dc.date.issued | 2023 | -
dc.date.submitted | 2023-08-04 | -
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/90340 | -
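dc.description.abstract | Germline structural variations (SVs) are defined as large genomic alterations longer than 50 bases. Compared with single-nucleotide variants (SNVs), they take more complex forms and affect gene function over a wider range. In recent years, many large genome sequencing projects, such as the Genome Aggregation Database (gnomAD), have detected SVs in large multi-population cohorts and explored their population allele frequencies and functional impact. At the level of sequencing analysis, however, there is still no single high-confidence, accurate method for measuring the detection performance of different algorithms; at the level of population genomics, no large-scale SV detection study has yet targeted the Taiwanese population. In this study we used the large deletion and insertion sites on the genome of the international reference sample HG002, released by the Genome in a Bottle Consortium (GIAB), to measure the performance of nine tools built on different algorithms. This thesis compares the detection ability and characteristics of the tools and suggests that merging the results of all tools gives the best sensitivity, with recall reaching 82% for deletions and 67% for insertions. We then applied this strategy to 1,484 whole-genome sequencing samples from the Taiwan Biobank and detected 81,268 deletions and 81,235 insertions in total, an average of 7,526 large deletions and insertions per individual. By allele frequency, 84% were rare variants with an allele frequency below 1%, and 48% were singletons carried by only one individual. Comparing the computed population allele frequencies with those of the East Asian population in gnomAD showed that the two datasets agree closely, with correlation coefficients of 0.93 for deletions and 0.89 for insertions. In addition, by analyzing the effects of variants on genes, we found that some variants fall in genes on the ACMG secondary findings list that carry potential disease risk, and we estimated the carrier rate of alpha thalassemia, a genetic disease common among Taiwanese, at about 5.12%, similar to previous studies.
This study established a high-confidence benchmarking workflow and evaluated an optimized detection strategy to help build an SV database for the Taiwanese population, and computed the allele frequencies of population variants. The variant call set and allele frequency information will be released in the future for use in clinical molecular diagnosis and related research.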
zh_TW
dc.description.abstract | Structural variants (SVs) are defined as genomic changes larger than 50 bp and have a functional impact on human genomes. Large public databases such as the Genome Aggregation Database (gnomAD) have constructed population SV profiles using different detection methods. However, there has been no standard method for evaluating SV calling tools built on different algorithms, and no SV profile constructed specifically for the Taiwanese population. We used a benchmark truth set of insertion and deletion variants (NIST_SV0.6) on HG002, released by the Genome in a Bottle (GIAB) consortium, to evaluate nine established SV callers. Our results suggest that combining multiple callers is the most sensitive strategy, raising the overall recall to as much as 82% for deletions and 67% for insertions. A total of 81,268 deletions and 81,235 insertions were discovered in 1,484 short-read whole genome sequencing samples from the Taiwan Biobank, an average of 7,526 SVs per individual. By cohort frequency, 84% of SVs were rare and 48% were singletons among Taiwanese. Moreover, the population SV allele frequencies were compared with those of the East Asian population in the gnomAD-SV database, giving correlation coefficients of 0.93 for deletions and 0.89 for insertions. Finally, we annotated the SV call set and found that some SVs overlapped genes on the ACMG secondary findings list. In addition, we estimated a carrier rate of 5.12% for alpha thalassemia in Taiwanese, similar to previous estimates.
In conclusion, this study used a robust benchmarking process and constructed an optimized SV detection workflow for Taiwan Biobank WGS data. Through the cohort SV profile, we characterized the allele frequency distribution in the general population. The cohort SV call set and the population allele frequencies will be released to facilitate molecular diagnosis and human genetics research for the Taiwanese population.
en
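The abstract classifies SVs by cohort frequency as rare or singleton. As a rough illustration only, the Python sketch below shows one way such a classification could be computed for a single site. It is not the thesis's pipeline: the function name `summarize_site`, the 0/1/2 genotype coding, and the treatment of "rare" as an allele frequency below 1% and "singleton" as exactly one carrier are assumptions made here for the example.

```python
from typing import Dict, List, Union


def summarize_site(
    genotypes: List[int], rare_threshold: float = 0.01
) -> Dict[str, Union[float, int, bool]]:
    """Summarize one SV site in a cohort of diploid samples.

    `genotypes` holds the number of alternate alleles per sample
    (0, 1, or 2); this coding is an assumption of the sketch.
    """
    if not genotypes:
        raise ValueError("genotypes must not be empty")
    if any(g not in (0, 1, 2) for g in genotypes):
        raise ValueError("genotypes must be coded as 0, 1, or 2")

    # Allele frequency: alternate alleles over all chromosomes (2 per sample).
    allele_frequency = sum(genotypes) / (2 * len(genotypes))
    # Carriers: samples with at least one alternate allele.
    carriers = sum(1 for g in genotypes if g > 0)

    return {
        "allele_frequency": allele_frequency,
        "carriers": carriers,
        # Hypothetical definitions: rare = observed with AF below the threshold,
        # singleton = carried by exactly one individual.
        "rare": 0 < allele_frequency < rare_threshold,
        "singleton": carriers == 1,
    }


if __name__ == "__main__":
    # One heterozygous carrier among 200 samples.
    print(summarize_site([1] + [0] * 199))
```

Under these definitions a singleton in a large cohort is also rare, so the two labels overlap rather than partition the call set.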
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-09-26T16:20:33Z. No. of bitstreams: 0 | en
dc.description.provenance | Made available in DSpace on 2023-09-26T16:20:33Z (GMT). No. of bitstreams: 0 | en
dc.description.tableofcontents | Oral Examination Committee Certification
Acknowledgements
Chinese Abstract
English Abstract
List of Abbreviations
List of Figures
List of Tables
Chapter 1 Introduction
Chapter 2 Study Materials & Methods
2.1 Study Materials
2.1.1 GIAB NIST_SV0.6 truth set
2.1.2 International standard sample input
2.1.3 Taiwan Biobank sample information
2.2 Study Methods
2.2.1 Sequence preprocessing and variant calling
2.2.2 Variant filtration and VCF format change
2.2.3 SV merging and accuracy evaluation
2.2.4 Functional annotation and correlation of allele frequency
Chapter 3 Results
3.1 Truth set composition and filtration
3.1.1 The identification of truth subset with 8,103 SVs for benchmarking
3.1.2 The call-set composition and size distribution in the truth subset
3.2 Benchmarking of structural variation caller performance
3.2.1 DRAGEN had the best performance among individual tools
3.2.2 SURVIVOR was the optimized tool to merge callers
3.2.3 Merging all callers had higher recall than any single caller
3.2.4 The multiple limitations of insertions caused high false-negatives
3.3 Structural variation profiles in Taiwan Biobank samples
3.3.1 The optimized method to joint variants from multiple samples
3.3.2 Statistics of 164 high-quality and 1,484 Taiwanese
3.3.3 The individual-level SVs showed a higher overlapping consistency at the joint calling level
3.3.4 Transforming from sample frequency to genotype allele frequency
3.4 Functional annotation in Taiwan Biobank samples
3.4.1 Most of SVs caused effects in non-coding or intergenic regions
3.4.2 The prediction of SV pathogenicity using VEP and AnnotSV
3.4.3 The allele frequency correlation between Taiwanese and gnomAD-SV database
3.4.4 Thalassemia carrier rate and secondary finding list in Taiwan Biobank
3.5 The final release for the Taiwan Biobank SV joint call set
Chapter 4 Discussion
Figures
Tables
References
Appendix
-
dc.language.iso | en | -
dc.title | 台灣人體生物資料庫1,484人全基因體定序偵測大片段缺失與插入變異資訊 | zh_TW
dc.title | Germline large deletions and insertions in 1,484 Taiwan Biobank whole genome sequence data | en
dc.type | Thesis | -
dc.date.schoolyear | 111-2 | -
dc.description.degree | Master's | -
dc.contributor.oralexamcommittee | 陳沛隆;陳倩瑜;郭柏秀 | zh_TW
dc.contributor.oralexamcommittee | Pei-Lung Chen; Chien-Yu Chen; Po-Hsiu Kuo | en
dc.subject.keyword | 生物資訊學,結構性變異,工具表現比對,族群基因體學,台灣人體生物資料庫 | zh_TW
dc.subject.keyword | bioinformatics, structural variation, benchmarking, population genetics, Taiwan Biobank | en
dc.relation.page | 90 | -
dc.identifier.doi | 10.6342/NTU202302413 | -
dc.rights.note | Not authorized | -
dc.date.accepted | 2023-08-04 | -
dc.contributor.author-college | College of Medicine | -
dc.contributor.author-dept | Graduate Institute of Medical Genomics and Proteomics | -
Appears in Collections: Graduate Institute of Medical Genomics and Proteomics

Files in This Item:
File | Size | Format
ntu-111-2.pdf (Restricted Access) | 4.91 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
