Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/72441

Full metadata record

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 趙坤茂 | |
| dc.contributor.author | Yi-Chen Huang | en |
| dc.contributor.author | 黃亦晨 | zh_TW |
| dc.date.accessioned | 2021-06-17T06:59:16Z | - |
| dc.date.available | 2019-08-07 | |
| dc.date.copyright | 2019-08-07 | |
| dc.date.issued | 2019 | |
| dc.date.submitted | 2019-08-05 | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/72441 | - |
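| dc.description.abstract | Because of limits on read length and assembly techniques in next-generation sequencing (NGS), draft genomes contain gaps within or between sequences, that is, stretches of unknown, unconfirmed sequence. Resolving these gaps recovers more genes in the assembly, which raises its gene content and yields a more complete genome assembly. This thesis uses the technology developed by 10x Genomics, which combines barcode tagging with short-read sequencing to obtain long-fragment information (linked reads). It exploits the characteristics of these data to address the long-standing problem of repeats, resolving the gap sequences in a genome assembly in order to fill and repair them. Experimental results show that the gap regions in the Japanese eel assembly were reduced from 11.8% to 4.5%, and that the gene content of the assembly increased substantially. | zh_TW |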
| dc.description.abstract | De novo assembly of short reads followed by scaffolding often produces many gaps, because there is too little information to determine the sequences between contigs, and this leaves genome drafts incomplete. Here, we present a protocol for filling gaps in a genome that uses the linked-read information contained in barcoded short reads. We validate the protocol on the Japanese eel genome and show how gaps can be filled using 10x Chromium linked reads with our local assembly methods. We expect this gap-filling protocol to help produce more complete, higher-quality scaffolds in draft genome assemblies. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-17T06:59:16Z (GMT). No. of bitstreams: 1 ntu-108-R06945025-1.pdf: 1529195 bytes, checksum: a27d29f15525e44dbb1cfb401604ffc1 (MD5) Previous issue date: 2019 | en |
| dc.description.tableofcontents | Oral Examination Committee Certification; Acknowledgements; Chinese Abstract; ABSTRACT; CONTENTS; LIST OF FIGURES; LIST OF TABLES; Chapter 1 Introduction; Chapter 2 Methods and Materials; 2.1 Basic Idea; 2.2 Strategies of contig construction; 2.2.1 Scaffold-based bin algorithm; 2.2.2 Gap-based bin algorithm; 2.2.3 Window-based bin algorithm; 2.2.4 Alternative strategy: De novo assembly; 2.3 Gap Filling Pipeline; 2.3.1 Data Preprocessing; 2.3.2 Contig Construction; 2.3.3 Gap Filling Algorithm; 2.3.4 Data Sources; Chapter 3 Results; 3.1 Barcoded Linked Reads Preprocessing; 3.2 Test the Gap Filling on Four-scaffold Testset; 3.2.1 Barcode Selection; 3.2.2 Effects of mark duplicate; 3.2.3 Partition Algorithms; 3.3 Results on L95 Set of Eel Assembly; 3.3.1 Gap Filling Result; 3.3.2 Biological Quality Assessment; Chapter 4 Discussion; Chapter 5 Conclusion; REFERENCE | |
| dc.language.iso | en | |
| dc.subject | Genome gaps | zh_TW |
| dc.subject | GemCode platform | zh_TW |
| dc.subject | Barcode sequences | zh_TW |
| dc.subject | Gap filling | zh_TW |
| dc.subject | Genome assembly | zh_TW |
| dc.subject | de novo assembly | en |
| dc.subject | Gap filling | en |
| dc.subject | Gap closure | en |
| dc.subject | Linked reads | en |
| dc.subject | Next-generation sequencing | en |
| dc.subject | 10x Genomics Chromium | en |
| dc.title | 針對基因組裝中欠缺的間隙區域以標記性長片段做填補 | zh_TW |
| dc.title | Gap-Centered Local Assembly: Gap Filling in Genome Drafts with Linked Reads | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 107-2 | |
| dc.description.degree | Master's | |
| dc.contributor.oralexamcommittee | 何建明,林仲彥 | |
| dc.subject.keyword | Genome assembly,Genome gaps,Gap filling,GemCode platform,Barcode sequences | zh_TW |
| dc.subject.keyword | 10x Genomics Chromium,Gap filling,Gap closure,Linked reads,Next-generation sequencing,de novo assembly | en |
| dc.relation.page | 41 | |
| dc.identifier.doi | 10.6342/NTU201902537 | |
| dc.rights.note | Paid authorization | |
| dc.date.accepted | 2019-08-05 | |
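| dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW |
| dc.contributor.author-dept | Graduate Institute of Biomedical Electronics and Bioinformatics | zh_TW |
| Appears in Collections: | Graduate Institute of Biomedical Electronics and Bioinformatics | |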
Files in This Item:
| File | Size | Format |
|---|---|---|
| ntu-108-1.pdf (Restricted Access) | 1.49 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
