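  1. NTU Theses and Dissertations Repository
  2. College of Electrical Engineering and Computer Science
  3. Graduate Institute of Biomedical Electronics and Bioinformatics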
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/72441
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 趙坤茂 | -
dc.contributor.author | Yi-Chen Huang | en
dc.contributor.author | 黃亦晨 | zh_TW
dc.date.accessioned | 2021-06-17T06:59:16Z | -
dc.date.available | 2019-08-07 | -
dc.date.copyright | 2019-08-07 | -
dc.date.issued | 2019 | -
dc.date.submitted | 2019-08-05 | -
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/72441 | -
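dc.description.abstract | Because of limits on read length and assembly techniques, next-generation sequencing (NGS) leaves gaps within or between the sequences of a draft genome, that is, stretches of unknown sequence that have not yet been determined. Resolving these gaps makes it possible to find more genes in the assembly, which both raises the gene content of the assembly and yields a more complete genome assembly. This thesis uses the technology developed by 10x Genomics, which combines barcode tagging with short-read sequencing to obtain long-fragment information (linked reads). It attempts to exploit the characteristics of this data to cope with the traditionally difficult problem of repeats, filling and repairing the gaps in a genome assembly by resolving their sequences. Experimental results show that the gap regions in the Japanese eel assembly were reduced from 11.8% to 4.5% and that the gene content of the assembly increased substantially. | zh_TW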
dc.description.abstract | De novo assembly of short reads followed by scaffolding often produces many gaps, because there is too little information to determine the sequences between contigs, and this leaves the genome draft incomplete. Here, we present a protocol for filling gaps in a genome that uses the linked-read information contained in barcoded short reads. We validate the protocol on the Japanese eel genome and show how the gaps can be filled using 10x Chromium linked reads with our local assembly methods. We expect that this gap-filling protocol can be used to obtain more complete, higher-quality scaffolds in draft genome assemblies. | en
dc.description.provenance | Made available in DSpace on 2021-06-17T06:59:16Z (GMT). No. of bitstreams: 1. ntu-108-R06945025-1.pdf: 1529195 bytes, checksum: a27d29f15525e44dbb1cfb401604ffc1 (MD5). Previous issue date: 2019 | en
dc.description.tableofcontents |
Oral Examination Committee Certification
Acknowledgements
Chinese Abstract
ABSTRACT
CONTENTS
LIST OF FIGURES
LIST OF TABLES
Chapter 1 Introduction
Chapter 2 Methods and Materials
  2.1 Basic Idea
  2.2 Strategies of contig construction
    2.2.1 Scaffold-based bin algorithm
    2.2.2 Gap-based bin algorithm
    2.2.3 Window-based bin algorithm
    2.2.4 Alternative strategy: De novo assembly
  2.3 Gap Filling Pipeline
    2.3.1 Data Preprocessing
    2.3.2 Contig Construction
    2.3.3 Gap Filling Algorithm
    2.3.4 Data Sources
Chapter 3 Results
  3.1 Barcoded Linked Reads Preprocessing
  3.2 Test the Gap Filling on Four-scaffold Testset
    3.2.1 Barcode Selection
    3.2.2 Effects of mark duplicate
    3.2.3 Partition Algorithms
  3.3 Results on L95 Set of Eel Assembly
    3.3.1 Gap Filling Result
    3.3.2 Biological Quality Assessment
Chapter 4 Discussion
Chapter 5 Conclusion
REFERENCE
dc.language.iso | en | -
dc.subject | Genome gaps | zh_TW
dc.subject | GemCode platform | zh_TW
dc.subject | Barcode sequences | zh_TW
dc.subject | Gap filling | zh_TW
dc.subject | Genome assembly | zh_TW
dc.subject | de novo assembly | en
dc.subject | Gap filling | en
dc.subject | Gap closure | en
dc.subject | Linked reads | en
dc.subject | Next-generation sequencing | en
dc.subject | 10x Genomics Chromium | en
dc.title | Filling the Missing Gap Regions in Genome Assemblies with Barcoded Long Fragments | zh_TW
dc.title | Gap-Centered Local Assembly: Gap Filling in Genome Drafts with Linked Reads | en
dc.type | Thesis | -
dc.date.schoolyear | 107-2 | -
dc.description.degree | Master's | -
dc.contributor.oralexamcommittee | 何建明, 林仲彥 | -
dc.subject.keyword | Genome assembly, Genome gaps, Gap filling, GemCode platform, Barcode sequences | zh_TW
dc.subject.keyword | 10x Genomics Chromium, Gap filling, Gap closure, Linked reads, Next-generation sequencing, de novo assembly | en
dc.relation.page | 41 | -
dc.identifier.doi | 10.6342/NTU201902537 | -
dc.rights.note | Paid authorization | -
dc.date.accepted | 2019-08-05 | -
dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW
dc.contributor.author-dept | Graduate Institute of Biomedical Electronics and Bioinformatics | zh_TW
Appears in Collections: Graduate Institute of Biomedical Electronics and Bioinformatics
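
The abstract in this record reports the share of gap regions in the Japanese eel assembly falling from 11.8% to 4.5%. The record does not say how the thesis measured that share, so the following is only a minimal sketch of one common way to do it, assuming gaps are encoded as runs of `N` in the scaffold sequences (the usual convention in draft assemblies). The function names `gap_intervals` and `gap_fraction` and the toy scaffolds are hypothetical and are not taken from the thesis.

```python
import re


def gap_intervals(seq):
    """Return half-open (start, end) intervals for every run of N/n in seq."""
    return [m.span() for m in re.finditer(r"[Nn]+", seq)]


def gap_fraction(scaffolds):
    """Fraction of all bases, across all scaffolds, that fall inside gaps.

    `scaffolds` maps a scaffold name to its sequence string.
    Returns 0.0 for an empty input to avoid dividing by zero.
    """
    total = sum(len(seq) for seq in scaffolds.values())
    gapped = sum(
        end - start
        for seq in scaffolds.values()
        for start, end in gap_intervals(seq)
    )
    return gapped / total if total else 0.0


# Toy data (assumption): two tiny scaffolds, one with a 4-base gap.
scaffolds = {
    "scaffold_1": "ACGTNNNNACGT",
    "scaffold_2": "ACGTACGT",
}

print(gap_intervals(scaffolds["scaffold_1"]))  # [(4, 8)]
print(f"{gap_fraction(scaffolds):.1%}")        # 20.0%
```

Running the same calculation on an assembly before and after gap filling would give a before/after pair comparable in kind to the 11.8% and 4.5% figures, although the thesis may define its gap regions differently.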

Files in This Item:
File | Size | Format
ntu-108-1.pdf (Restricted Access) | 1.49 MB | Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
