Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/64660
Full metadata record
DC Field: Value (Language)
dc.contributor.advisor: 劉力瑜 (Li-Yu Liu)
dc.contributor.author: Ying-Tsui Wang (en)
dc.contributor.author: 王瀅翠 (zh_TW)
dc.date.accessioned: 2021-06-16T22:57:02Z
dc.date.available: 2017-08-15
dc.date.copyright: 2012-08-15
dc.date.issued: 2012
dc.date.submitted: 2012-08-09
dc.identifier.citation:
1.Li B, Xia Q, Lu C, Zhou Z, Xiang Z: Analysis on frequency and density of microsatellites in coding sequences of several eukaryotic genomes. Genomics Proteomics Bioinformatics 2004, 2(1):24-31.
2.Morgante M, Hanafey M, Powell W: Microsatellites are preferentially associated with nonrepetitive DNA in plant genomes. Nature Genetics 2002, 30(2):194-200.
3.McCouch SR, Chen X, Panaud O, Temnykh S, Xu Y, Cho YG, Huang N, Ishii T, Blair M: Microsatellite marker development, mapping and applications in rice genetics and breeding. Plant Molecular Biology 1997, 35(1):89-99.
4.Zane L, Bargelloni L, Patarnello T: Strategies for microsatellite isolation: a review. Molecular ecology 2002, 11(1):1-16.
5.Abdelkrim J, Robertson BC, Stanton JAL, Gemmell NJ: Fast, cost-effective development of species-specific microsatellite markers by genomic sequencing. BioTechniques 2009, 46(3):185-192.
6.Zalapa JE, Cuevas H, Zhu H, Steffan S, Senalik D, Zeldin E, McCown B, Harbut R, Simon P: Using next-generation sequencing approaches to isolate simple sequence repeat (SSR) loci in the plant sciences. American Journal of Botany 2012, 99(2):193-208.
7.Kircher M, Kelso J: High throughput DNA sequencing-concepts and limitations. BioEssays 2010, 32(6):524-536.
8.Shendure J, Ji H: Next-generation DNA sequencing. Nat Biotechnol 2008, 26(10):1135-1145.
9.Metzker ML: Sequencing technologies-the next generation. Nature Reviews Genetics 2009, 11(1):31-46.
10.Mardis ER: The impact of next-generation sequencing technology on genetics. Trends in Genetics 2008, 24(3):133-141.
11.Rasmussen D, Noor M: What can you do with 0.1× genome coverage? A case study based on a genome survey of the scuttle fly Megaselia scalaris (Phoridae). Bmc Genomics 2009, 10(1):382.
12.Wicker T, Narechania A, Sabot F, Stein J, Vu G, Graner A, Ware D, Stein N: Low-pass shotgun sequencing of the barley genome facilitates rapid identification of genes, conserved non-coding sequences and novel repeats. Bmc Genomics 2008, 9(1):518.
13.Waring M, Britten RJ: Nucleotide sequence repetition: a rapidly reassociating fraction of mouse DNA. Science 1966, 154(3750):791.
14.Lysholm F, Andersson B, Persson B: An efficient simulator of 454 data using configurable statistical models. BMC research notes 2011, 4(1):449.
15.Benson G: Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Research 1999, 27(2):573.
16.Peterson DG, Wessler SR, Paterson AH: Efficient capture of unique sequences from eukaryotic genomes. Trends in Genetics 2002, 18(11):547-550.
17.Britten RJ, Graham DE, Neufeld BR: Analysis of repeating DNA sequences by reassociation. Methods in enzymology 1974, 29:363-418.
18.Britten RJ, Kohne DE: Repeated sequences in DNA. 1968.
19.Peterson DG, Pearson WR, Stack SM: Characterization of the tomato (Lycopersicon esculentum) genome using in vitro and in situ DNA reassociation. Genome 1998, 41(3):346-356.
20.Pearson W, Davidson E, Britten R: A program for least squares analysis of reassociation and hybridization data. Nucleic Acids Research 1977, 4(6):1727.
21.Paterson AH: Leafing through the genomes of our major crop plants: strategies for capturing unique information. Nature Reviews Genetics 2006, 7(3):174-184.
22.Gupta VS, Gadre S, Ranjekar P: Novel DNA sequence organization in rice genome. Biochimica et Biophysica Acta (BBA)-Nucleic Acids and Protein Synthesis 1981, 656(2):147-154.
23.Smit A, Hubley R, Green P: RepeatMasker Open-3.0. 2004.
24.Mdust [http://compbio.dfci.harvard.edu/tgi/software/].
25.Price AL, Jones NC, Pevzner PA: De novo identification of repeat families in large genomes. Bioinformatics 2005, 21(suppl 1):i351-i358.
26.McGinnis S, Madden TL: BLAST: at the core of a powerful and diverse set of sequence analysis tools. Nucleic Acids Research 2004, 32(suppl 2):W20-W25.
27.Goldberg RB: DNA sequence organization in the soybean plant. Biochemical genetics 1978, 16(1):45-68.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/64660
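dc.description.abstract: Microsatellites, also known as simple sequence repeats (SSRs), are sequences in which a unit of 1 to 6 nucleotides is repeated in tandem, and they are distributed throughout the genomes of all species. Because they are widely distributed and highly polymorphic, SSRs are often developed into molecular markers for use in many kinds of research. In recent years, the emergence and development of next generation sequencing technology has changed the traditional ways of finding SSRs. Its high throughput and relatively low cost also give scientists a new opportunity to find more previously undiscovered SSRs. However, in a pilot experiment with a limited budget, funding is usually sufficient only for low-coverage sequencing, and this constraint becomes more pronounced as the genome of the species gets larger.
In this study, we use simulation to investigate, under low-coverage sequencing, the relationship between the number of SSRs and the sequencing coverage for species that have not yet been sequenced. The simulation has two steps. First, we use the whole rice genome to build three databases, then use rice subsequences from the three databases to assemble a simulated genome for the species of interest, such that the simulated genome has a DNA complexity similar to that of the original species' genome. Next, we use the simulator 454sim to simulate sequencing results from the 454 platform at different sequencing coverages and search for SSRs. The results show that the number of SSRs increases as the sequencing coverage increases. More importantly, this method lets us use simulation to estimate the number of SSRs in an unsequenced species, which helps us allocate the budget in advance.
(translated from zh_TW)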
dc.description.abstract: Microsatellites, or simple sequence repeats (SSRs), are tandem repeats of 1- to 6-nucleotide motifs distributed across genomes. Because of their genomic abundance and high level of polymorphism, SSRs are often developed into molecular markers for use in a wide range of research. In recent years, rapidly developing next generation sequencing technology (NGST) has changed the way SSRs are mined. NGST not only offers higher speed and lower cost but also provides the opportunity to discover novel SSRs. However, in a pilot study the budget may be limited, and one may only be able to afford low-coverage sequencing of the genome of interest. The limitation becomes more severe as the genome size increases.
In this study, we used simulation to investigate the relationship between the number of mined SSRs and the sequencing depth, at low coverage, for a genome whose sequence is not yet available. The simulation had two parts. First, we partitioned the whole rice genome to establish three databases. Second, we simulated a genome of similar complexity to the genome of interest by recombining known rice genome subsequences. We then mimicked 454 sequencing results at different coverages using 454sim and mined SSRs from the simulated reads. The results showed that the number of mined SSRs increased as the sequencing depth increased. More importantly, this procedure provides a means to estimate the number of mined SSRs without a whole genome sequence and hence helps in setting a budget in advance.
(en)
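The relationship the abstract describes, between sequencing coverage and the number of SSRs recovered, can be illustrated with a minimal Python sketch. It is not the thesis's pipeline: the thesis uses 454sim for read simulation, whereas this sketch swaps in uniformly placed, error-free, fixed-length reads, and it finds perfect repeats with a regular expression. The function names, the `MIN_REPEATS` thresholds, and the toy sequence are all hypothetical choices made for illustration. Coverage is taken in the usual sense, number of reads × read length / genome length.

```python
import math
import random
import re

# Hypothetical thresholds (not from the thesis): minimum number of tandem
# copies required for each motif length from 1 to 6 nucleotides.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats."""
    hits = []
    for k, n in min_repeats.items():
        # A k-nt motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "AA").
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


def reads_for_coverage(genome_len, read_len, coverage):
    """Number of reads needed so that reads * read_len / genome_len >= coverage."""
    return math.ceil(coverage * genome_len / read_len)


def count_recovered_ssrs(genome, read_len, coverage, rng):
    """Count SSR loci fully contained in at least one simulated read.

    Assumption: reads are error-free and start at uniformly random positions,
    which is far simpler than the 454 error model used by 454sim.
    """
    loci = [(s, s + len(motif) * copies) for s, motif, copies in find_ssrs(genome)]
    found = set()
    for _ in range(reads_for_coverage(len(genome), read_len, coverage)):
        a = rng.randrange(len(genome) - read_len + 1)
        b = a + read_len
        found.update(i for i, (s, e) in enumerate(loci) if a <= s and e <= b)
    return len(found)


# Toy sequence with three planted SSRs: (AT)7, (A)12 and (CAG)5.
toy = "GGC" + "AT" * 7 + "CCG" + "A" * 12 + "TTC" + "CAG" * 5 + "T"
ssrs = find_ssrs(toy)
```

Calling `count_recovered_ssrs(genome, read_len, coverage, random.Random(0))` over a range of coverage values on a longer sequence gives a curve of recovered SSR loci against coverage, which is the kind of quantity the study estimates. Because the reads here are error-free, the counts are likely optimistic compared with a realistic 454 simulation.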
dc.description.provenance: Made available in DSpace on 2021-06-16T22:57:02Z (GMT). No. of bitstreams: 1
ntu-101-R99621205-1.pdf: 8361046 bytes, checksum: 081301d934eb75327f686a986b75c4c4 (MD5)
Previous issue date: 2012
(en)
dc.description.tableofcontents:
Oral Examination Committee Approval Certificate
Acknowledgements
Abstract (Chinese)
Abstract
Chapter 1 INTRODUCTION
Chapter 2 MATERIALS AND METHOD
2.1 Database establishment
2.1.1 The approach to establish three databases
2.2 Sequencing data simulation
2.2.1 Genome simulation
2.2.2 Sequencing simulation
2.2.3 SSR mining
Chapter 3 RESULT AND DISCUSSION
3.1 Database establishment
3.2 Sequencing data simulation
3.2.1 Similarity between true and simulated genome
3.2.2 Simulation with 454sim
3.2.3 SSR mining-method 1
3.2.4 SSR mining-method 2
3.3 Technical issue
3.3.1 Two additional programs
3.3.2 Five main programs
Chapter 4 CONCLUSION
4.1 Summary
4.2 Future work
Reference
Appendix
A.1
A.2
A.3
A.4
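dc.language.iso: en
dc.subject: 二世代定序技術 [next generation sequencing technology] (zh_TW)
dc.subject: 簡單重複序列探勘 [SSR mining] (zh_TW)
dc.subject: 低倍率定序 [low coverage sequencing] (zh_TW)
dc.subject: 模擬 [simulation] (zh_TW)
dc.subject: 454sim (zh_TW)
dc.subject: low coverage sequencing (en)
dc.subject: SSR mining (en)
dc.subject: simulation (en)
dc.subject: 454sim (en)
dc.subject: next generation sequencing technology (en)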
dc.title: 模擬研究定序覆蓋率對探勘簡單重複序列的影響 (zh_TW)
dc.title: The Effect of Sequencing Coverage on Mining Simple Sequence Repeats by Simulation (en)
dc.type: Thesis
dc.date.schoolyear: 100-2
dc.description.degree: Master's (碩士)
dc.contributor.oralexamcommittee: 胡凱康 (Kae-Kang Hwu), 林彥蓉 (Yann-Rong Lin)
dc.subject.keyword: 簡單重複序列探勘 [SSR mining], 低倍率定序 [low coverage sequencing], 模擬 [simulation], 454sim, 二世代定序技術 [next generation sequencing technology] (zh_TW)
dc.subject.keyword: SSR mining, low coverage sequencing, simulation, 454sim, next generation sequencing technology (en)
dc.relation.page: 42
dc.rights.note: Paid license (有償授權)
dc.date.accepted: 2012-08-10
dc.contributor.author-college: College of Bioresources and Agriculture (生物資源暨農學院) (zh_TW)
dc.contributor.author-dept: Graduate Institute of Agronomy (農藝學研究所) (zh_TW)
Appears in collection: Department of Agronomy (農藝學系)

Files in this item:
File: ntu-101-1.pdf (not authorized for public access)
Size: 8.17 MB
Format: Adobe PDF

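Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.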
