  1. NTU Theses and Dissertations Repository
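  2. 電機資訊學院 (College of Electrical Engineering and Computer Science)
  3. 生醫電子與資訊學研究所 (Graduate Institute of Biomedical Electronics and Bioinformatics)
Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/71669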
Full metadata record
DC Field: Value [Language]
dc.contributor.advisor: 歐陽彥正
dc.contributor.author: Yu-Yu Lin [en]
dc.contributor.author: 林祐榆 [zh_TW]
dc.date.accessioned: 2021-06-17T06:06:06Z
dc.date.available: 2024-01-29
dc.date.copyright: 2019-01-29
dc.date.issued: 2018
dc.date.submitted: 2019-01-16
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/71669
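dc.description.abstract: Genotyping determines the alleles present at a variant locus, such as A/A (homozygous) or G/A (heterozygous), whereas haplotyping is the process of determining which alleles at different loci lie together on the same chromosome, correctly revealing the relationships among alleles at multiple sites. In other words, haplotyping determines how multiple heterozygous variants co-occur on the two copies of a chromosome. Haplotyping is therefore also called phasing, and it expresses the arrangement of alleles at different loci on the same chromosome. Studies have shown that haplotype phase affects gene expression, signal transduction, and protein function, which makes it highly important.
As sequencing technology advances, read-based phasing methods are receiving growing attention. Minimum error correction (MEC) is the most widely used model among read-based phasing methods; it predicts the most probable pair of haplotypes by reducing conflicts in the SNP-fragment matrix. However, researchers have commonly observed that the optimal MEC solution may not be the true pair of haplotypes. Because MEC considers all variant loci together, a large number of densely clustered errors within certain genomic regions can mislead the choice of corrections. To address this problem, we propose a method based on hierarchical assembly that corrects local conflicts progressively.
This thesis proposes a new phasing algorithm based on hierarchical assembly. The method starts from high-confidence pairs of variant loci and progressively builds complete haplotypes. Compared with other MEC-based methods, it achieves better phasing error rates on both real whole-genome short-read data sets and simulated data. The thesis compares the number of error corrections on real data with those of other methods and finds that this method also reaches solutions with fairly few corrections, which shows that although the method does not start purely from the MEC criterion, it can still find solutions close to the minimum-error solution. Finally, simulated data are used to test the method's performance under different sequencing conditions, demonstrating its applicability in practice.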
zh_TW
dc.description.abstract: Genotyping determines the alleles present at a variant locus, such as A/A (homozygous) or G/A (heterozygous), whereas haplotyping determines which alleles are co-located on the same chromosome, revealing the correct relationship between alleles at multiple loci. For example, phasing two heterozygous variants G/A and C/T may produce the haplotypes G-T (one copy of the chromosome) and A-C (the other copy of the chromosome). Phasing, an alternative name for haplotyping, thus clarifies cis relationships on chromosomes, which are essential in association research and clinical genetics.
Advances in sequencing technologies have increased the need for read-based phasing. The minimum error correction (MEC) approach is the primary model for resolving haplotypes; it works by reducing conflicts in a SNP-fragment matrix. However, the solution with the optimal MEC score is frequently observed not to be the real haplotypes, because MEC methods consider all positions together and the discordant reads in noisy regions can mislead the selection of corrections. To tackle this problem, a hierarchical assembly-based method was designed to resolve local conflicts progressively.
This thesis presents HAHap, a new phasing algorithm based on hierarchical assembly. HAHap leverages high-confidence variant pairs to build haplotypes progressively. On both real and simulated data, HAHap achieved better phasing error rates than other MEC-based methods when constructing haplotypes from whole-genome short reads. A comparison of the number of error corrections on real data with other methods shows that HAHap can predict haplotypes with a lower number of error corrections. Simulated data were used to investigate the behavior of HAHap under different sequencing conditions, highlighting the applicability of HAHap in certain situations.
en
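The MEC model named in the abstract can be illustrated with a small, self-contained sketch. This is not HAHap and not code from the thesis; it only shows, under stated assumptions, how an MEC score is computed on a SNP-fragment matrix and how an exhaustive search picks the haplotype pair with the fewest corrections. The toy matrix `FRAGMENTS`, the 0/1 allele encoding, and the function names `mismatches`, `mec_score`, and `best_mec` are all hypothetical and chosen for illustration.

```python
from itertools import product

# Hypothetical toy SNP-fragment matrix (an assumption, not thesis data):
# rows are reads, columns are heterozygous sites, 0/1 are the two alleles,
# and None marks a site the read does not cover.
FRAGMENTS = [
    (0, 0, None, None),
    (0, 0, 1, None),
    (None, 1, 0, 0),
    (1, 1, 0, None),
    (None, None, 1, 1),
    (0, 1, 1, None),  # disagrees with every other read at one site
]


def mismatches(fragment, haplotype):
    """Count covered sites where the fragment disagrees with the haplotype."""
    return sum(1 for f, h in zip(fragment, haplotype)
               if f is not None and f != h)


def mec_score(fragments, haplotype):
    """MEC cost of a haplotype and its complement.

    Each fragment is assigned to whichever of the two haplotypes needs
    fewer allele corrections, and those corrections are summed.
    """
    complement = tuple(1 - allele for allele in haplotype)
    return sum(min(mismatches(f, haplotype), mismatches(f, complement))
               for f in fragments)


def best_mec(fragments):
    """Exhaustively search all haplotypes; feasible only for tiny inputs."""
    n_sites = len(fragments[0])
    # Fix the first allele to 0 so mirror-image solutions are not repeated.
    candidates = ((0,) + rest for rest in product((0, 1), repeat=n_sites - 1))
    return min((mec_score(fragments, h), h) for h in candidates)


if __name__ == "__main__":
    score, haplotype = best_mec(FRAGMENTS)
    print("minimum corrections:", score, "haplotype:", haplotype)
```

The exhaustive search grows exponentially with the number of sites, so it is useful only as a reference for very small blocks. It also makes the abstract's caveat concrete: the score counts corrections over all sites at once, so a cluster of erroneous reads in one region can change which haplotype pair minimizes the total.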
dc.description.provenance: Made available in DSpace on 2021-06-17T06:06:06Z (GMT). No. of bitstreams: 1
ntu-107-D99945022-1.pdf: 2084119 bytes, checksum: 6bffb398b873f0416d21d290863fa720 (MD5)
Previous issue date: 2018
en
dc.description.tableofcontents:
Doctoral Dissertation Oral Examination Committee Certification
ACKNOWLEDGMENTS
Chinese Abstract
ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1 Introduction
1.1 Experimental methods
1.2 Computational methods
1.3 Thesis structure
CHAPTER 2 Related Works
2.1 SNP-fragment matrix
2.2 Minimum error correction model
2.3 Challenges of read-based phasing
CHAPTER 3 Methods
3.1 Haplotype-informative reads and variant blocks
3.2 Hierarchical assembly
3.3 Multinomial distribution metric
3.4 Local phasing using MEC
3.4.1 Embedded merging
3.4.2 Only ambiguous variant pairs remain
3.5 Examples of choices between heuristic and local MEC-based search
CHAPTER 4 Experiment Design
4.1 Evaluation measurement
4.2 Real Datasets
4.3 Simulated Datasets
CHAPTER 5 Experimental Results
5.1 Evaluation on real data
5.2 Evaluation on simulated data
5.3 Evaluation on sequencing skewness
5.4 Comparison of number of error corrections
5.5 Comparison of running time
CHAPTER 6 Application on ABO Blood Type
CHAPTER 7 Discussions
CHAPTER 8 Conclusion and Future Works
REFERENCES
Appendix
dc.language.iso: en
dc.subject: 單倍體分型 [zh_TW]
dc.subject: 短序列單倍體定相方法 [zh_TW]
dc.subject: 階層式組裝 [zh_TW]
dc.subject: 機率分數 [zh_TW]
dc.subject: 最小錯誤修正法 [zh_TW]
dc.subject: Probability score [en]
dc.subject: Read-based phasing [en]
dc.subject: Hierarchical assembly [en]
dc.subject: Minimum Error Correction [en]
dc.subject: Haplotype [en]
dc.title: 基於階層式組裝之短序列單倍體定相方法 [zh_TW]
dc.title: Read-based haplotyping method using hierarchical assembly [en]
dc.type: Thesis
dc.date.schoolyear: 107-1
dc.description.degree: Doctoral (博士)
dc.contributor.coadvisor: 陳倩瑜
dc.contributor.oralexamcommittee: 陳沛隆, 趙坤茂, 蔡懷寬
dc.subject.keyword: 單倍體分型, 短序列單倍體定相方法, 階層式組裝, 機率分數, 最小錯誤修正法 [zh_TW]
dc.subject.keyword: Haplotype, Read-based phasing, Hierarchical assembly, Probability score, Minimum Error Correction [en]
dc.relation.page: 59
dc.identifier.doi: 10.6342/NTU201900074
dc.rights.note: Paid authorization (有償授權)
dc.date.accepted: 2019-01-16
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science) [zh_TW]
dc.contributor.author-dept: 生醫電子與資訊學研究所 (Graduate Institute of Biomedical Electronics and Bioinformatics) [zh_TW]
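Appears in collections: 生醫電子與資訊學研究所 (Graduate Institute of Biomedical Electronics and Bioinformatics)

Files in this item:
File: ntu-107-1.pdf (not authorized for public access)
Size: 2.04 MB
Format: Adobe PDF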