Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/71669
Title: | 基於階層式組裝之短序列單倍體定相方法 Read-based haplotyping method using hierarchical assembly |
Author: | Yu-Yu Lin 林祐榆 |
Advisor: | 歐陽彥正 |
Co-advisor: | 陳倩瑜 |
Keywords: | Haplotype, Read-based phasing, Hierarchical assembly, Probability score, Minimum Error Correction |
Publication year: | 2018 |
Degree: | Doctoral |
Abstract: | Genotyping determines which alleles are present at a variant locus, such as A/A (homozygous) or G/A (heterozygous). Haplotyping determines which alleles at different loci are co-located on the same chromosome, revealing the correct relationship between alleles at multiple loci. For example, phasing two heterozygous variants G/A and C/T may produce the haplotypes G-T (one copy of the chromosome) and A-C (the other copy). Haplotyping is therefore also called phasing: it clarifies the cis relationships between alleles on a chromosome. Studies indicate that haplotype phase affects gene expression, signal transduction, and protein function, and phase information is essential in association research and clinical genetics.
As sequencing technologies advance, read-based phasing has drawn growing attention. The minimum error correction (MEC) approach is the primary model for read-based phasing; it resolves haplotypes by reducing the conflicts in a SNP-fragment matrix. However, it is frequently observed that the solution with the optimal MEC score might not be the true haplotypes, because MEC methods consider all variant loci together, and numerous, dense errors in a noisy genomic region can mislead the selection of corrections. To tackle this problem, a hierarchical assembly-based method was designed to resolve local conflicts progressively.
This thesis presents HAHap, a new phasing algorithm based on hierarchical assembly. HAHap starts from high-confidence variant pairs and builds complete haplotypes progressively. Compared with other MEC-based methods, HAHap achieved better phasing error rates on both real whole-genome short-read sequencing data and simulated data. The study also compared the number of error corrections on real data with that of other methods and found that HAHap predicts haplotypes with a fairly low number of corrections; although HAHap is not driven purely by the MEC criterion, it still finds solutions close to the MEC optimum. Finally, simulated data were used to investigate the behavior of HAHap under different sequencing conditions, highlighting the applicability of HAHap in practice. |
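To make the MEC model mentioned in the abstract concrete, the following is a minimal sketch of how an MEC score can be computed for a diploid sample. It is not HAHap and not taken from the thesis: the 0/1 allele encoding, the dictionary representation of the SNP-fragment matrix, the toy reads, and the function names `mec_score` and `best_mec` are all assumptions made for illustration, and the exhaustive search is only feasible for a handful of sites.

```python
from itertools import product


def mec_score(fragments, hap):
    """Count the allele corrections needed so every fragment matches
    either `hap` or its complement (the second haplotype).

    fragments: list of dicts mapping site index -> observed allele (0 or 1)
    hap: tuple of 0/1 alleles, one per heterozygous site
    """
    comp = tuple(1 - a for a in hap)  # the other chromosome copy
    total = 0
    for frag in fragments:
        d_hap = sum(1 for site, allele in frag.items() if hap[site] != allele)
        d_comp = sum(1 for site, allele in frag.items() if comp[site] != allele)
        # Each fragment is assigned to whichever haplotype it conflicts with least.
        total += min(d_hap, d_comp)
    return total


def best_mec(fragments, n_sites):
    """Exhaustively find the haplotype with the lowest MEC score.

    The first site is fixed to 0 because a haplotype and its
    complement describe the same phasing.
    """
    best_score, best_hap = None, None
    for rest in product((0, 1), repeat=n_sites - 1):
        hap = (0,) + rest
        score = mec_score(fragments, hap)
        if best_score is None or score < best_score:
            best_score, best_hap = score, hap
    return best_score, best_hap


# Hypothetical toy reads over three heterozygous sites; the last read
# conflicts with the others at sites 0 and 1.
fragments = [
    {0: 0, 1: 0},
    {0: 0, 1: 0},
    {0: 1, 1: 1},
    {1: 0, 2: 1},
    {1: 1, 2: 0},
    {0: 0, 1: 1},
]

print(best_mec(fragments, 3))  # (1, (0, 0, 1))
```

In this toy input one correction suffices, and the conflicting read is outvoted. The abstract's concern is the opposite situation, where errors are dense within a region and the globally minimal set of corrections no longer corresponds to the true haplotypes.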
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/71669 |
DOI: | 10.6342/NTU201900074 |
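Full-text license: | Paid license |
Appears in collections: | Graduate Institute of Biomedical Electronics and Bioinformatics |
Files in this item:
File | Size | Format | |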
---|---|---|---|
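ntu-107-1.pdf Not currently authorized for public access | 2.04 MB | Adobe PDF |
Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.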