請用此 Handle URI 來引用此文件:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/34277完整後設資料紀錄
| DC 欄位 | 值 | 語言 |
|---|---|---|
| dc.contributor.advisor | 趙坤茂(Kun-Mao Chao) | |
| dc.contributor.author | Yao-Ting Huang | en |
| dc.contributor.author | 黃耀廷 | zh_TW |
| dc.date.accessioned | 2021-06-13T06:01:02Z | - |
| dc.date.available | 2006-06-27 | |
| dc.date.copyright | 2006-06-27 | |
| dc.date.issued | 2006 | |
| dc.date.submitted | 2006-06-23 | |
| dc.identifier.citation | [1] Altshuler, D., Brooks, L.D., Chakravarti, A., Collins, F.S., Daly, M.J., and Donnelly, P. A
haplotype map of the human genome. Nature, 437:1299–1320, 2005. [2] Ao, S.I., Yip, K., Ng, M., Cheung, D., Fong, P.Y., Melhado, I., and Sham,P.C. CLUSTAG: hierarchical clustering and graph methods for selecting tag SNPs. Bioinformatics, 21(8):1735– 1736, 2004. [3] Bafna, V., Gusfield, D., Lancia, G., and Yooseph, S. Haplotyping as perfect phylogeny: a direct approach. Journal of Computational Biology, 10:323–340, 2003. [4] Bafna, V., Halld’orsson, B.V., Schwartz, R., Clark, A.G., Istrail, S. Haplotypes and informative SNP selection algorithms: don’t block out information. In Proc. RECOMB’03, pages 19–27, 2003. [5] Brown, D., and Harrower I. A new integer programming formulation for the pure parsimony problem in haplotype analysis. In Proc. WABI’04, 2004. [6] Carlson, C.S., Eberle, M.A., Rieder, M.J., Yi, Q., Kruglyak, L., and Nickerson, D.A. Selecting a maximally informative set of single-nucleotide polymorphisms for association analyses using linkage disequilibrium. Am. J. Hum. Genet., 74:106–120, 2004. [7] Chang, C.C. and Lin, C.J. LIBSVM: a library for support vector machines. Software available at http://www.csie.ntu.edu.tw/?cjlin/libsvm, 2001. [8] Chang, C.-J., Huang, Y.-T., and Chao, K.-M. A greedier appraoch for finding tag SNPs. Bioinformatics, 22: 685-691, 2006. [9] Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C. Introduction to algorithms, The MIT Press, 2001. [10] Crawfod, D.C. and Nickerson, D.A. Definition and clinical importance of haplotypes. Annu. Rev. Med., 56:303–320, 2005. [11] Daly, M.J., Rioux, J.D., Schaffner, S.F., Hudson, T.J., and Lander, E.S. High-resolution haplotype structure in the human genome. Nat Genet, 29(2):229–232, 2001. [12] de Bakker, P.I., Yelensky, R., Pe’er, I., Gabriel, S.B., Daly, M.J., and Altshuler, D. Efficiency and power in genetic association studies. Nat. Genet., pages 1217–1223, 2005. [13] Douglas, J.A., Boehnke, M., Gillanders, E., Trent, J.M., and Gruber, S.B. Experimentallyderived haplotypes substantially increase the efficiency of linkage disequilibrium studies. Nat Genet, 28(4):361–364, 2001. [14] Drysdale, C., McGraw, D., Stack, C., Stephens, J., Judson, R., et al: Complex promoter and coding region ‾2-adrenergic receptor haplotypes alter receptor expression and predict in vivo responsiveness. Proceeding of National Academy of Sciences of USA, 97:10483–10488, 2000. [15] Eskin, E., and Halperin, E. Large scale recovery of haplotypes from genotype data using imperfect phylogeny. In Proc. RECOMB’03, pages 104–113, 2003. [16] Even, G., Naor, J., Schieber, B., and Sudan, M. Approximating minimum feedback sets and multicuts in directed graphs. Algorithmica, 20:151–174, 1998. [17] Excoffier, L., and Slatkin, M. Maximum-likelihood estimation of molecular haplotype frequencies in a diploid population. Mol. Biol. Evol., 12:921–927, 1995. [18] Forsgren, A., Gill, P.E., and Wright, M.H. Interior methods for nonlinear optimization. SIAM Rev., 44:525–597, 2002. [19] Gabriel, S.B., Schaffner, S.F., Nguyen, H., Moore, J.M., Roy, J., Blumenstiel, B., Higgins, J., DeFelice, M., Lochner, A., Faggart, M., Liu-Cordero, S.N., Rotimi, C., Adeyemo, A., Cooper, R., Ward, R., Lander, E.S., Daly, M.J., and Altshuler, D. The structure of haplotype blocks in the human genome. Science, 296(5576):2225–2229, 2002. [20] Garey, M.R., and Johnson, D.S. Computers and intractability, Freeman, New York, 1979. [21] Gusfield, D. Inference of haplotypes from samples of diploid populations: complexity and algorithms. J. Comp. Biol., 8:305–323, 2001. [22] Gusfield, D. Haplotyping as perfect phylogeny: conceptual framework and efficient solutions. In Proc. RECOMB’02, pages 166–175, 2002. [23] Gusfield, D. Haplotyping by pure parsimony. In Proc. CPM’03, Lecture Notes in Computer Science, 2676:144–155, 2003. [24] Halld’orsson, B.V., Bafna, V., Lippert, R., Schwartz, R., Vega, F.M., Clark, A.G., and Istrail, S. Optimal haplotype block-free selection of tagging SNPs for genome-wide association studies. Genome Research, pages 1633–1640, 2004. [25] Halperin, E. and Eskin, E. Haplotype reconstruction from genotype data using imperfect phylogeny. Bioinformatics, 2004. [26] Helmuth, L. Genome research: map of the human genome 3.0. Science, 293(5530):583–585, 2001. [27] Hinds, D.A., Stuve, L.L., Nilsen, G.B., Halperin, E., Eskin, E., Ballinger, D.G., Frazer, K.A., Cox, D.R. Whole-genome patterns of common DNA variation in three human populations. Science, 307:1072–1079, 2005. [28] Houchbaum, D.S. Approximation algorithms for the set covering and vertex cover problems. SIAM J. Comp., 11:555-556, 1982. [29] Hu, N., Wang, C., Hu, Y., Yang, H.H., Giffen, C., Tang, Z.-Z., Han, X.-Y., Goldstein, A.M., Emmert, M.R., Buetow, K.H., and Taylor, P.R., and Lee, M.P. Genome-wide asspciation study in esophageal cancer using genechip mapping 10K array. Cancer Research, 65(7):2542–2546, 2005. [30] Huang, Y.-T., Zhang, K., Chen, T., and Chao, K.-M. Approximation algorithms for the selection of robust tag SNPs. In Proc. WABI’04, pages 278–289, 2004. [31] Huang, Y.-T., Zhang, K., Chen, T., and Chao, K.-M. Selecting additional tag SNPs for tolerating missing data in genotyping. BMC Bioinformatics, 6:263, 2005. [32] Huang, Y.-T., Chao, K.-M, and Chen, T. An approximation algorithm for haplotype inference by pure parsimony. In Proc. ACM SAC’05, pages 146–150, 2005. [33] Huang, Y.-T., Chao, K.-M, and Chen, T. An approximation algorithm for haplotype inference by pure parsimony. Journal of Computational Biology, 12: 1261-1274, 2005. [34] Hudon, R.R. Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics, 18:337–338, 2002. [35] Kennedy, G.C., Matsuzaki, H., Dong, S., Liu, W.M., Huang, J., Liu, G., Su, X., Cao, M., Chen, W., Zhang, J., Liu, W., Yang, G., Di, X., Ryder, T., He, Z., Surti, U., Phillips, M.S., Boyce-Jacino, M.T., Fodor, S.P., and Jones, K.W. Large-scale genotyping of complex DNA. Nature Biotechnology, 21:1233–1237, 2003. [36] Kerem, B., Rommens, J., Buchanan, J., Markiewicz, D., Cox, T., Chakravarti, A., Buchwald, M., and Tsui, L.C. Identification of the cystic fibrosis gene: genetic analysis. Science, 245:1073– 1080, 1989. [37] Lancia, G., Pinotti, C., and Rizzi., R. Haplotyping Populations: complexity, exact and approximation algorithms. INFORMS Journal on computing, 16:348–359, 2004. [38] Lewin, B. Genes VIII. Pearson Prentice-Hall, 2004. [39] Lin, S., Cutler, D.J., Zwick, M.E., and Chakravarti, A. Haplotype inference in random population samples. Am. J. Hum. Genet., 71:1129-1137, 2002. [40] Lin, S., Chakravarti, A., and Cutler, D.J. Exhaustive allelic transmission disequilibirum tests as a new approach to genome-wide association studies. Nature Genetics, 36:1181-1188, 2004. [41] Liu, W.M., Di, X., Yang, G., Matsuzaki, H., Huang, J., Mei, R., Ryder, T.B., Webster, T.A., Dong, S., Liu, G., Jones, K.W., Kennedy, G.C., and Kulp, D. Algorithms for Large Scale Genotyping Microarrays. Bioinformatics, 19:2397-2403, 2003. [42] LP Solve [http://www.cs.sunysb.edu/?algorith/implement/lp solve/implement.shtml] [43] Niu, T., Qin, Z., Xu, X., and Liu, J.S. Bayesian haplotype inference for multiple linked singlenucleotide polymorphisms. Am. J. Hum. Genet., 70:157–159, 2002. [44] Ott, J., and Hoh, J. Set association analysis of SNP case-control and microarray data. Journal of Computational Biology, 10:569-574, 2003. [45] Papadimitriou, C. H., and Yannakakis, M. Optimization, approximation, and complexity classes. J. Comput. System Sci., 43:425-440, 1991. [46] Patil, N., Berno, A.J., Hinds, D.A., Barrett, W.A., Doshi, J.M., Hacker, C.R., Kautzer, C.R., Lee, D.H., Marjoribanks, C., McDonough, D.P., et al. Blocks of limited haplotype diversity revealed by high-resolution scanning of human chromosome 21. Science, 294:1719–1723, 2001. [47] Qin, Z., Niu, T., and Liu, J. Partitioning-ligation-expectation-maximization algorithm for haplotype inference with single-nucleotide Ploymorphisms. Am. J. Hum. Genet., 71:1242–1247, 2002. [48] Qin, Z.S., Gopalakrishnan, S., Abecasis, G.R. An efficient comprehensive search algorithm for TagSNP selection using linkage disequilibirium criteria. Bioinformatics, 99(11):7335–7339, 2006. [49] Quinlan, R. C4.5: programs for machine learning. Morgan Kaufmann Publishers, 1993. [50] Seltman, H., Roeder, K., and Devlin, B. Transmission/disequilibrium test meets measured haplotype analysis: family-based association analysis guided by evolution of haplotypes. Am. J. Hum. Genet., 68(5):1250–1263, 2001. [51] Sharan, R., Halldorsson, B.V., and Istrail, S. Islands of Tractability for Parsimony Haplotyping. In Proc. CSB’05, 65–72, 2005. [52] Stephens, M., Smith, N.J., and Donnelly, P. A new statistical method for haplotype reconstruction from population data. Am. J. Hum. Genet., 68(4):978–989, 2001. [53] Stephens, M., and Donnelly, P. A comparison of bayesian methods for haplotype reconstruction from population genotype data. Am. J. Hum. Genet., 73:1162–1169, 2003. [54] Stram, D.O., Haiman, C.A., Hirschhorn J.N., Altshuler, D., Kolonel, L.N., Henderson, B.E., and Pike, M.C. Choosing haplotype-tagging SNPs based on unphased genotype data using a preliminary sample of unrelated subjects with an example from the multiethnic cohort study. Hum. Hered., 55:27–36, 2003. [55] Tang, E.K., Suganthan, P.N., and Yao, X. Gene selection algorithms for microarray data based on the least quares support vector machine. BMC Bioinformatics, 7:95, 2006. [56] Tsalenko, A., Ben-Dor, A., Cox, N., and Yakhini, Z. (2003) Methods for analysis and visualization of SNP genotype data for complex diseases. PSB’03, 548–561. [57] The Genome Sequencing Project. Initial sequencing and analysis of human genome. Nature, 409:860–921, 2001. [58] Venter, J.C., Adams, M.D., Myers, E.W., Li, P.W., Mural, R.J., et al: The sequence of the human genome. Science, 291(5507):1304–1351, 2001. [59] Wang, L., and Xu, Y. Haplotype inference by maximum parsimony. Bioinformatics, 19(14):1773–1780, 2003. [60] Weal, M.E., Depondt, C., Macdonald, S.J., Smith, A., Lai, P.S., Shorvon, S.D., Wood, N.W., Goldstein, D.B. Selection and avalulation of taggign SNPs in the neuronal-sodium-channel gene SCN1A: implications for linkage diequilibrium gene mapping. Am. J. Hum. Genet., 73:551–565, 2003. [61] Witten, I.H. and Frank, E. Data Mining: Practical machine learning tools and techniques, 2nd Edition, Morgan Kaufmann, San Francisco, 2005. [62] Yan, H, Papadopoulos, N., Marra, G., et al: Conversion of diploidy to haploidy. Nature, 403:723–724, 2000. [63] Yang, Y., Zhang, J., Hoh, J., Matsuda, F., Xu, P., Lathrop, M., Ott, J. Efficiency of singlenucleotide polymorphism haplotype estimation from pooled DNA. Proc. Nat. Acad. Sci., 100(12):7225–7230, 2003. [64] Zhang, K., Deng, M., Chen, T., Waterman, M.S., and Sun, F. A dynamic programming algorithm for haplotype partitioning. Proc. Nat. Acad. Sci., 99(11):7335–7339, 2002. [65] Zhang, K., Sun, F., Waterman, M.S., and Chen, T. Haplotype block partition with limited resources and applications to human chromosome 21 haplotype data. Am. J. Hum. Genet., 73:63–73, 2003. [66] Zhang, K., Qin, Z.S., Liu, J.S., Chen, T., Waterman, M.S., and Sun, F. Haplotype block partition and tag SNP selection using genotype data and their applications to association studies. Genome Research, 14:908–916, 2004. [67] Zhao, J.H., Lissarrague, S., Essioux, L., and Sham, P.C. GENECOUNTING: haplotype analysis with missing genotypes. Bioinformatics, 18:1694–1695, 2002. | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/34277 | - |
| dc.description.abstract | 本論文探討一系列與單核苷酸多型性與單體型相關之最佳化問題。本論文中大多數探討之問題都被證明為難題。為了能有效率地解決這些難題,我們設計並實作一系列近似演算法。經由理論分析與實驗佐證,我們證明所設計的演算法不僅有良好的執行速度,其所找到的近似解亦可相當逼近最佳解。
本論文的第一部份探討如何尋找一組單核苷酸多型性,稱為強固型標籤單核苷酸多型性,可以容許在實驗中遺失部分單核苷酸多型性。我們證明要尋找最少數量的強固型標籤單核苷酸多型性為一難題。為了能有效率的解決此一難題,我們提出了兩種貪婪演算法與一種線性規劃釋限演算法。我們的理論分析與實驗結果顯示這些演算法不僅執行速度很快,更可以找到相當接近最佳解之近似解。 本論文的第二部份探討如何利用多標記單體型,尋找一組最少量的標籤單核苷酸多型性。我們將此問題切割成三個子問題,並證明其中兩個子問題為難題。我們提出一些精確與近似演算法來分別解決這三個子問題。實驗結果顯示我們整合這些演算法所開發出的軟體,不僅可比現有的軟體找出更少量的標籤單核苷酸多型性,其執行速度亦有顯著的提升。 本論文的第三部份探討在基於最大簡約法則下,從基因型資料中推測出單體型資料。我們將此問題表示為一個整數二次方規劃的問題,然後提出一個基於半正定規劃之反覆式近似演算法來解決此問題。我們的理論分析與實驗結果顯示我們開發出的軟體,不僅可以找出相當逼近最佳解之近似解,更可以比現有的軟體在部分資料下得到較好的推測正確率。 本論文的第四部份探討如何在在全基因組關連分析中,選擇一些具有鑑別性的單核苷酸多型性來區分出病例與對照之樣本。我們設計了一個有效率的演算法來尋找具有鑑別性的單核苷酸多型性,並與目前既有之軟體在各種分類器下作比較。我們的實驗結果顯示此演算法在選擇足夠的鑑別性單核苷酸多型性下,可以比其他方法得到較高的鑑別正確率。 | zh_TW |
| dc.description.abstract | This dissertation studies several optimization problems related to SNPs and haplotypes. Most problems studied in this dissertation are shown to be NP-hard. To efficiently solve these problems, we design and implement a series of approximation algorithms. Our theoretical analysis and experimental results indicate that these algorithms are
not only efficient but the solutions found by them are also quite close to the optimal solution. In Part I of this dissertation, we show that there exists a set of SNPs called robust tag SNPs which can tolerate missing SNPs in genotyping. The problem of finding a minimum set of robust tag SNPs is shown to be NP-hard. We give two greedy algorithms and one linear programming relaxation algorithm to efficiently solve this problem. Our theoretical analysis and experimental results show that these algorithms not only run very fast but also find nearly-optimal solutions. In Part II of this dissertation, we study the problem of selecting a minimum set of tag SNPs by multimarker haplotypes. This problem is divided into three subproblems, two of which are shown to be NP-hard. Several exact and approximation algorithms are proposed to solve these subproblems. The experimental results indicate that the program developed by integrating these algorithms finds a smaller set of tag SNPs and runs much faster than existing methods. In Part III of this dissertation, we study the problem of haplotype inference by maximum parsimony. We formulate this problem as an integer quadratic programming problem and present an iterative semi-definite programming relaxation based approximation algorithm. Our theoretical analysis and experimental results show that the solution found is not only close to the optimal solution but the accuracy is also improved in comparison with existing methods. In Part IV of this dissertation, we study the problem of selecting discriminative SNPs for classifying cases and controls in genome-wide association studies. We describe an efficient algorithm for identifying discriminative SNPs and compare it with several existing methods using a variety of classifiers. The experimental results indicate that our method consistently obtains better accuracies than other methods when sufficient discriminative SNPs are selected. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-13T06:01:02Z (GMT). No. of bitstreams: 1 ntu-95-D92922023-1.pdf: 723294 bytes, checksum: 167cd45ebe3aa3911c7fc3ffb1131c05 (MD5) Previous issue date: 2006 | en |
| dc.description.tableofcontents | 1 Introduction 1
1.1 Background and Definitions . . . . . . . . . . . . . . . . . . . . . . . . 2 1.1.1 SNPs and Haplotypes . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.1.2 Linkage Disequilibrium and Chromosome Recombination . . . . . . . . . . . 4 1.1.3 Haplotype Blocks and Tag SNPs . . . . . . . . . . . . . . . . . . . . . . 5 1.2 Problems and Manuscript Plan . . . . . . . . . . . . . . . . . . . . . . . 5 2 Approximation Algorithms for the Selection of Robust Tag SNPs 9 2.1 Related Works on the Problem of Finding Tag SNPs . . . . . . . . . . . . . 9 2.2 The Problem of Missing Data in Genotyping . . . . . . . . . . . . . . . . . 10 2.3 Methods for Finding Robust Tag SNPs . . . . . . . . . . . . . . . . . . . . 12 2.3.1 The First Greedy Algorithm . . . . . . . . . . . . . . . . . . . . . . . 14 2.3.2 The Second Greedy Algorithm . . . . . . . . . . . . . . . . . . . . . . . 17 2.3.3 The Iterative LP-relaxation Algorithm . . . . . . . . . . . . . . . . . . 19 2.4 Methods for Finding Auxiliary Tag SNPs . . . . . . . . . . . . . . . . . . 22 2.5 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . 23 2.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29 3 Approximation Algorithms for the Selection of Tag SNPs by Multimarker Haplotypes 31 3.1 Related Works on the Problem of Selecting Tag SNPs by Multimarker Haplotypes 32 3.2 Formulation and Hardness of the MHTP Problem . . . . . . . . . . . . . . . . 34 3.3 An Approximation Algorithm for the MHTP Problem . . . . . . . . . . . . . . 36 3.4 Exact and Approximation Algorithms for Solving Subproblems of MTMH . . . . . 38 3.5 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 3.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49 4 An Approximation Algorithm for Haplotype Inference 51 4.1 Related Works on the Problem of Haplotype Inference . . . . . . . . . . . . 51 4.2 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 4.3 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55 4.4 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 4.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65 5 Discriminative SNPs Selection for Genome Wide Association Studies 67 5.1 Related Works on the Problem of Selecting Discriminative SNPs . . . . . . . 68 5.2 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 5.2.1 Discriminative SNPs Selection Method . . . . . . . . . . . . . . . . . . . 69 5.2.2 Classification Method . . . . . . . . . . . . . . . . . . . . . . . . . . 72 5.3 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 5.4 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 6 Conclusion 81 6.1 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 6.2 Future Works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82 | |
| dc.language.iso | en | |
| dc.subject | 單體型 | zh_TW |
| dc.subject | 近似演算法 | zh_TW |
| dc.subject | 單核苷 | zh_TW |
| dc.subject | 酸多型性 | zh_TW |
| dc.subject | 基因型 | zh_TW |
| dc.subject | Genotype | en |
| dc.subject | Appproximation Algorithm | en |
| dc.subject | Single Nucleotide Polymorphism | en |
| dc.subject | Haplotype | en |
| dc.title | 單核苷酸多型性與單體型相關的最佳化問題之研究 | zh_TW |
| dc.title | A Study on Some Optimization Problems Related to SNPs and Haplotypes | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 94-2 | |
| dc.description.degree | 博士 | |
| dc.contributor.oralexamcommittee | 徐熊健(Shyong-Jian Shyu),呂學一(Heueh-I Lu),歐陽明(Ming Ouhyoung),黃奇英(Chi-Ying Huang),歐陽彥正(Yen-Jen Oyang),高成炎(Cheng-Yan Kao) | |
| dc.subject.keyword | 單核苷,酸多型性,單體型,基因型,近似演算法, | zh_TW |
| dc.subject.keyword | Single Nucleotide Polymorphism,Haplotype,Genotype,Appproximation Algorithm, | en |
| dc.relation.page | 90 | |
| dc.rights.note | 有償授權 | |
| dc.date.accepted | 2006-06-23 | |
| dc.contributor.author-college | 電機資訊學院 | zh_TW |
| dc.contributor.author-dept | 資訊工程學研究所 | zh_TW |
| 顯示於系所單位: | 資訊工程學系 | |
文件中的檔案:
| 檔案 | 大小 | 格式 | |
|---|---|---|---|
| ntu-95-1.pdf 未授權公開取用 | 706.34 kB | Adobe PDF |
系統中的文件,除了特別指名其著作權條款之外,均受到著作權保護,並且保留所有的權利。
