序列操作與相關問題之演算法研究

Tien-Ching Lin; 林添進

請用此 Handle URI 來引用此文件： http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/28457

完整後設資料紀錄

DC 欄位	值	語言
dc.contributor.advisor	李德財
dc.contributor.author	Tien-Ching Lin	en
dc.contributor.author	林添進	zh_TW
dc.date.accessioned	2021-06-13T00:08:50Z	-
dc.date.available	2007-07-31
dc.date.copyright	2007-07-31
dc.date.issued	2007
dc.date.submitted	2007-07-27
dc.identifier.citation	[1] R. Agrawal, T. Imielinski, and A. Swami. Data mining using two-dimensional optimized association rules: scheme, algorithms, and visualization. Proceedings of the 1993 ACM SIGMOD international conference on management of data, 207-216, 1993. [2] M. Ajtai, A. Koml´os, and E. Szemer´edi. An O(n log n) sorting networks. Combinatorica, 3:1–19, 1983. [3] S. Alk and G. Guenther. Application of broadcasting with selective reduction to the maximal sum subsegment problem. International journal of high speed computating, 3:107-119, 1991. [4] N. Alon, H. S. Joel, and E. Paul. The Probabilistic Method, Wiley-Interscience Series. [5] S. Altschul, W. Gish, W. Miller, E. Myers, and D. Lipman. Basic local alignment search tool. Journal of Molecular Biology, 215:403-410, 1990. [6] S. E. Bae and T. Takaoka. Algorithms for the problem of k maximum sums and a VLSI algorithm for the k maximum subarrays problem. 2004 International Symposium on Parallel Architectures, Algorithms and Networks, 247-253, 2004. [7] S. E. Bae and T. Takaoka. Improved Algorithms for the K-Maximum Subarray Problem . The Computer Journal, 2006 49(3):358-374. [8] G. Barhardi. Isochores and the evolutionary genomics of vertebrates. Gene, 241:3–17, 2000. [9] F. Bengtsson and J. Chen. Efficient Algorithms for K Maximum Sums. Algorithmica, 2006 46:27-41. [10] A. Bergkvist and P. Damaschke Fast Algorithms for Finding Disjoint Subsequences with Extremal Densities. Algorithms and Computation, 16th International Symposium, ISAAC 2005, 714-723. [11] F. Bengtsson and J. Chen. Computing Maximum-Scoring Segments in Almost Linear Time. Computing and Combinatorics, 12th Annual International Conference, COCOON 2006, 255-264, 2006. [12] J. Bentley. Programming pearls: algorithm design techniques. Commun. ACM, 27, 9:865-873, 1984. [13] J. Bentley. Programming pearls: algorithm design techniques. Commun. ACM, 27, 11:1087-1092, 1984. [14] M. Ben-Or. Lower bounds for algebraic computation trees. In Proc. 15th Annu. ACM Sympos. Theory Comput., pages 80-86, 1983. [15] G. Bernardi and G. Bernardi. Compositional constraints and genome evolution. Journal of Molecular Evolution, 24:1–11, 1986. [16] M. Blum, R. W. Floyd, V. Pratt, R. L. Rivest, and R. E. Tarjan. Time bound for selection. Journal of Computer and System Sciences, 7(4):448–461. 1973. [17] H. Br¨onnimann and B. Chazelle. Optimal slope selection via cuttings. Computational Geometry — Theory and Applications, 10(1):23–29, 1998. [18] Y.-H. Chen, H.-I. Lu, and C.-Y. Tang. Disjoint segments with maximal density. Proceedings of the International Conference on Computational Science, volumn 3515 of LNCS, 845-850, 2005. [19] C.-H. Cheng, K.-Y. Chen, W.-C. Tien, and K.-M. Chao. Improved algorithms for the k maximum-sums problems. Theoretical computer science, 362: 162-170, 2006. [20] K.-M. Chung and H.-I. Lu. An optimal algorithm for the maximum-density segment problem. SIAM Journal on Computing, 34(2):373–387, 2004. [21] R. Cole. Slowing down sorign networks to obtain faster sorting algorithm. Journal of the association for computing machinery, Vol. 34, No. 1:200–208, 1987. [22] R. Cole, J. S. Salowe, W. L. Steiger, and E. Szemeredi. An optimal-time algorithm for slope selection. SIAM Journal on Computing, 18(4):792–810, 1989. [23] T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introdution to algorithms, MIT Press, 1998. [24] R. Davuluri, I. Grosse, M. Zhang. Computational identification of promoters and first exons in the human genome. Nature Genetics, 29:412-417, 2001. [25] M. De Berg, M. van Kreveld, M. Overmars, and O. Schwarzkopf. Computational Geometry: Algorithms and Applications. Springer-Verlag, Berlin, 1997. [26] M. H. Dillencourt, M. H. Mount,and N. S. Netanyahu. A randomized algorithm for slope selection. International Journal of Computational Geometry and Applications, vol. 2, No. 1:1–27, 1992. [27] T.-H. Fan, S. Lee, H.-I. Lu, T.-S. Tsou, T.-C. Wang, and A. Yao. An optimal algorithm for maximum-sum segment and its application in bioinformatics. Eighth International Conference on Implementation and Application of Automata (CIAA 2003), 251-257. [28] P. Fariselli, M. Finelli, M. Marchignoli, P. L. Martelli, I. Rossi, and R. C. Maxsubseq. An algorithm for segment-length optimization. The case study of the transmembrane spanning segments. Bioinformatics, 19:500–505, 2003. [29] W. F. Robert and L. R. Ronald. Expected time bounds for selection. Communications of the ACM, 18(3):165-172. 1975. [30] T. Fukuda, Y. Morimoto, S. Morishita, and T. Tokuyama. Mining association rules between sets of items in large databases. Proceedings of the 1996 ACM SIGMOD international conference on management of data, 13-23, 1996. [31] M. H. Goldwasser, M.-Y. Kao, and H.-I. Lu. Fast algorithms for finding maximumdensity segments of a sequence with applications to bioinformatics. In R. Guig´o and D. Gusfield, editors, Proceedings of the Second International Workshop of Algorithms in Bioinformatics, Lecture Notes in Computer Science 2452, pages 157–171, Rome, Italy, 2002. Springer-Verlag. [32] M. H. Goldwasser, M.-Y. Kao, and H.-I. Lu. Linear-time algorithms for computing maximum-density sequence segments with bioinformatics applications. Journal of Computer and System Sciences, 70(2), 128–144, 2005. [33] R. L. Graham. An efficient algorithm for determining the convex hull of a finite planar set. Information Processing Letters, 1:132-133, 1972. [34] D. Gries. A note on the standard strategy for developing loop invariants and loops. Science of computer programming, 2:207-214, 1982. [35] S. Hannenhalli and S. Levy. Promoter prediction in the human genome. Bioinformatics, 17:S90-S96, 2001. [36] C. A. R. Hoare. Algorithm 63 (partition) and algorithm 65 (find). Communications of the ACM, 4(7):321–322, 1961. [37] S.-Y. Hsieh and T.-Y. Chou. Finding a Weight-Constrained Maximum-Density Subtree in a Tree. Algorithms and Computation, 16th International Symposium, ISAAC 2005, 944-953. [38] X. Huang. An algorithm for identifying regions of a DNA sequence that satisfy a content requirement. Computer Applications in the Biosciences, 10(3):219–225, 1994. [39] I. P. Ioshikhes and M. Q. Zhang. Large-scale human promoter mapping using CpG islands. Gene, Nature Genetics 26:61–63, 2000. [40] N. C. Jones and P. A. Pevzner. An Introduction to Bioinformatics Algorithms, MIT Press, 1998.. [41] M. J. Katz and M. Sharir. Optimal slope selection via expanders. Information Processing Letters, 47(3):115–122, 1993. [42] S. K. Kim. Linear-time algorithm for finding a maximum-density segment of a sequence. Information Processing Letters, 86(6):339–342, 2003. [43] T.-C. Lin and D. T. Lee. Randomized algorithm for the sum selection problem. Algorithms and computation, 16th international symposium, ISAAC 2005, 515-523. [44] T.-C. Lin and D. T. Lee. Efficient algorithms for the sum selection problem and k maximum sums problem. Algorithms and computation, 17th international symposium, ISAAC 2006, 460-473. [45] T.-C. Lin and D. T. Lee. Randomized algorithm for the sum selection problem. Theoretical computer science, 377(1-3):151-156, 2007. [46] T.-C. Lin and D. T. Lee. Efficient algorithms for the sum selection problem and k maximum sums problem. Submitted to SICOMP. [47] D. T. Lee., T.-C. Lin, and H.-I. Lu. Fast Algorithms for the Density Finding Problem Conditionally accepted by Algorithmica. [48] R.-R. Lin, W.-H. Kuo, and K.-M. Chao. Finding a Length-Constrained Maximum- Density Path in a Tree. Journal of Combinatorial Optimization, 9(2), 147–156, 2005. [49] Y.-L. Lin, T. Jiang, and K.-M. Chao. Algorithms for locating the length-constrained heaviest segments, with applications to biomolecular sequence analysis. Journal of Computer and System Sciences, 65(3):570–586, 2002. [50] Y. -L. Lin, X. Huang, T. Jiang, and K.-M. Chao. MAVG: locating non-overlapping maximum average segments in a given sequence. Bioinformatics, 19(1):151–152, 2003. [51] J. Matouˇsek. Randomized optimal algorithm for slope selection. Information Processing Letters, 39(4):183–187, 1991. [52] J. Matouˇsek. Approximations and optimal geometric divide-and-conquer. Proc. 23rd ACM Symp. on Theory of Computing, 1–10, 1991. [53] E. M. McCreight. Priority search trees. SIAM Journal on Computing, 14(2):257–276, 1985. [54] N. Megiddo. Applying parallel computation algorithms in the design of serial algorithm. Journal of the association for computing machinery, Vol. 30, No. 4:852–865, 1983. [55] C. Mikl´os. Maximum-scoring segment sets. IEEE/ACM Transations on Computatonal Biology and Bioinformatics, 1(4):139–150, 2004. [56] M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis, Cambridge. [57] A. Nekrutenko and W.-H. Li. Assessment of compositional heterogeneity within and between eukaryotic genomes. Genome Research, 10:1986–1995, 2000. [58] U. Ohler, H. Niemann, G. Liao, and G. M. Rubin. Joint modeling of DNA sequence and physical properties to improve eukaryotic promoter recognition. Bioinformatics, 17:199–206, 2001. [59] K. Perumalla and N. Deo. Parallel algorithms for maximum subsequence and maximum subarray. Parallel Processing Letters, 5:367-373, 1995. [60] K. Qiu and S. Alk. Parallel maximum sum algorithms on interconnection networks. Technical Report No. 99-431, Jodrey School of Computer Science, Acadia University, Canada, 1999. [61] P. Rice, I. Longden, and A. Bleasby. Emboss: The European molecular biology open software suite. Trends Genet., 16:276–277, 2000. [62] D. Smith. Applications of a strategy for designing divide-and-conquer algorithms. Science of Computer Programming, 8:213-229, 1987. [63] N. Stojanovic, L. Florea, C. Riemer, D. Gumucio, J. Slightom, M. Goodman, W. Miller, and R. Hardison. Comparison of five methods for finding conserved sequences in multiple alignments of gene regulatory regions. Nucleic Acids Research, 27:3899–3910, 1999. [64] T. Takaoka. Efficient algorithms for the maximum dubarray problem by fistance matrix multiplication. Proceedings of the 2002 australian theory symposium, 189-198, 2002. [65] H. Tamaki and T. Tokuyama. Algorithms for the maximum subarray problem based on matrix multiplication. Proceedings of the ninth annual ACM-SIAM symposium on discrete algorithms, 446-452, 1998. [66] R. E. Tarjan. Updating a balanced search tree in O(1) rotations. Information Processing Letters, 16:253–257, 1983. [67] R. Walder, M. Garrett, A. McClain, G. Beck, T. Brennan, N. Kramer, A. Kanis, A. Mark, A. Rapp, and V. Sheffield. Short tandem repeat polymorphic markers for the rat genome from marker-selected libraries associated with complex mammalian phenotypes. Mammallian Genome, 9:1013–1021, 1998. [68] B. Y. Wu, K. M. Chao, and C. Y. Tang An Efficient Algorithm for the Length- Constrained Heaviest Path Problem on a Tree. Information Processing Letters, 69, 63–67, 1999.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/28457	-
dc.description.abstract	本論文主要在研究兩類問題：序列的最佳化問題(optimization problems on sequences)與序列的範圍搜尋問題(range query problems on sequences)。我們會將與這兩類問題相關的所有問題做全面地概述並依據問題難易程度做分類，介紹解決這兩類問題的技巧，並討論它們的一些應用。我們會先研究：序列的最佳化問題。我們會把研究焦點放在序列和的選擇問題(Sum Selection Problem)，這個問題是三個很有名的問題的推廣，包括：計算生物學上的最大序列和問題(Maximum Sum Problem)、有限長度的最大序列和問題(Maximum-Sum Segment Problem)與在資訊科學上的選擇問題(Selection Problem)。我們會對序列和的選擇問題分別給出一個隨機的演算法與一個決定性的演算法。我們接著會將這二個演算法結合序列和的範圍搜尋問題(Sum Range Query Problem)的演算法來給出二個前k大序列和問題(k Maximum Sums Problem)的演算法，這二個演算法均改進了目前這個問題的最佳演算法。我們然後研究序列密度的選擇問題(Density Selection Problem)，這個問題也是三個很有名的問題的推廣，包括：計算生物學上的有限長度的最大序列密度問題(Maximum-Density Segment Problem)、計算幾何學上的斜率選擇問題(Slope Selection Problem)與在資訊科學上的選擇問題(Selection Problem)。我們會對序列密度的選擇問題給出一個最佳的隨機演算法。我們接著會將這個演算法結合序列密度的範圍搜尋問題(Density Range Query Problem)的演算法來給出一個前k大序列密度問題(k Maximum Densities Problem)最佳的隨機演演算法。我們也研究了有限長度的最大序列密度問題的另一個推廣問題稱之為序列密度的找尋問題(Density Finding Problem)。對於沒有寬度上界的情況，我們會給出一個最佳的演算法。但對於一般的情況，我們只給出一個次佳的演算法。另外，我們也給出有限長度的最大序列密度問題的另一個最佳的演算法。接著我們會研究：序列的範圍搜尋問題，包括：序列和的範圍搜尋問題(Sum Range Query Problem) 序列密度的範圍搜尋問題(Density Range Query Problem)。對於這二個問題，我們分別給出一個報導模式(reporting mode)與計數模式(counting mode)的演算法。對於序列密度的範圍搜尋問題我們證明了這二個演算法都是最佳的。這二個問題對於在解決序列的最佳化問題上有很大的應用。	zh_TW
dc.description.abstract	The dissertation will focuses on the optimization problems on sequences and range query problems on sequences. We will survey and classify the related problems, introduce some techniques for solving these kinds of problems and, discuss their possible biological applications. We first consider some optimization problems on sequences. We focus on the Sum Selection Problem which is a generalization of several famous problems, including Maximum Sum Problem, Maximum-Sum Segment Problem in computational biology, and Selection Problem in computer science. We will give a randomized algorithm and a deterministic algorithm for this problem respectively, and combine them with the algorithm for the Sum Range Query Problem to give two improved algorithms for the k Maximum Sums Problem. We then consider the Density Selection Problem, which is also a generalization of several famous problems, including Maximum-Density Segment Problem in computational biology, Slope Selection Problem computational geometry, and Selection Problem in computer science. We will give an optimal randomized algorithm for this problem and combine it with the algorithm for the Density Range Query Problem to give an optimal algorithm for the k Maximum Densities Problem. We also consider another generalization of Maximum-Density Segment Problem called Density Finding Problem. We give an optimal algorithm for the Density Finding Problem for the case when there is no upper bound on the width of the sequence, and an algorithm for general case. As a byproduct, we give another optimal time algorithm for the Maximum-Density Segment Problem.	en
dc.description.provenance	Made available in DSpace on 2021-06-13T00:08:50Z (GMT). No. of bitstreams: 1 ntu-96-D89922001-1.pdf: 598937 bytes, checksum: f44c7bd519e3f318db71cf2f3c5b5cbd (MD5) Previous issue date: 2007	en
dc.description.tableofcontents	1 Introduction 1 1.1 Biological Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.2 Theoretical Foundations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1.3 Relationship to Traditional Computational Problems . . . . . . . . . . . . . 3 1.4 Basic Definitions and Notations . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.5 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1.5.1 Related optimization problems on sequences . . . . . . . . . . . . . . 5 1.5.2 Related range query problems on sequences . . . . . . . . . . . . . . 10 1.6 Contribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2 Optimization Problems on sequences: Sum Selection Problem 13 2.1 Reduction to a Geometric Intersection Selection Problem . . . . . . . . . . . 13 2.2 Deterministic Algorithm for Sum Selection Problem and k Maximum Sums Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 2.3 Randomized Algorithm for Sum Selection Problem and k Maximum Sums Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 3 Optimization Problems on sequences: Density Selection Problem 40 3.1 Reduction to a Geometric Slope Selection Problem . . . . . . . . . . . . . . 40 3.2 Randomized Algorithm for Density Selection Problem and k Maximum Densities Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 4 Optimization Problems on sequences: Density Finding Problem 49 4.1 Reduction to a Geometric Slope Finding Problem . . . . . . . . . . . . . . . 49 iv 4.2 Algorithm for Density Finding Problem . . . . . . . . . . . . . . . . . . . . . 53 4.2.1 Special case when u = w(1; n) . . . . . . . . . . . . . . . . . . . . . . 54 4.2.2 General case when u < w(1; n) . . . . . . . . . . . . . . . . . . . . . . 59 4.3 Algorithm for Maximum-Density Segment Problem . . . . . . . . . . . . . . 62 5 Range Query Problems on sequences 69 5.1 Sum Range Query Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . 69 5.2 Density Range Query Problem . . . . . . . . . . . . . . . . . . . . . . . . . . 71 6 Conclusion 77 6.1 Techniques for Sequence Analysis . . . . . . . . . . . . . . . . . . . . . . . . 77 6.2 Future Research Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
dc.language.iso	en
dc.subject	序列的最佳化問題	zh_TW
dc.subject	序列的範圍搜尋問題	zh_TW
dc.subject	optimization problems on sequences	en
dc.subject	range query problems on sequences	en
dc.title	序列操作與相關問題之演算法研究	zh_TW
dc.title	Algorithmic Studies of Sequence Manipulation and Related Problems	en
dc.type	Thesis
dc.date.schoolyear	95-2
dc.description.degree	博士
dc.contributor.oralexamcommittee	高明達,何錦文,謝孫源,項天瑞
dc.subject.keyword	序列的最佳化問題,序列的範圍搜尋問題,	zh_TW
dc.subject.keyword	optimization problems on sequences,range query problems on sequences,	en
dc.relation.page	86
dc.rights.note	有償授權
dc.date.accepted	2007-07-30
dc.contributor.author-college	電機資訊學院	zh_TW
dc.contributor.author-dept	資訊工程學研究所	zh_TW
顯示於系所單位：	資訊工程學系

文件中的檔案：

檔案	大小	格式
ntu-96-1.pdf 未授權公開取用	584.9 kB	Adobe PDF

顯示文件簡單紀錄

系統中的文件，除了特別指名其著作權條款之外，均受到著作權保護，並且保留所有的權利。