Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/44010
Full metadata record
DC field: value [language]
dc.contributor.advisor: 賴飛羆
dc.contributor.author: Yu-Wen Lu [en]
dc.contributor.author: 盧裕文 [zh_TW]
dc.date.accessioned: 2021-06-15T02:36:10Z
dc.date.available: 2014-08-18
dc.date.copyright: 2009-08-18
dc.date.issued: 2009
dc.date.submitted: 2009-08-13
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/44010
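dc.description.abstract: With the preliminary completion of the Human Genome Project, the post-genomic era is gradually arriving. Although the human sequence map has been decoded, this does not mean that the role of every sequence segment is fully understood; their functions still need further analysis and study. Meanwhile, because sequence data is growing rapidly, organizing and storing that data is becoming increasingly important. This thesis discusses and implements a genome database system in which users can store and manage sequence data. The system's most distinctive feature is that it integrates sequence alignment and data compression. By embedding blastall, a program provided by NCBI, it aligns the sequences that users upload and finds their corresponding positions in the genome. In addition, the system encodes the differences between sequences, which compresses the data effectively and reduces storage space. It also provides a variety of query functions: users can search by specimen number, GI, sequence position, and other criteria to quickly find the sequence data they are interested in. [zh_TW, translated]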
dc.description.abstract: With the initial completion of the Human Genome Project, the post-genomic era is arriving. Although the human genome map has been decoded, the role that each sequence segment plays is not yet fully understood, and the actual functions of these segments still need to be analyzed and studied. At the same time, with the rapid growth of sequence information, data organization and storage are becoming increasingly important. In this thesis, a “Human Genome Database System” is designed and implemented at National Taiwan University Hospital (NTUH). With this system, users can store and manage experimental sequence data. Its main contribution is the integration of sequence alignment and data compression modules. By embedding the NCBI alignment program blastall, it automatically aligns uploaded sequences and finds their corresponding genomic positions. In addition, the system encodes the differences between sequences, which compresses them effectively and reduces the storage space required. It also offers a variety of query methods: users can quickly retrieve the data of interest by entering keywords such as specimen number, GI, or sequence position. [en]
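The abstract says the system stores a sequence by encoding its differences from the genome position found by alignment. This record does not describe the actual encoding, so the following is only a minimal sketch of the general idea, under stated assumptions: the aligned region has the same length as the sample, only substitutions occur (no insertions or deletions), and the start position is supplied by hand rather than by blastall. The function names `encode_against_reference` and `decode_from_reference` are hypothetical and do not come from the thesis.

```python
from typing import List, Tuple

# One substitution: (offset inside the aligned region, base found in the sample).
Diff = Tuple[int, str]
# A stored record: (start position in the reference, sample length, substitutions).
Record = Tuple[int, int, List[Diff]]


def encode_against_reference(reference: str, sample: str, start: int) -> Record:
    """Keep only the positions where `sample` differs from the reference.

    Assumption: `start` is the 0-based position of the alignment hit, and the
    alignment contains substitutions only.
    """
    region = reference[start:start + len(sample)]
    if len(region) != len(sample):
        raise ValueError("sample extends past the end of the reference")
    diffs = [(i, s) for i, (r, s) in enumerate(zip(region, sample)) if r != s]
    return start, len(sample), diffs


def decode_from_reference(reference: str, record: Record) -> str:
    """Rebuild the sample by copying the reference region and applying the diffs."""
    start, length, diffs = record
    bases = list(reference[start:start + length])
    for offset, base in diffs:
        bases[offset] = base
    return "".join(bases)


if __name__ == "__main__":
    ref = "ACGTACGTACGTACGT"
    rec = encode_against_reference(ref, "ACGAACGT", 4)
    print(rec)                              # (4, 8, [(3, 'A')])
    print(decode_from_reference(ref, rec))  # ACGAACGT
```

When a sample closely matches the reference, the record holds a position, a length, and a handful of substitutions instead of the full sequence, which is where the storage saving would come from. Handling insertions and deletions, or packing the integers with a variable-length code, would need a richer record than this sketch uses.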
dc.description.provenance: Made available in DSpace on 2021-06-15T02:36:10Z (GMT). No. of bitstreams: 1
ntu-98-R96945042-1.pdf: 559426 bytes, checksum: f9b5d09c002132d87f21f8688a09fec7 (MD5)
Previous issue date: 2009 [en]
dc.description.tableofcontents:
Chapter 1 Introduction
1.1 The DNA Sequencing technology
1.2 Human Genome Project
1.3 Motivation and Objective
1.4 Thesis Organization
Chapter 2 Background
2.1 Genome Database
2.2 NCBI Reference sequence
2.3 NCBI BLAST Tool
2.4 Mega BLAST Search
2.5 DNA Compression Algorithm
Chapter 3 Methods
3.1 System Architecture
3.2 Sequence Conversion
3.2.1 Sequence Pre-processing
3.2.2 Sequence Aligning
3.2.3 Sequence Post-Processing
3.3 Sequence Retrieve
3.3.1 The Utilization of fastacmd
3.3.2 Sequence Assembly
3.4 The Access to Genome Database
3.5 Data Integration
Chapter 4 Results and Discussion
4.1 System Implementation
4.2 The Experimental Results of Compression
4.3 Discussion
Chapter 5 Conclusion
References
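dc.language.iso: en
dc.title: 具序列比對與資料壓縮功能之基因體資料庫系統設計 [zh_TW]
dc.title: Design of Genome Database System with the Sequence Aligning and Data Compression Mechanism [en]
dc.type: Thesis
dc.date.schoolyear: 97-2
dc.description.degree: Master's
dc.contributor.oralexamcommittee: 簡穎秀, 陳俊良, 林正偉, 李鴻章
dc.subject.keyword: 人類基因體計畫 (Human Genome Project), DNA定序 (DNA sequencing), 基因體資料庫系統 (genome database system), 序列比對 (sequence alignment), 資料壓縮 (data compression) [zh_TW]
dc.subject.keyword: Human Genome Project, DNA sequencing, genome database system, sequence alignment, data compression [en]
dc.relation.page: 42
dc.rights.note: Paid license
dc.date.accepted: 2009-08-13
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science) [zh_TW]
dc.contributor.author-dept: 生醫電子與資訊學研究所 (Graduate Institute of Biomedical Electronics and Bioinformatics) [zh_TW]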
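Appears in collections: 生醫電子與資訊學研究所 (Graduate Institute of Biomedical Electronics and Bioinformatics)

Files in this item:
File: ntu-98-1.pdf
Access: not currently authorized for public access
Size: 546.31 kB
Format: Adobe PDF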

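Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.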
