Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/62773
Full metadata record
DC field: value [language]
dc.contributor.advisor: 趙坤茂, 劉長遠
dc.contributor.author: Hao-Yu Hung [en]
dc.contributor.author: 洪晧瑜 [zh_TW]
dc.date.accessioned: 2021-06-16T16:10:00Z
dc.date.available: 2016-04-25
dc.date.copyright: 2013-04-25
dc.date.issued: 2013
dc.date.submitted: 2013-03-14
dc.identifier.citation: [1] Dennis A. Benson, Ilene Karsch-Mizrachi, David J. Lipman, James Ostell, and David L. Wheeler. GenBank. Nucleic Acids Research, 33(Database-Issue):34–38, 2005.
[2] Pedro Bernaola-Galván, Ramón Román-Roldán, and José L. Oliver. Compositional segmentation and long-range fractal correlations in DNA sequences. Physical Review E, 53(5):5181–5189, 1996.
[3] Richard J. Boys, Daniel A. Henderson, and Darren J. Wilkinson. Detecting homogeneous segments in DNA sequences by using hidden Markov models. Journal of the Royal Statistical Society. Series C (Applied Statistics), 49(2):269–285, 2000.
[4] Korber BT, Foley BT, Kuiken CL, Pillai SK, and Sodroski G. Numbering Positions in HIV Relative to HXB2CG. Human Retroviruses and AIDS 1998: A Compilation and Analysis of Nucleic Acid and Amino Acid Sequences.
[5] Wei-Chen Cheng, Jau-Chi Huang, and Cheng-Yuan Liou. Segmentation of DNA using simple recurrent neural network. Knowl.-Based Syst., 26:271–280, 2012.
[6] Shan Dong and David B. Searls. Gene structure prediction by linguistic methods. Genomics, 23:540–551, 1994.
[7] Jeffrey L. Elman. Finding structure in time. Cognitive Science, 14(2):179–211, 1990.
[8] T. Ksiazek, D. Erdman, C. Goldsmith, et al. A novel coronavirus associated with severe acute respiratory syndrome. The New England Journal of Medicine, 348(20):1953–1966, 2003.
[9] Shu-Yun Le, Wei-min Liu, and Jacob V. Maizel Jr. A data mining approach to discover unusual folding regions in genome sequences. Knowledge-Based Systems, 15(4):243–250, 2002.
[10] Wentian Li, Pedro Bernaola-Galván, Fatameh Haghighi, and Ivo Grosse. Applications of recursive segmentation to the analysis of DNA sequences. Computers & Chemistry, 26(5):491–510, 2002.
[11] M. Marra, S. Jones, C. Astell, et al. The genome sequence of the SARS-associated coronavirus. Science, 300(5624):1399–1404, 2003.
[12] G. A. Martini, H. G. Knauff, H. A. Schmidt, G. Mayer, and G. Baltzer. A hitherto unknown infectious disease contracted from monkeys. Ger Med Mon, 13(10):457–470, 1968.
[13] D. E. Rumelhart, G. E. Hinton, and R. J. Williams. Neurocomputing: foundations of research. pages 673–695, 1988.
[14] A. Sanchez. Filoviridae: Marburg and Ebola viruses. Fields Virology, pages 1279–1304, 2001.
[15] Saleta Sierra, Bernd Kupfer, and Rolf Kaiser. Basics of the virology of HIV-1 and its replication. Journal of Clinical Virology, 34(4):233–244, 2005.
[16] S. van Boheemen, M. de Graaf, C. Lauber, T.M. Bestebroer, V.S. Raj, A.M. Zaki, A.D.M.E. Osterhaus, B.L. Haagmans, A.E. Gorbalenya, E.J. Snijder, and R.A.M. Fouchier. Genomic characterization of a newly discovered coronavirus associated with acute respiratory distress syndrome in humans. MBio, 3(6), 2012.
[17] Tetsushi Yada, Yasushi Totoki, Masato Ishikawa, Kiyoshi Asai, and Kenta Nakai. Automatic extraction of motifs represented in the hidden Markov model from a number of DNA sequences. Bioinformatics, 14(4):317–325, 1998.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/62773
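dc.description.abstract: Deoxyribonucleic acid (DNA) sequences contain much of the genetic code that governs how organisms and viruses function. From the central dogma of molecular biology, we know that segments of DNA play an important role in making proteins.
When a virus spreads rapidly around the world, the first information we can obtain is its DNA sequence. If we can segment that sequence in a short time, we can analyze it quickly and, beyond that, find useful drugs or vaccines to treat patients.
In this thesis, we use a simple recurrent network (SRN) to predict the next nucleotide in a DNA sequence and use those predictions to segment the sequence. We run experiments on the SARS-CoV, HCoV-EMC/2012, Ebola virus, and HIV genomes. The resulting segments are consistent with the segments identified by biologists. We also analyze mutual information using Hinton diagrams. We find that an SRN can learn the structure of a gene sequence and detect the boundaries of coding regions from its prediction errors. This shows that an SRN can provide features that are useful for future biological research. [zh_TW]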
dc.description.abstract: DNA encodes the genetic instructions used in the development and functioning of all known living organisms and many viruses. According to the central dogma that forms the backbone of molecular biology, segments of DNA play an important role in making proteins.
When a virus is spreading rapidly worldwide, the first information we can obtain is its DNA sequence. If we can segment the DNA sequence in a very short time, we can analyze it quickly and, moreover, find useful drugs and vaccines for patients in a very short time.
In this thesis, we use a simple recurrent network (SRN) to predict the next nucleotide of a DNA sequence in order to perform the segmentation task. The SARS-CoV, HCoV-EMC/2012, Ebola virus, and HIV genomes are used in our experiments. The results are consistent with the findings of biologists. We also analyze the mutual information with Hinton diagrams. The results show that an SRN can learn the genome structure and detect the boundaries of coding regions from its prediction errors. This implies that an SRN is capable of providing features for further biological studies. [en]
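The abstract's core idea, predicting the next nucleotide with an SRN and reading segment boundaries off the prediction errors, can be illustrated with a small sketch. This is not the thesis's implementation: the record does not give the network size, training procedure, or boundary criterion, so the class name `ElmanSRN`, the hidden-layer size, the learning rate, the one-step backpropagation, the moving-average window, and the mean-plus-`k`-standard-deviations threshold are all assumptions chosen for illustration.

```python
import math
import random
import statistics

ALPHABET = "ACGT"


def one_hot(base):
    """Encode one nucleotide as a 4-element indicator vector."""
    return [1.0 if base == b else 0.0 for b in ALPHABET]


class ElmanSRN:
    """Elman-style SRN: hidden activations are fed back as context input."""

    def __init__(self, hidden=8, lr=0.1, seed=0):  # sizes are assumptions
        rng = random.Random(seed)
        self.hidden, self.lr = hidden, lr
        n_in = len(ALPHABET) + hidden  # current base plus context units
        self.w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                    for _ in range(hidden)]
        self.w_o = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)]
                    for _ in range(len(ALPHABET))]
        self.context = [0.0] * hidden

    def step(self, base, target=None):
        """Return P(next base); if target is given, take one gradient step."""
        x = one_hot(base) + self.context
        h = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in self.w_h]
        z = [sum(w * v for w, v in zip(row, h)) for row in self.w_o]
        top = max(z)
        e = [math.exp(v - top) for v in z]
        total = sum(e)
        p = [v / total for v in e]
        if target is not None:
            # Softmax with cross-entropy: output delta is p - one_hot(target).
            d_out = [pi - ti for pi, ti in zip(p, one_hot(target))]
            # One-step backpropagation; the context is treated as fixed input.
            d_hid = [(1.0 - h[j] ** 2)
                     * sum(d_out[k] * self.w_o[k][j] for k in range(len(p)))
                     for j in range(self.hidden)]
            for k, row in enumerate(self.w_o):
                for j in range(self.hidden):
                    row[j] -= self.lr * d_out[k] * h[j]
            for j, row in enumerate(self.w_h):
                for i in range(len(x)):
                    row[i] -= self.lr * d_hid[j] * x[i]
        self.context = h
        return p


def prediction_errors(seq, epochs=20, hidden=8, lr=0.1, seed=0):
    """Train on seq, then return -log P(actual next base) per position."""
    net = ElmanSRN(hidden, lr, seed)
    for _ in range(epochs):
        net.context = [0.0] * hidden
        for cur, nxt in zip(seq, seq[1:]):
            net.step(cur, nxt)
    net.context = [0.0] * hidden
    errors = []
    for cur, nxt in zip(seq, seq[1:]):
        p = net.step(cur)
        errors.append(-math.log(max(p[ALPHABET.index(nxt)], 1e-12)))
    return errors


def candidate_boundaries(errors, window=10, k=2.0):
    """Flag window starts whose smoothed error is unusually high (assumed rule)."""
    smoothed = [sum(errors[i:i + window]) / window
                for i in range(len(errors) - window + 1)]
    if len(smoothed) < 2:
        return []
    threshold = statistics.mean(smoothed) + k * statistics.pstdev(smoothed)
    return [i for i, v in enumerate(smoothed) if v > threshold]
```

For a genome string `seq` over `ACGT`, `candidate_boundaries(prediction_errors(seq))` would return the window positions where the network is most surprised. Whether such peaks line up with coding-region boundaries is the empirical claim of the thesis, which this sketch does not reproduce.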
dc.description.provenance: Made available in DSpace on 2021-06-16T16:10:00Z (GMT). No. of bitstreams: 1
ntu-102-R99945051-1.pdf: 1656927 bytes, checksum: 08c3fe54893e3f2b5c5fba8c5cf7bde6 (MD5)
Previous issue date: 2013 [en]
dc.description.tableofcontents: Acknowledgements
Chinese Abstract
Abstract
1 Introduction
2 Main Method
2.1 Using SRN to predict the next nucleotide of a DNA sequence
3 Analysis
3.1 SARS
3.2 HCoV-EMC/2012
3.3 Ebola virus
3.4 HIV
4 Conclusion
Bibliography
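dc.language.iso: en
dc.subject: Segmentation [zh_TW]
dc.subject: Deoxyribonucleic acid [zh_TW]
dc.subject: Neural network [zh_TW]
dc.subject: Nucleotide [zh_TW]
dc.subject: Mutual information [zh_TW]
dc.subject: Segmentation [en]
dc.subject: DNA [en]
dc.subject: Simple recurrent network [en]
dc.subject: Nucleotide [en]
dc.subject: Mutual Information [en]
dc.title: A Study on the Segmentation of Gene Sequences [zh_TW]
dc.title: Segmentation of DNA sequence [en]
dc.type: Thesis
dc.date.schoolyear: 101-2
dc.description.degree: Master's
dc.contributor.oralexamcommittee: 鄭為正, 黃昭綺
dc.subject.keyword: Segmentation, Deoxyribonucleic acid, Neural network, Nucleotide, Mutual information [zh_TW]
dc.subject.keyword: Segmentation, DNA, Simple recurrent network, Nucleotide, Mutual Information [en]
dc.relation.page: 29
dc.rights.note: Paid license
dc.date.accepted: 2013-03-15
dc.contributor.author-college: College of Electrical Engineering and Computer Science [zh_TW]
dc.contributor.author-dept: Graduate Institute of Biomedical Electronics and Bioinformatics [zh_TW]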
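Appears in collections: Graduate Institute of Biomedical Electronics and Bioinformatics

Files in this item:
File: ntu-102-1.pdf (not authorized for public access)
Size: 1.62 MB
Format: Adobe PDF

Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.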
