  1. NTU Theses and Dissertations Repository
  2. College of Engineering
  3. Graduate Institute of Biomedical Engineering
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/28469
Title: Biological Terms Recognition: Using Hidden Markov Models (生醫詞彙辨識:利用隱藏式馬可夫模型)
Authors: Chih-Wei Chen (陳志偉)
Advisor: Jau-Min Wong (翁昭旼)
Keyword: Hidden Markov Models, Machine Learning, Biomedical Literature Mining, Biomedical Term Extraction, Biomedical Named Entity Recognition, Text Mining
Publication Year: 2007
Degree: Master's
Abstract: As biomedical science advances, text mining in the biomedical domain is becoming increasingly important. Biomedical terms in the literature involve compound words, synonyms, acronyms, idiomatic usages, and even newly introduced naming conventions, so the same term is not necessarily written consistently across documents. This makes automated integration of biomedical data difficult. The first step, and the one with the greatest effect on system performance, is correctly identifying biomedical terms in text, that is, Biomedical Named Entity Recognition (Biomedical NER), which is a fundamental requirement for information extraction.
In this thesis we propose a biological term extractor based on Hidden Markov Models that analyzes the abstracts of articles in order to find the biomedical terms they contain. The method has four steps. First, the tokens in the training data are clustered according to five features of biomedical terms. Second, a Hidden Markov Model is trained on the clustered tokens. Third, the user's input text is read and normalized, and its tokens are clustered by the same features. Finally, the tokens that the system judges to be biomedical terms are annotated according to the machine learning algorithm.
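The four steps described in the abstract can be illustrated with a minimal sketch. The abstract does not list the five features or name the decoding algorithm, so everything specific here is an assumption made for illustration: the orthographic classes in `token_class`, the two tags `TERM` and `O`, add-one smoothing, Viterbi decoding, and the toy training sentences. It is not the thesis's implementation.

```python
import math

TAGS = ("TERM", "O")  # hypothetical labels: biomedical term vs. other
CLASSES = ("ALNUM", "CAPS", "HYPHEN", "INITCAP", "LOWER")  # assumed features

def token_class(tok):
    """Steps 1 and 3: map a token to a coarse orthographic class."""
    if any(c.isdigit() for c in tok) and any(c.isalpha() for c in tok):
        return "ALNUM"      # letters mixed with digits, e.g. p53
    if tok.isupper() and len(tok) > 1:
        return "CAPS"       # all capitals, e.g. DNA
    if "-" in tok:
        return "HYPHEN"     # hyphenated compound, e.g. NF-kappaB
    if tok[:1].isupper():
        return "INITCAP"
    return "LOWER"

def train(sentences):
    """Step 2: estimate log-probabilities from tagged sentences (add-one smoothing)."""
    start = {t: 1.0 for t in TAGS}
    trans = {a: {b: 1.0 for b in TAGS} for a in TAGS}
    emit = {t: {c: 1.0 for c in CLASSES} for t in TAGS}
    for sent in sentences:
        prev = None
        for tok, tag in sent:
            emit[tag][token_class(tok)] += 1
            if prev is None:
                start[tag] += 1
            else:
                trans[prev][tag] += 1
            prev = tag

    def norm(counts):
        total = sum(counts.values())
        return {k: math.log(v / total) for k, v in counts.items()}

    return (norm(start),
            {a: norm(trans[a]) for a in TAGS},
            {t: norm(emit[t]) for t in TAGS})

def viterbi(tokens, model):
    """Step 4: return the most likely tag sequence for the tokens."""
    start, trans, emit = model
    obs = [token_class(t) for t in tokens]
    if not obs:
        return []
    score = [{t: start[t] + emit[t][obs[0]] for t in TAGS}]
    back = []
    for o in obs[1:]:
        row, ptr = {}, {}
        for t in TAGS:
            best = max(TAGS, key=lambda p: score[-1][p] + trans[p][t])
            row[t] = score[-1][best] + trans[best][t] + emit[t][o]
            ptr[t] = best
        score.append(row)
        back.append(ptr)
    path = [max(TAGS, key=lambda t: score[-1][t])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

# Toy, hand-labelled training data (invented for this sketch).
train_data = [
    [("The", "O"), ("p53", "TERM"), ("protein", "O"), ("binds", "O"), ("DNA", "TERM")],
    [("IL-2", "TERM"), ("activates", "O"), ("NF-kappaB", "TERM"), ("in", "O"), ("cells", "O")],
]
model = train(train_data)
tokens = ["the", "BRCA1", "gene", "regulates", "repair"]
tags = viterbi(tokens, model)
print(list(zip(tokens, tags)))
```

Because the model emits token classes rather than the words themselves, it can label a token such as `BRCA1` that never appeared in the training data, which is presumably the motivation for clustering tokens by features before training.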
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/28469
Fulltext Rights: Paid license (有償授權)
Appears in Collections: Graduate Institute of Biomedical Engineering

Files in This Item:
File: ntu-96-1.pdf (Restricted Access)
Size: 792.27 kB
Format: Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
