Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/40886
Title: | 醫學文獻優化查詢 Medical Literatures Query-Tuning |
Author: | Shun-Hsiang Yang 楊舜翔 |
Advisor: | 翁昭旼 (Jau-Min Wong) |
Co-advisor: | 蔣以仁 (I-Jen Chiang) |
Keywords: | Search Engine, Information Retrieval, Data Mining, PICO |
Year of publication: | 2011 |
Degree: | Master's |
Abstract: | The volume of medical literature has grown exponentially with the spread of computers and the Internet. The number of records in MEDLINE/PubMed rose from 6.76 million at the end of 1990 to more than 20 million by 2011. For busy medical professionals, finding suitable literature in a database of this size is a heavy burden.
A search engine is necessary to support users in this work. However, PubMed's default search strategy does not return relevant results effectively: inexperienced users must go through repeated trial and error to find matching articles, and there is no good ranking mechanism. This study implements a system in which users can search not only by keyword but also by entering a sentence or even a whole article. Through interaction between the user and the system interface, queries are refined quickly, and the retrieved articles are ranked by relevance. The system also applies a PICO classifier to the abstracts of medical articles in real time to classify their sentences, helping users locate target sentences faster while reading.
Finally, for the experimental results and discussion, sentences that explicitly describe Patients, Intervention, or Outcome were collected from PubMed as experimental material and divided into training data and test data. Three classification models were built from the training data with the NLTK Bayesian classifier, the test data were classified with the PICO classification algorithm, and the performance of the PICO classification system was evaluated by 10-fold cross validation. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/40886 |
Full-text license: | Paid license |
Appears in department: | Institute of Biomedical Engineering (醫學工程學研究所) |
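
The evaluation setup summarized in the abstract (one Bayesian model per PICO element, scored by 10-fold cross validation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the thesis used the NLTK Bayesian classifier, whereas this sketch substitutes a small hand-written Naive Bayes built on the Python standard library so that it runs without dependencies. The whitespace tokenizer, the rule of picking the element whose model gives the largest margin, and the toy sentences are likewise hypothetical and are not taken from the thesis.

```python
import math
import random
from collections import Counter

LABELS = ("P", "I", "O")  # Patients, Intervention, Outcome


def tokenize(sentence):
    """Lowercase whitespace tokens (assumed; the thesis's features are not stated)."""
    return sentence.lower().split()


class BinaryNaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing for a yes/no decision."""

    def __init__(self, samples):
        # samples: list of (tokens, is_positive) pairs
        self.word_counts = {True: Counter(), False: Counter()}
        self.class_counts = Counter()
        for tokens, is_positive in samples:
            self.class_counts[is_positive] += 1
            self.word_counts[is_positive].update(tokens)
        self.vocab = set(self.word_counts[True]) | set(self.word_counts[False])

    def log_score(self, tokens, is_positive):
        total = sum(self.class_counts.values())
        # Smoothed log prior plus smoothed log likelihood of each token.
        score = math.log((self.class_counts[is_positive] + 1) / (total + 2))
        counts = self.word_counts[is_positive]
        denom = sum(counts.values()) + len(self.vocab) + 1
        for tok in tokens:
            score += math.log((counts[tok] + 1) / denom)
        return score

    def positive_margin(self, tokens):
        """Log-odds of the positive class over the negative class."""
        return self.log_score(tokens, True) - self.log_score(tokens, False)


def train_pico(data):
    """Train three binary models, one per PICO element, from (sentence, label) pairs."""
    return {
        label: BinaryNaiveBayes([(tokenize(s), y == label) for s, y in data])
        for label in LABELS
    }


def classify(models, sentence):
    """Assumed combination rule: pick the element whose model is most confident."""
    tokens = tokenize(sentence)
    return max(LABELS, key=lambda label: models[label].positive_margin(tokens))


def cross_validate(data, k=10, seed=0):
    """Mean accuracy over k folds; requires at least k samples."""
    data = list(data)
    random.Random(seed).shuffle(data)
    folds = [data[i::k] for i in range(k)]
    accuracies = []
    for i, test_fold in enumerate(folds):
        train = [item for j, fold in enumerate(folds) if j != i for item in fold]
        models = train_pico(train)
        correct = sum(classify(models, s) == y for s, y in test_fold)
        accuracies.append(correct / len(test_fold))
    return sum(accuracies) / k


# Invented toy sentences, NOT data from the thesis.
TOY = (
    [("patients aged %d with diabetes were enrolled" % n, "P") for n in range(40, 50)]
    + [("participants received drug dose %d mg daily" % n, "I") for n in range(10, 20)]
    + [("mortality decreased by %d percent at follow-up" % n, "O") for n in range(10, 20)]
)

if __name__ == "__main__":
    print("10-fold mean accuracy:", cross_validate(TOY, k=10))
```

With NLTK installed, `BinaryNaiveBayes` could presumably be replaced by `nltk.NaiveBayesClassifier.train`, which takes a list of `(feature_dict, label)` pairs. The accuracy on the toy sentences says nothing about the performance reported in the thesis.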
Files in this item:
File | Size | Format | |
---|---|---|---|
ntu-100-1.pdf (not currently authorized for public access) | 3.46 MB | Adobe PDF |
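Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.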