Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/52059
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 曹承礎 | -
dc.contributor.author | Chin-Ho Lin | en
dc.contributor.author | 林慶和 | zh_TW
dc.date.accessioned | 2021-06-15T14:05:58Z | -
dc.date.available | 2023-12-31 | -
dc.date.copyright | 2015-08-21 | -
dc.date.issued | 2015 | -
dc.date.submitted | 2015-08-20 | -
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/52059 | -
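dc.description.abstract | Background: Global population aging and changing social patterns are making population health problems and health care spending increasingly burdensome. To get ahead of these problems, policy makers around the world are promoting the digitization of medical data to help achieve five goals of the health care system: (1) improve the quality, safety, and efficiency of care; (2) focus on the care that patients need; (3) improve the coordination of care; (4) improve population health; and (5) ensure privacy and security. However, achieving "meaningful use" of these electronic medical data, broadening their uses, and creating more benefit from them is not easy, and many difficult research issues remain. Medical data is currently scattered across different sectors, is hard to bring together, and is rarely linked for joint analysis. Moreover, medical records accumulated over many years have grown into big data. This has a major impact on existing plans and research, raises new research issues in the integration, processing, and analysis of big medical data and in health care promotion, and gives rise to value-added innovative applications and business opportunities.
Objectives: The infrastructure for big data analysis in the biomedical field still lags seriously behind the trend, and researchers still spend a great deal of time building and organizing their data, interpreting it, and finding questions in it. To advance biomedical big data analysis, this study proposes a complete method spanning data storage through analytical processing and, based on that method, validates two novel big data applications proposed in this study: (1) rapid checking of reported medical events, such as adverse drug reaction reports; and (2) timely monitoring and tracking of temporal medical events, such as surveillance of newly marketed drugs. To meet this objective, the proposed method must offer: (1) timeliness, returning results quickly; (2) efficiency, achieved at low cost; (3) scalability, allowing computing power and storage capacity to be scaled horizontally; (4) ease of computation, so that checking and tracking indicators are easy to calculate; and (5) applicability.
Methods: Unlike epidemiological research methods, the tracking and analysis of temporal medical events usually cannot fix the research question in advance. This study proposes a new model that provides a mechanism for timely tracking and monitoring of medical events and for revealing related information. The model has four parts: (1) source data, namely existing electronic medical data; (2) data management, comprising the big medical data storage model (PDMdoc), the temporal medical event model (TMEdoc), and the sharding strategy and management of the cluster; (3) processing and computation, comprising the operating procedures of the sharded cluster, the MapReduce cloud-computing method for big data processing, and an integrated method for tracking and analyzing temporal events; and (4) tracking indicators, comprising the indicator items and the indicator values recorded for every patient in whom the event occurred. Whether the model can deliver timely monitoring and tracking depends chiefly on the performance of data management and of processing and computation.
Results: Complexity of the proposed method: (1) the horizontal scalability and degree of parallelism of the cluster are 1, meaning that each shard node added to the cluster increases both computing power and storage capacity by one shard unit, regardless of the number of nodes in the cluster; (2) network I/O depends only on the size of the query result and is likewise independent of the number of nodes; (3) for search and disk I/O, the average search times for PDMdoc and TMEdoc are $O(1)$ and $O(\log_d(S_{\mathrm{TMEdoc}}/B))$, respectively, and the average disk I/O costs (seek time, rotational latency, transfer time) are $(O(1),\ O(1),\ O(E_{\mathrm{PDMdoc}}))$ and $(O(\log_d(S_{\mathrm{TMEdoc}}/B)),\ O(1),\ O(E_{\mathrm{TMEdoc}} \times L_{\mathrm{TMEdoc}}))$. Experiments: (1) data: taken from the Longitudinal Health Insurance Database (LHID2010) of the National Health Insurance Research Database, covering the health insurance records of one million people from 1996 to 2010; (2) test system: a sharded cluster (3 shard nodes) built with MongoDB on 5 PCs; (3) results: (a) performance test: searching for patients in 8 disease groups took 0.607 to 63.248 seconds on a single-server system and 0.336 to 29.484 seconds on the sharded cluster, an average performance ratio of 1 : 2.024; (b) checking a reported adverse drug reaction event: using the Januvia drug safety information issued by the US FDA in September 2009 as the reported case, the result (odds ratio = 1.626) indicates that the event is also significant in Taiwan; (c) surveillance of newly marketed drugs: the system can process more than 140,000 TMEs per second, so an estimated several thousand to tens of thousands of drugs can be monitored per day. | zh_TW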
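The adverse drug reaction check above is summarized by a single odds ratio (1.626). The abstract does not give the underlying patient counts, so the following is only a minimal sketch of how such a figure is conventionally computed from a 2x2 exposure/outcome table. The function name, the 95% interval based on Woolf's log method, and every count are assumptions made for illustration; none of them come from the thesis.

```python
import math


def odds_ratio(exposed_cases, exposed_noncases,
               unexposed_cases, unexposed_noncases, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for a 2x2 table.

    The interval uses Woolf's log method; z=1.96 gives roughly 95% coverage.
    """
    a, b, c, d = (exposed_cases, exposed_noncases,
                  unexposed_cases, unexposed_noncases)
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this estimate")
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) under Woolf's approximation.
    std_err = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * std_err)
    upper = math.exp(math.log(estimate) + z * std_err)
    return estimate, lower, upper


# Hypothetical counts, NOT the thesis data: 20 of 100 exposed patients and
# 10 of 100 unexposed patients experienced the event of interest.
estimate, lower, upper = odds_ratio(20, 80, 10, 90)
print(f"OR = {estimate:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

An interval that contains 1, as it does for these made-up counts, would mean the association cannot be distinguished from no association at that confidence level.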
dc.description.abstract | Background: The global aging trend, combined with societal change, is creating population health problems and driving up health care spending. To get ahead of these problems, policy makers around the world have been promoting electronic medical data to help achieve five major goals of the health care system: 1) improving health care quality, safety, and efficiency, 2) committing to patients' health needs, 3) improving health care coordination, 4) improving the health of the population, and 5) ensuring privacy and security. However, putting these data to "meaningful use", expanding their usage, and creating more value from them is not an easy task, and many difficult research issues have to be overcome. Medical data is currently scattered across different sectors, is difficult to collect, and is rarely cross-linked for analysis. Furthermore, medical records accumulated over many years have grown into big data. This not only significantly affects existing plans and research, but also raises new research issues in integrating, processing, and analyzing big medical data and creates value-added innovative applications and business opportunities.
Objectives: Because the infrastructure for big data analysis in the biomedical field still lags seriously behind the current trend, researchers have to spend considerable time constructing and organizing their data, interpreting it, and identifying issues in it. To advance biomedical big data analysis, this study proposes a set of methods ranging from data storage to data analysis. Based on this set of methods, two novel big data applications were verified: 1) prompt checking of reported medical incidents, such as reported adverse drug reactions, and 2) timely monitoring and tracking of temporal medical events, such as surveillance of newly marketed drugs. To achieve these objectives, the methods must provide: 1) timeliness, returning results quickly, 2) efficiency, achieved at low cost, 3) scalability, allowing horizontal expansion of computing power and storage capacity, 4) ease of calculation, making it convenient to check events and compute tracking indicators, and 5) applicability.
Methods: Unlike in epidemiological research methods, the questions to be studied in the tracking and analysis of temporal medical events usually cannot be defined in advance. This study proposes a new model that provides a mechanism for timely tracking and monitoring of medical events and for uncovering relevant information. The model contains four parts: 1) source data, namely existing electronic medical data, 2) data management, including the big data storage model PDMdoc, the temporal medical event model TMEdoc, and the sharding strategy and management of the sharded cluster, 3) processing and computing, including sharded cluster operating procedures, the MapReduce cloud-computing method for big data processing, and an integrated temporal event tracking and analysis method, and 4) tracking indicators, comprising the indicator items and the indicator values recorded for every patient in whom the event occurred. Because the indicators belong to the practical application level, whether the model can achieve timely monitoring and tracking depends mainly on the efficiency of data management and of processing and computing.
Results: Complexity of the proposed methods: 1) the horizontal scaling and degree of parallelism of the sharded cluster are 1 unit; that is, every shard added to the cluster increases both computing power and storage capacity by 1 unit, regardless of the number of cluster nodes, 2) network I/O depends only on the amount of data in the query results and is independent of the number of cluster nodes, 3) for search and disk I/O, the average search times for PDMdoc and TMEdoc are $O(1)$ and $O(\log_d(S_{\mathrm{TMEdoc}}/B))$, respectively, and the average disk I/O costs for seek time, rotational delay, and transfer time are $(O(1),\ O(1),\ O(E_{\mathrm{PDMdoc}}))$ and $(O(\log_d(S_{\mathrm{TMEdoc}}/B)),\ O(1),\ O(E_{\mathrm{TMEdoc}} \times L_{\mathrm{TMEdoc}}))$, respectively. Experiments: 1) data: taken from the Taiwan NHIRD LHID2010 dataset, containing the health care data of one million people for the period 1996 to 2010, 2) test system: a sharded cluster with 3 shard nodes built on MongoDB and five PCs, 3) experimental results: a) benchmarks: searching for patients in 8 disease groups took 0.607 to 63.248 seconds on a single-server system and 0.336 to 29.484 seconds on the sharded cluster, an average performance ratio of 1 : 2.024, b) reported adverse drug reactions: taking the Januvia drug safety information published by the FDA in September 2009 as the reported case, the resulting odds ratio of 1.626 shows that this type of incident also occurs significantly in Taiwan, c) surveillance of newly marketed drugs: the system can process more than 140,000 TMEs per second, so the number of drugs that can be monitored per day is estimated at several thousand to tens of thousands. | en
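The abstract names MapReduce over a sharded cluster as the processing method but gives no code. As a rough illustration of the general pattern only, the sketch below runs a map step independently over each shard's records and then merges the emitted pairs in a reduce step. It is plain in-process Python, not MongoDB's MapReduce interface, and the record fields (`patient`, `event`), the sample data, and all function names are hypothetical rather than taken from the thesis.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical temporal-medical-event records spread across three shards.
shards = [
    [{"patient": "p1", "event": "drug:A"}, {"patient": "p2", "event": "drug:B"}],
    [{"patient": "p3", "event": "drug:A"}, {"patient": "p3", "event": "drug:A"}],
    [{"patient": "p4", "event": "drug:B"}, {"patient": "p5", "event": "drug:C"}],
]


def map_phase(records):
    """Emit one (event, patient) pair per record; needs only one shard's data."""
    for record in records:
        yield record["event"], record["patient"]


def reduce_phase(pairs):
    """Group pairs by event and count the distinct patients seen for each."""
    patients_by_event = defaultdict(set)
    for event, patient in pairs:
        patients_by_event[event].add(patient)
    return {event: len(patients) for event, patients in patients_by_event.items()}


def count_patients_per_event(all_shards):
    # Each shard is mapped separately; only the emitted pairs are merged.
    mapped = chain.from_iterable(map_phase(shard) for shard in all_shards)
    return reduce_phase(mapped)


counts = count_patients_per_event(shards)
print(counts)
```

Because the map step touches only one shard at a time, it could in principle run on every shard in parallel, with only the emitted pairs crossing the network. That is consistent with the abstract's claim that network I/O depends on the size of the result rather than on the number of nodes.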
dc.description.provenance | Made available in DSpace on 2021-06-15T14:05:58Z (GMT). No. of bitstreams: 1. ntu-104-D95725005-1.pdf: 2806749 bytes, checksum: ad1591551fc55923744034f1f86aec7f (MD5). Previous issue date: 2015 | en
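dc.description.tableofcontents | Table of Contents
Acknowledgements
Chinese Abstract
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation
1.3 Research Objectives
Chapter 2 Related Work
2.1 Electronic Medical Record Research Databases
2.1.1 Data Requirements
2.1.2 Common Databases
2.2 National Health Insurance Research Database
2.2.1 NHIRD Service Model
2.2.2 NHIRD Data Contents
2.2.3 NHIRD-Related Research
2.3 NoSQL Databases
2.3.1 ACID vs. BASE
2.3.2 NoSQL Categories
2.3.3 NoSQL Performance
Chapter 3 Methods
3.1 Overview of the Method
3.2 Big Medical Data Storage Model
3.2.1 Design Concepts of the Medical Record Data Model
3.2.2 Personal Medical Record Data Model
3.2.3 Data Restructuring and Loading
3.2.4 Basic Operations on the Data Model
3.3 Temporal Medical Events
3.3.1 Concept
3.3.2 Definitions
3.3.3 Mapping between Medical Record Data and Temporal Medical Events
3.3.4 Constructing the Temporal Medical Event Dataset
3.3.5 Data Operations
3.4 Sharding
3.4.1 Sharded Clusters
3.4.2 Sharding Strategies
3.4.3 Sharding Approach for PDM and TME
3.5 MapReduce Data Processing
3.5.1 MapReduce Data Flow
3.5.2 Interaction between MapReduce and the Sharded Cluster
3.6 Temporal Medical Event Tracking and Analysis Method
Chapter 4 Complexity
4.1 Storage Space
4.2 Search and Disk I/O
4.3 Sharded Cluster and Network I/O
Chapter 5 Results and Discussion
5.1 Experimental System
5.2 Experiment 1: TME Search Performance
5.3 Experiment 2: Checking Reported Adverse Drug Reaction Events
5.4 Experiment 3: Testing Surveillance of Newly Marketed Drugs
Chapter 6 Limitations and Future Work
6.1 Limitations
6.2 Future Work
Chapter 7 Conclusion
References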
dc.language.iso | zh-TW | -
dc.subject | MapReduce | zh_TW
dc.subject | 時序性醫療事件 (temporal medical event) | zh_TW
dc.subject | 醫療資料 (medical data) | zh_TW
dc.subject | 藥物不良反應 (adverse drug reaction) | zh_TW
dc.subject | 巨量資料 (big data) | zh_TW
dc.subject | NoSQL | zh_TW
dc.subject | 分片叢集 (sharded cluster) | zh_TW
dc.subject | medical data | en
dc.subject | big data | en
dc.subject | drug reaction | en
dc.subject | temporal medical event | en
dc.subject | MapReduce | en
dc.subject | sharded cluster | en
dc.subject | NoSQL | en
dc.title | 巨量醫療資料之時序事件追蹤與分析 | zh_TW
dc.title | Temporal Event Tracing on Big Medical Data Analytics | en
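dc.type | Thesis | -
dc.date.schoolyear | 103-2 | -
dc.description.degree | Doctoral | -
dc.contributor.oralexamcommittee | 盧信銘, 歐陽彥正, 陳彥良, 吳齊殷 | -
dc.subject.keyword | 時序性醫療事件 (temporal medical event), 醫療資料 (medical data), 藥物不良反應 (adverse drug reaction), 巨量資料 (big data), NoSQL, 分片叢集 (sharded cluster), MapReduce | zh_TW
dc.subject.keyword | temporal medical event, medical data, drug reaction, big data, NoSQL, sharded cluster, MapReduce | en
dc.relation.page | 75 | -
dc.rights.note | Paid license | -
dc.date.accepted | 2015-08-20 | -
dc.contributor.author-college | College of Management | zh_TW
dc.contributor.author-dept | Graduate Institute of Information Management | zh_TW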
Appears in collections: Department of Information Management

Files in this item:
File | Size | Format
ntu-104-1.pdf (not authorized for public access) | 2.74 MB | Adobe PDF


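Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.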
