Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/48953
Title: 巨量資料之病例對照研究平台
Matching Platform of Case Control Study on Big Data
Authors: Bo-Tao Pan
潘博韜
Advisor: 翁昭旼
Co-Advisor: 蔣以仁
Keyword: propensity score matching, big data, National Health Insurance Research Database (NHIRD), observational study, NoSQL database
Publication Year: 2016
Degree: Master's
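Abstract: The aim of this study is to design a workflow for generating the study population of an observational study, and to implement it as a system. Through four steps (data filtering, variable creation and processing, control-group matching, and preliminary statistical testing), users get an overview of the study population and obtain a prototype dataset for further research.
We use the Longitudinal Health Insurance Database 2010 (LHID2010), released by the National Health Research Institutes, as the study data. It contains a sample of one million people enrolled in Taiwan's National Health Insurance in 2010, together with all of their medical visit records from 1996 to 2010, organized per person. This data structure and the large data volume are well suited to the schema-free design and horizontal scalability of NoSQL databases.
We therefore built a MongoDB replica sharded cluster. Sharding improves query efficiency, and combined with the Map-Reduce method it supports more complex computation on the data, producing cleaned data for subsequent statistical analysis.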
The objective of this research is to build a platform that helps users obtain the study population for observational studies through a four-stage procedure: data querying, variable creation, control-group matching, and significance testing.
The data querying platform is used to identify the study population. It provides a simple interface for querying the NoSQL database and automatically draws a flow chart of the query results, helping users keep track of the query process.
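As a rough illustration of the query stage, the sketch below applies inclusion criteria one at a time to in-memory patient records and records how many patients remain after each step, which is the kind of information a query flow chart would display. The record fields (`age`, `diagnoses`), the placeholder diagnosis codes, and the helper `run_query_flow` are hypothetical; the actual platform queries MongoDB, which is not shown here.

```python
from typing import Callable, Dict, List, Tuple

Patient = Dict[str, object]
Criterion = Tuple[str, Callable[[Patient], bool]]


def run_query_flow(patients: List[Patient], criteria: List[Criterion]):
    """Apply each criterion in order; return survivors and per-step counts."""
    flow = [("all patients", len(patients))]
    remaining = patients
    for label, keep in criteria:
        remaining = [p for p in remaining if keep(p)]
        flow.append((label, len(remaining)))
    return remaining, flow


# Hypothetical toy records, not taken from LHID2010.
patients = [
    {"id": 1, "age": 45, "diagnoses": ["250"]},
    {"id": 2, "age": 17, "diagnoses": ["250"]},
    {"id": 3, "age": 60, "diagnoses": ["401"]},
    {"id": 4, "age": 52, "diagnoses": ["250", "401"]},
]
criteria = [
    ("age >= 18", lambda p: p["age"] >= 18),
    ("has diagnosis code 250", lambda p: "250" in p["diagnoses"]),
]

selected, flow = run_query_flow(patients, criteria)
for label, count in flow:
    print(f"{label}: {count}")
```

Keeping the per-step counts alongside the final selection makes it possible to report how many patients each criterion excluded.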
The variable creation platform lets users extract detailed attributes of the patients selected in the previous stage, such as disease diagnoses or medication records. This is done by a MongoDB Map-Reduce process, and the result is exported to a CSV file for the next stage.
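The map and reduce steps can be sketched in plain Python without a database. This is only an illustration of the pattern, not the thesis's MongoDB job: the visit fields (`patient_id`, `diagnosis`, `drug`), the derived variables, and the sample records are all assumptions.

```python
import csv
import io
from collections import defaultdict

# Hypothetical visit records; field names are assumptions for illustration.
visits = [
    {"patient_id": "A", "diagnosis": "250", "drug": "metformin"},
    {"patient_id": "A", "diagnosis": "401", "drug": None},
    {"patient_id": "B", "diagnosis": "401", "drug": "amlodipine"},
    {"patient_id": "A", "diagnosis": "250", "drug": "metformin"},
]


def map_visit(visit):
    """Map step: emit (patient_id, partial variables) for one visit."""
    yield visit["patient_id"], {
        "n_visits": 1,
        "has_250": int(visit["diagnosis"] == "250"),
        "n_drug_records": int(visit["drug"] is not None),
    }


def reduce_values(values):
    """Reduce step: merge the partial variables emitted for one patient."""
    return {
        "n_visits": sum(v["n_visits"] for v in values),
        "has_250": max(v["has_250"] for v in values),
        "n_drug_records": sum(v["n_drug_records"] for v in values),
    }


def map_reduce(records):
    """Group mapped values by key, then reduce each group."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in map_visit(record):
            grouped[key].append(value)
    return {key: reduce_values(vals) for key, vals in sorted(grouped.items())}


def to_csv(variables):
    """Write one row per patient, as input for the matching stage."""
    buffer = io.StringIO()
    fields = ["patient_id", "n_visits", "has_250", "n_drug_records"]
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for patient_id, row in variables.items():
        writer.writerow({"patient_id": patient_id, **row})
    return buffer.getvalue()


variables = map_reduce(visits)
print(to_csv(variables))
```

Because the reduce step only sums and takes maxima, partial results from different shards could be merged in any order, which is what makes this shape of computation suitable for a sharded cluster.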
The control-group matching platform reads the patients and variables from the previous stage and performs propensity score matching. Users choose the treatment (or exposure), the outcome, and other covariates from the variables; the platform then fits a generalized linear model and matches the control group on the fitted values.
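A minimal sketch of propensity score matching follows. The abstract only says that a generalized linear model is fitted and controls are matched on the fitted values; the choice of logistic regression, the gradient-ascent fitting routine, the greedy 1:1 nearest-neighbour rule without replacement, and the toy covariates are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def linear(w, xi):
    """Linear predictor: intercept w[0] plus weighted covariates."""
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression (a GLM with logit link) by gradient ascent."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = yi - sigmoid(linear(w, xi))
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w


def match_controls(scores, treated):
    """Greedy 1:1 nearest-neighbour matching without replacement."""
    cases = [i for i, t in enumerate(treated) if t == 1]
    pool = {i for i, t in enumerate(treated) if t == 0}
    pairs = []
    for c in cases:
        if not pool:
            break
        # Closest propensity score wins; ties go to the lower index.
        best = min(pool, key=lambda i: (abs(scores[i] - scores[c]), i))
        pairs.append((c, best))
        pool.remove(best)
    return pairs


# Hypothetical covariates: [age scaled to 0..1, sex]; not real patient data.
X = [[0.2, 1], [0.9, 0], [0.8, 1], [0.3, 0], [0.5, 0],
     [0.1, 1], [0.6, 1], [0.4, 0], [0.3, 0]]
treated = [0, 1, 1, 0, 1, 0, 0, 0, 1]

weights = fit_logistic(X, treated)
scores = [sigmoid(linear(weights, xi)) for xi in X]
pairs = match_controls(scores, treated)
for case, control in pairs:
    print(f"case {case} (score {scores[case]:.3f}) "
          f"-> control {control} (score {scores[control]:.3f})")
```

Greedy matching depends on the order in which cases are processed; optimal matching or a caliper on the score difference are common alternatives that this sketch does not attempt.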
The significance testing platform runs a t-test on each variable and a chi-square goodness-of-fit test on each age group, comparing the case group and the control group to see whether there is any significant difference.
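The two test statistics can be computed with the standard library alone, as sketched below. The abstract does not say which t-test variant is used, so Welch's unequal-variance form is an assumption here, as are the toy ages and age-band counts. For the goodness-of-fit test, the sketch takes the case group's age distribution, scaled to the control group's size, as the expected counts; that reading is also an assumption.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t statistic and approximate degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def chi_square_gof(observed, expected):
    """Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical ages in the two groups after matching.
case_ages = [50, 52, 54, 56]
control_ages = [51, 53, 55, 57]
t_stat, df = welch_t(case_ages, control_ages)

# Hypothetical counts per age band (e.g. three bands).
case_counts = [10, 20, 30]
control_counts = [12, 18, 30]
scale = sum(control_counts) / sum(case_counts)
expected = [c * scale for c in case_counts]
chi2 = chi_square_gof(control_counts, expected)

print(f"t = {t_stat:.4f}, df = {df:.1f}")
print(f"chi-square = {chi2:.2f} with {len(case_counts) - 1} degrees of freedom")
```

Each statistic is then compared against the critical value of its distribution at the chosen significance level; after successful matching, one would hope to see no significant difference in the covariates between the two groups.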
The first two stages of the workflow can be separated from the last two, giving two independent parts: the first prepares data for an observational study, and the second performs the statistical analysis. Users may run their own analyses on the data we prepare, or load their own data into our matching platform.
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/48953
DOI: 10.6342/NTU201603499
Fulltext Rights: Paid license (有償授權)
Appears in Collections: 醫學工程學研究所 (Institute of Biomedical Engineering)

Files in This Item:
File: ntu-105-1.pdf (Restricted Access)
Size: 1.61 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
