Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92795
Title: Detecting Association of Categorical Traits with Genomic Data by Entropy-Based Methods (original title: 利用熵偵測基因體資料中類別型資料的關聯性)
Authors: Shang-Chieh Lin (林上傑)
Advisor: Yann-Rong Lin (林彥蓉)
Co-Advisor: Li-Yu Daisy Liu (劉力瑜)
Keywords: Entropy, Genome-wide association study
Publication Year: 2024
Degree: Master's
Abstract: Most agronomic traits are recorded either numerically or categorically. Because nominal data are discrete and non-linear, we introduce rescaled conditional entropy (RCE) to measure the dependency between a nominal trait and genetic variants. Our findings demonstrate that both the number and the frequency of phenotypic and genotypic levels affect RCE, even when the phenotype and the genetic variants are independent. Leveraging this property of RCE, we designed an algorithm to assess the significance of genetic variants. Simulation results indicated that the accuracy and number of detected quantitative trait loci (QTLs) increased as the population size and the heritability of the phenotype increased. When applying the RCE algorithm to the 3K Rice Genome database, we found that it could detect interactions between single nucleotide polymorphisms (SNPs), an advantage over Pearson's chi-squared test. Because RCE is sensitive to genotypic level frequency, we designed another algorithm based on the entropy of each genotypic level and applied it to the 3K rice genome (RG) population. The results of the entropy algorithm are visualized with heatmaps and data mechanics. Our analysis reveals that the genotypic levels detected by the entropy algorithm are typically associated with the subpopulation of each variety, and only partially with the phenotype. Moreover, within the same subpopulation, the pattern of genotypic levels varies. The entropy algorithm is also applied to the maximum root length (MRL) dynamics of 53 wheat varieties, a data set with a hierarchical structure among the MRL dynamics types. Pairwise comparisons between MRL dynamics types demonstrate that types separated by a greater distance can be clearly distinguished by the presence-absence pattern of genotypic levels.
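To make the idea of an entropy-based association measure concrete, the following is a minimal sketch and not the thesis's implementation. The abstract does not give the RCE formula, so the rescaling used here, H(phenotype | genotype) / H(phenotype), is an assumption. The permutation test is likewise a generic stand-in for the significance algorithm mentioned in the abstract, chosen because it accounts for the number and frequency of levels under independence. All function names are hypothetical.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of categorical labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def conditional_entropy(phenotype, genotype):
    """H(phenotype | genotype): phenotype entropy within each genotypic
    level, weighted by the frequency of that level."""
    n = len(phenotype)
    groups = {}
    for p, g in zip(phenotype, genotype):
        groups.setdefault(g, []).append(p)
    return sum(len(ps) / n * entropy(ps) for ps in groups.values())


def rescaled_conditional_entropy(phenotype, genotype):
    """ASSUMED rescaling: H(P | G) / H(P).
    0 means the genotype fully determines the phenotype;
    1 means the genotype removes no uncertainty about the phenotype."""
    h = entropy(phenotype)
    if h == 0:
        raise ValueError("phenotype has only one level")
    return conditional_entropy(phenotype, genotype) / h


def permutation_p_value(phenotype, genotype, n_perm=1000, seed=0):
    """Generic permutation test (an assumption, not the thesis's algorithm):
    shuffle the phenotype to break any association and count how often the
    shuffled RCE is at least as small as the observed one."""
    rng = random.Random(seed)
    observed = rescaled_conditional_entropy(phenotype, genotype)
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if rescaled_conditional_entropy(shuffled, genotype) <= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

Under this assumed definition, a SNP whose genotypic levels perfectly separate the phenotype classes gives a value of 0, and a SNP whose levels each contain the same phenotype mix as the whole population gives a value of 1. Shuffling keeps the level counts and frequencies fixed, so the null distribution reflects their effect on the measure.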
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92795
DOI: 10.6342/NTU202401263
Fulltext Rights: Not authorized
Appears in Collections: Department of Agronomy

Files in This Item:
File: ntu-112-2.pdf (Restricted Access)
Size: 12.8 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
