Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92795
Full metadata record
DC field: value [language]
dc.contributor.advisor: 林彥蓉 [zh_TW]
dc.contributor.advisor: Yann-Rong Lin [en]
dc.contributor.author: 林上傑 [zh_TW]
dc.contributor.author: Shang-Chieh Lin [en]
dc.date.accessioned: 2024-07-01T16:08:29Z
dc.date.available: 2024-07-02
dc.date.copyright: 2024-07-01
dc.date.issued: 2024
dc.date.submitted: 2024-06-24
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/92795
dc.description.abstract: Most agronomic traits are recorded either numerically or categorically. Because nominal data are non-linear and discrete, we introduce rescaled conditional entropy (RCE) to measure the dependency between a nominal trait and genetic variants. Our findings demonstrate that both the number and the frequencies of phenotypic and genotypic levels affect RCE, even when the phenotype and the genetic variants are independent. Leveraging this property of RCE, we designed an algorithm to assess the significance of genetic variants. Simulation results indicated that the accuracy and number of detected quantitative trait loci (QTLs) increased as population size and heritability increased. When the RCE algorithm was applied to the 3K Rice Genome database, it detected interactions between single nucleotide polymorphisms (SNPs), an advantage over Pearson's chi-squared test. Because RCE is sensitive to genotypic level frequencies, we designed a second algorithm based on the entropy of each genotypic level and applied it to the 3K rice genome (RG) population. The results of the entropy algorithm are visualized with heatmaps and data mechanics. Our analysis reveals that the genotypic levels detected by the entropy algorithm are typically associated with the subpopulation of each variety, and only partially with phenotype. Moreover, within the same subpopulation, the pattern of genotypic levels varies. The entropy algorithm is also applied to the maximum root length (MRL) dynamics of 53 wheat varieties, in which the MRL dynamics types have a hierarchical structure. Pairwise comparisons between MRL dynamics types show that types separated by a greater distance are more clearly distinguished by the presence-absence pattern of the detected genotypic levels. [en]
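The abstract names rescaled conditional entropy but this record does not give its formula. The sketch below is an illustration only: it assumes RCE is the conditional entropy of the phenotype given the genotype, divided by the marginal entropy of the phenotype, which may differ from the thesis's exact definition. The function names `entropy` and `rescaled_conditional_entropy` are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of categorical labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def rescaled_conditional_entropy(phenotype, genotype):
    """Assumed form of RCE: H(phenotype | genotype) / H(phenotype).

    Under this assumption, values near 0 mean the genotype determines the
    phenotype, and values near 1 mean it carries little information.
    """
    if len(phenotype) != len(genotype):
        raise ValueError("phenotype and genotype must have the same length")
    h_pheno = entropy(phenotype)
    if h_pheno == 0:
        raise ValueError("phenotype has a single level; the ratio is undefined")

    # Group phenotype labels by genotypic level.
    groups = {}
    for p, g in zip(phenotype, genotype):
        groups.setdefault(g, []).append(p)

    # Weighted average of the within-level phenotype entropies.
    n = len(phenotype)
    h_cond = sum(len(ps) / n * entropy(ps) for ps in groups.values())
    return h_cond / h_pheno


if __name__ == "__main__":
    geno = ["AA", "AA", "TT", "TT"]
    print(rescaled_conditional_entropy(["a", "a", "b", "b"], geno))
    print(rescaled_conditional_entropy(["a", "b", "a", "b"], geno))
```

In this toy data the first call describes a phenotype fully determined by the genotype and the second a phenotype that is balanced within every genotypic level, so the two calls sit at the two extremes of the ratio. With real samples the ratio also depends on how many levels there are and how frequent each one is, which is the sensitivity the abstract reports.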
dc.description.provenance: Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-07-01T16:08:29Z. No. of bitstreams: 0 [en]
dc.description.provenance: Made available in DSpace on 2024-07-01T16:08:29Z (GMT). No. of bitstreams: 0 [en]
dc.description.tableofcontents:
ABSTRACT iii
ACKNOWLEDGEMENTS iv
TABLE OF CONTENTS vi
LIST OF TABLES viii
LIST OF FIGURES ix
APPENDIX xi
I. INTRODUCTION 1
  1.1 Introduction 1
II. RCE ALGORITHM AND ENTROPY ALGORITHM 5
  2.1 RCE algorithm 5
  2.2 Entropy algorithm 10
  2.3 Structured response 19
III. SIMULATION STUDY ON RCE ALGORITHM 20
  3.1 Properties of RCE 20
  3.2 Simulation study on RIL population 28
IV. APPLICATION OF RCE AND ENTROPY ALGORITHMS IN 3K RG DATABASE 38
  4.1 Experimental setting 38
  4.2 Results 41
  4.3 Discussion 73
V. APPLICATION OF ENTROPY ALGORITHM IN WHEAT ROOT DYNAMICS DATA 76
  5.1 Maximum root length (MRL) dynamics data 76
  5.2 Results 77
  5.3 Discussion 86
VI. CONCLUSION 87
REFERENCE 88
APPENDIX 91
dc.language.iso: en
dc.title: 利用熵偵測基因體資料中類別型資料的關聯性 [zh_TW]
dc.title: Detecting Association of Categorical Traits with Genomic Data by Entropy-Based Methods [en]
dc.type: Thesis
dc.date.schoolyear: 112-2
dc.description.degree: 碩士 (Master's)
dc.contributor.coadvisor: 劉力瑜 [zh_TW]
dc.contributor.coadvisor: Li-Yu Daisy Liu [en]
dc.contributor.oralexamcommittee: 董致韡; 林亞平 [zh_TW]
dc.contributor.oralexamcommittee: Chih-Wei Tung; Ya-Ping Lin [en]
dc.subject.keyword: 熵, 全基因組關聯分析 [zh_TW]
dc.subject.keyword: Entropy, Genome-wide association study [en]
dc.relation.page: 106
dc.identifier.doi: 10.6342/NTU202401263
dc.rights.note: 未授權 (not authorized)
dc.date.accepted: 2024-06-24
dc.contributor.author-college: 生物資源暨農學院 (College of Bioresources and Agriculture)
dc.contributor.author-dept: 農藝學系 (Department of Agronomy)
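Appears in collections: 農藝學系 (Department of Agronomy)

Files in this item:
File: ntu-112-2.pdf (not currently authorized for public access)
Size: 12.8 MB
Format: Adobe PDF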