Identification of the best genotypes from a breeding population via genomic selection

陳思萍; Szu-Ping Chen

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/89354

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	董致韡	zh_TW
dc.contributor.advisor	Chih-Wei Tung	en
dc.contributor.author	陳思萍	zh_TW
dc.contributor.author	Szu-Ping Chen	en
dc.date.accessioned	2023-09-07T16:39:32Z	-
dc.date.available	2023-11-09	-
dc.date.copyright	2023-09-11	-
dc.date.issued	2023	-
dc.date.submitted	2023-08-07	-
dc.identifier.citation	Akdemir, D., & Isidro-Sánchez, J. (2019). Design of training populations for selective phenotyping in genomic prediction. Scientific Reports, 9(1), 1446. Akdemir, D., Sanchez, J. I., & Jannink, J. L. (2015). Optimization of genomic selection training populations with a genetic algorithm. Genetics Selection Evolution, 47, 1-10. Atkinson, A., Donev, A., & Tobias, R. (2007). Optimum experimental designs, with SAS (Vol. 34). OUP Oxford. Blondel, M., Onogi, A., Iwata, H., & Ueda, N. (2015). A ranking approach to genomic selection. PLoS ONE, 10(6), e0128570. Breiman, L. (2001). Random forests. Machine Learning, 45, 5-32. Burges, C. J. (2010). From RankNet to LambdaRank to LambdaMART: An overview. Learning, 11(23-581), 81. Chung, P. Y., & Liao, C. T. (2020). Identification of superior parental lines for biparental crossing via genomic prediction. PLoS ONE, 15(12), e0243159. Covarrubias-Pazaran, G. (2016). Genome-assisted prediction of quantitative traits using the R package sommer. PLoS ONE, 11(6), e0156744. Endelman, J. B. (2011). Ridge regression and other kernels for genomic selection with R package rrBLUP. The Plant Genome, 4(3). Heffner, E. L., Lorenz, A. J., Jannink, J. L., & Sorrells, M. E. (2010). Plant breeding with genomic selection: gain per unit time and cost. Crop Science, 50(5), 1681-1690. Heslot, N., & Feoktistov, V. (2020). Optimization of selective phenotyping and population design for genomic prediction. Journal of Agricultural, Biological and Environmental Statistics, 25(4), 579-600. Isidro, J., Jannink, J. L., Akdemir, D., Poland, J., Heslot, N., & Sorrells, M. E. (2015). Training set optimization under population structure in genomic selection. Theoretical and Applied Genetics, 128, 145-158. Järvelin, K., & Kekäläinen, J. (2017, August). IR evaluation methods for retrieving highly relevant documents. In ACM SIGIR Forum (Vol. 51, No. 2, pp. 243-250). New York, NY, USA: ACM. Kristensen, P. S., Jensen, J., Andersen, J. R., Guzmán, C., Orabi, J., & Jahoor, A. (2019). Genomic prediction and genome-wide association studies of flour yield and alveograph quality traits using advanced winter wheat breeding material. Genes, 10(9), 669. Laloë, D. (1993). Precision and information in linear models of genetic evaluation. Genetics Selection Evolution, 25(6), 557-576. Laloë, D., Phocas, F., & Menissier, F. (1996). Considerations on measures of precision and connectedness in mixed linear models of genetic evaluation. Genetics Selection Evolution, 28(4), 359-378. Li, P., Wu, Q., & Burges, C. (2007). McRank: Learning to rank using multiple classification and gradient boosting. Advances in Neural Information Processing Systems, 20. Meuwissen, T. H., Hayes, B. J., & Goddard, M. (2001). Prediction of total genetic value using genome-wide dense marker maps. Genetics, 157(4), 1819-1829. Ou, J. H., & Liao, C. T. (2019). TSDFGS: training set determination for genomic selection. R package version, 1(0). Ou, J. H., & Liao, C. T. (2019). Training set determination for genomic selection. Theoretical and Applied Genetics, 132, 2781-2792. R Core Team (2019). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. Rincent, R., Laloë, D., Nicolas, S., Altmann, T., Brunel, D., Revilla, P., ... & Moreau, L. (2012). Maximizing the reliability of genomic selection by optimizing the calibration set of reference individuals: comparison of methods in two diverse groups of maize inbreds (Zea mays L.). Genetics, 192(2), 715-728. Rincent, R., Charcosset, A., & Moreau, L. (2017). Predicting genomic selection efficiency to optimize calibration set and to assess prediction accuracy in highly structured populations. Theoretical and Applied Genetics, 130, 2231-2247. Searle, S. R., Casella, G., & McCulloch, C. E. (2009). Variance components. John Wiley & Sons. Spindel, J., Begum, H., Akdemir, D., Virk, P., Collard, B., Redoña, E., ... & McCouch, S. R. (2015). Genomic selection and association mapping in rice (Oryza sativa): effect of trait genetic architecture, training population composition, marker number and statistical model on accuracy of rice genomic selection in elite, tropical rice breeding lines. PLoS Genetics, 11(2), e1004982. Stewart-Brown, B. B., Song, Q., Vaughn, J. N., & Li, Z. (2019). Genomic selection for yield and seed composition traits within an applied soybean breeding program. G3: Genes, Genomes, Genetics, 9(7), 2253-2265. Tanaka, R., & Iwata, H. (2018). Bayesian optimization for genomic selection: a method for discovering the best genotype among a large number of candidates. Theoretical and Applied Genetics, 131, 93-105. Tsai, S. F., Shen, C. C., & Liao, C. T. (2021). Bayesian optimization approaches for identifying the best genotype from a candidate population. Journal of Agricultural, Biological and Environmental Statistics, 26, 519-537. Xu, Y., Li, P., Zou, C., Lu, Y., Xie, C., Zhang, X., ... & Olsen, M. S. (2017). Enhancing genetic gain in the era of molecular breeding. Journal of Experimental Botany, 68(11), 2641-2666. Zhao, K., Tung, C. W., Eizenga, G. C., Wright, M. H., Ali, M. L., Price, A. H., ... & McCouch, S. R. (2011). Genome-wide association mapping reveals a rich genetic architecture of complex traits in Oryza sativa. Nature Communications, 2(1), 467.	-
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/89354	-
dc.description.abstract	在植物育種中，基因體選拔 (genomic selection) 可以基於基因型資料去挑選出優良的品系且能夠省去調查外表型的繁重工作。而在基因體選拔中則需要在族群中挑選出最佳的基因型組合去建立訓練族群 (training set)，而使得挑選出來的訓練族群所建立的模型能夠有優良的預測效果。在本篇研究中，利用基因組BLUP (GBLUP) 預測模型估計基因型值，並以正規化累計折損增益(Normalized Discounted Cumulative Gain; NDCG) 作為評估的指標。由於在育種中，大多數時間在意的是優良的品種的表現，因此採用 NDCG 基於排序正確性的方法作為評估標準。我們提出利用廣義決定係數 (generalized coefficient of determination; CD)來找出建立訓練集的最佳的基因型，並以四個資料集分別為兩組水稻資料、一組小麥資料與一組大豆資料使用R語言進行分析模擬試驗。結果顯示 CD 方法在遺傳率低且訓練族群大小較小時所挑選出的訓練集表現對於優秀品系的正確排序能力能夠優於其他方法。	zh_TW
dc.description.abstract	In plant breeding, genomic selection identifies superior lines from their genotypes alone, avoiding laborious phenotyping. Its success depends on choosing the genotypes that make up the training set so that the fitted model predicts well. In this study, we predicted genotypic values with the genomic BLUP (GBLUP) model and evaluated performance with the normalized discounted cumulative gain (NDCG). Because breeding is primarily concerned with the performance of the top-performing lines, we used NDCG, a criterion based on ranking quality. We proposed using the generalized coefficient of determination (CD) to choose the genotypes for the training set, and illustrated its performance on four datasets: two rice datasets (tropical rice and 44K rice), a wheat dataset, and a soybean dataset. All simulations and analyses were implemented in R. The simulation results show that training sets selected by the CD method rank superior lines more accurately than those from the other methods when trait heritability is low or the training set is small.	en
dc.description.provenance	Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2023-09-07T16:39:32Z No. of bitstreams: 0	en
dc.description.provenance	Made available in DSpace on 2023-09-07T16:39:32Z (GMT). No. of bitstreams: 0	en
dc.description.tableofcontents	Oral Examination Committee Certification; Acknowledgements; Chinese Abstract; ABSTRACT; CONTENTS; LIST OF FIGURES; LIST OF TABLES; Chapter 1 Introduction; Chapter 2 Materials and Methods; 2.1 Genetic dataset materials; 2.1.1 Tropical rice dataset; 2.1.2 Wheat dataset; 2.1.3 44K rice dataset; 2.1.4 Soybean dataset; 2.2 Methods; 2.2.1 Criteria for Training Set Optimization; 2.2.2 Ability of a Training Set to Identify the Best Genotypes; 2.2.3 The procedure to evaluate the ability of the training sets obtained by using the CD; 2.2.4 A Comparison between optimization criteria; 2.2.5 Genotypes selected from each cluster by D-efficiency; 2.2.6 Evaluating the robustness of various methods; Chapter 3 Results; 3.1 Generalized coefficient of determination (CD); 3.2 Evaluate the efficiency of the training sets; 3.3 Comparison between optimization criteria based on WGR model and GBLUP model; Chapter 4 Discussion; 4.1 Sampling rule to determine individuals to select in each subpopulation in optimal training set; 4.2 Evaluation of the robustness of different methods; APPENDIX; REFERENCES	-
dc.language.iso	en	-
dc.subject	正規化累計折損增益	zh_TW
dc.subject	基因組選拔	zh_TW
dc.subject	廣義決定係數	zh_TW
dc.subject	generalized coefficient of determination	en
dc.subject	NDCG	en
dc.subject	CD	en
dc.subject	genomic selection	en
dc.title	利用基因體選拔確認最佳的基因型	zh_TW
dc.title	Identification of the best genotypes from a breeding population via genomic selection	en
dc.type	Thesis	-
dc.date.schoolyear	111-2	-
dc.description.degree	Master's	-
dc.contributor.coadvisor	廖振鐸	zh_TW
dc.contributor.coadvisor	Chen-Tuo Liao	en
dc.contributor.oralexamcommittee	蔡欣甫;高振宏	zh_TW
dc.contributor.oralexamcommittee	Shin-Fu Tsai;Chen-Hung Kao	en
dc.subject.keyword	基因組選拔,正規化累計折損增益,廣義決定係數	zh_TW
dc.subject.keyword	genomic selection,NDCG,generalized coefficient of determination,CD	en
dc.relation.page	47	-
dc.identifier.doi	10.6342/NTU202302577	-
dc.rights.note	Authorization granted (access restricted to campus)	-
dc.date.accepted	2023-08-07	-
dc.contributor.author-college	College of Bioresources and Agriculture	-
dc.contributor.author-dept	Department of Agronomy	-
dc.date.embargo-lift	2024-08-01	-
Appears in Collections:	Department of Agronomy
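
The abstract evaluates training sets by how well the fitted model ranks the top lines, using NDCG. As a rough illustration of that idea only, here is a minimal Python sketch (the thesis itself reports using R, and its exact NDCG variant is not given in this record). The linear gain, the 1/log2(position + 1) discount, the function names `dcg_at_k` and `ndcg_at_k`, and all numbers are assumptions made for this example; it also assumes non-negative true values.

```python
import math

def dcg_at_k(true_values, predicted_values, k):
    """DCG of the top-k lines when ranked by predicted value (linear gain)."""
    # Rank candidate lines from highest to lowest predicted value.
    order = sorted(range(len(predicted_values)),
                   key=lambda i: predicted_values[i], reverse=True)
    # The line at 1-based position p contributes its true value / log2(p + 1);
    # pos is 0-based here, hence log2(pos + 2).
    return sum(true_values[i] / math.log2(pos + 2)
               for pos, i in enumerate(order[:k]))

def ndcg_at_k(true_values, predicted_values, k):
    """NDCG@k: DCG of the predicted ranking divided by the ideal DCG."""
    ideal = dcg_at_k(true_values, true_values, k)
    return dcg_at_k(true_values, predicted_values, k) / ideal

# Hypothetical true genotypic values and two sets of predictions for five lines.
true_gv = [3.0, 1.0, 2.0, 5.0, 4.0]
good_pred = [2.9, 0.8, 2.1, 4.7, 4.2]   # same ordering as the true values
poor_pred = [0.5, 4.0, 3.0, 1.0, 2.0]   # best lines ranked near the bottom

print(ndcg_at_k(true_gv, good_pred, k=3))  # perfect ranking of the top 3
print(ndcg_at_k(true_gv, poor_pred, k=3))  # lower score: top lines are missed
```

Because only the top k positions contribute and earlier positions are weighted more heavily, a model can have modest overall correlation with the true values and still score well, provided it places the best lines first. That matches the abstract's stated focus on the performance of superior lines.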

Files in This Item:

File	Size	Format
ntu-111-2.pdf (access restricted to NTU campus IP addresses; off-campus users should connect through the VPN service)	2.12 MB	Adobe PDF

Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise indicated.