Detection of novel SNPs in Taiwan Country chicken EST libraries

Yi-Hui Wang; 王怡惠

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/40668

Title:	Detection of novel SNPs in Taiwan Country chicken EST libraries (藉由臺灣土雞表現序列標籤序列庫偵測新穎之單一核苷酸多型性)
Author:	Yi-Hui Wang (王怡惠)
Advisor:	En-Chung Lin (林恩仲)
Keywords:	Genotyping, Single nucleotide polymorphism, Bioinformatic tools, Expressed sequence tag, Taiwan Country chicken
Publication year:	2008
Degree:	Master's
Abstract:	A single nucleotide polymorphism (SNP) is a common form of DNA variation in the genome and is widely used as a genetic marker. A SNP located in the coding region of a gene is called a coding SNP (cSNP). A cSNP that changes the corresponding amino acid is a non-synonymous SNP (nsSNP); one that leaves the amino acid sequence unchanged is a synonymous SNP (sSNP). Most chicken SNPs have been discovered from genome sequences, which are considered more accurate than expressed sequence tags (ESTs). 
However, using large-scale EST data from multiple sources and tissues may increase the chance of detecting cSNPs. The objective of this study was to predict and validate novel SNPs in Taiwan Country chicken EST libraries. The study used EST libraries constructed from seven tissues (pituitary gland, ovary, liver, adipose tissue, oviduct, shell gland, and muscle) of two Taiwan Country chicken lines descended from a common ancestor (line B, high meat production, and line L2, high egg production). In total, 42,404 EST clones were sequenced. Quality control removed sequences that were of low quality, contained introns, carried multiple-gene inserts, were short, or had low complexity, leaving 36,463 high-quality EST sequences. Cluster analysis assembled these into 3,455 contigs and 11,649 singlets (15,104 unique sequences), which were used for sequence identification, functional annotation, and SNP prediction. For sequence identification, the 15,104 unique sequences were searched with blastn against the NT and RefSeq databases of the National Center for Biotechnology Information (NCBI). Of these, 10,490 unique sequences had a chicken identity, 8,768 of them from the RefSeq database. Functional annotation with FatiGOplus assigned Gene Ontology (GO) annotation to 3,457 unique sequences. The major functions common to all tissues were primary metabolic process and protein binding, and the gene products were mainly located in the cell part category; each tissue also had its own specific major functions. SNP prediction was performed with the Phred/Phrap/Polyphred software suite, and a candidate SNP was retained only if its site was covered by at least 5 overlapping EST sequences and its alternative allele frequency was at least 0.15. Sequences carrying candidate SNPs were then searched against the NCBI Genome and dbSNP databases to confirm their locations. 
Candidate nsSNPs were subsequently validated and genotyped on a SNP genotyping platform. In total, 1,107 SNPs were predicted in 401 contigs, a density of one predicted SNP per 340 base pairs. The numbers of SNPs located in the 5' UTR, exons, introns, 3' UTR, pseudogenes, and unknown regions were 60, 704, 10, 121, 5, and 207, respectively. The 704 cSNPs comprised 545 nsSNPs and 159 sSNPs. In addition, blast analysis found only 290 of the 1,107 predicted SNPs in the dbSNP database. SNP validation and genotyping showed that the prediction accuracy was 91% and that the genotype frequencies of some candidate SNPs differed significantly between the two lines (P < 0.05). These preliminary results indicate that the analysis pipeline and strategy used in this study are effective for SNP prediction in Taiwan Country chickens and help select candidate SNPs for further study. Predicting SNPs with bioinformatic tools and public databases benefits functional studies of the genes in which the SNPs are located. However, the predicted SNPs still need to be validated by molecular biology methods to improve the accuracy and practical value of the predictions.
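The filtering rule and the sSNP/nsSNP distinction described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis pipeline (which relied on Phred/Phrap/Polyphred); the function names `passes_filter` and `classify_csnp`, the per-site allele-count dictionary, and the reading of "alternative allele frequency" as the frequency of the second most common base at a site are all assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def passes_filter(allele_counts, min_depth=5, min_alt_freq=0.15):
    """Apply the depth and allele-frequency thresholds to one candidate site.

    allele_counts maps each observed base to the number of overlapping EST
    reads carrying it (hypothetical input format, assumed for this sketch).
    """
    depth = sum(allele_counts.values())
    if depth < min_depth or len(allele_counts) < 2:
        return False
    # Assumption: the alternative allele is the second most common base.
    alt_count = sorted(allele_counts.values(), reverse=True)[1]
    return alt_count / depth >= min_alt_freq


def classify_csnp(codon, pos, alt_base):
    """Label a coding SNP as 'sSNP' or 'nsSNP'.

    codon is the reference codon, pos the 0-based position of the SNP within
    it, and alt_base the alternative base at that position.
    """
    mutated = codon[:pos] + alt_base + codon[pos + 1:]
    same_residue = CODON_TABLE[codon] == CODON_TABLE[mutated]
    return "sSNP" if same_residue else "nsSNP"


if __name__ == "__main__":
    print(passes_filter({"A": 4, "G": 1}))  # depth 5, alt frequency 0.2 -> True
    print(classify_csnp("GAA", 2, "G"))     # GAA and GAG both encode Glu -> sSNP
```

In practice the reading frame and strand of each contig would have to be established (for example from the RefSeq alignment) before a codon can be extracted, which this sketch leaves out.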
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/40668
Full-text authorization:	Paid authorization
Appears in collections:	Department of Animal Science and Technology

Files in this item:

File	Size	Format
ntu-97-1.pdf (not authorized for public access)	2.4 MB	Adobe PDF


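Unless their copyright terms are specifically stated otherwise, all items in the system are protected by copyright, with all rights reserved.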
