Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/56339
Title: | 利用網路分析作用於低溫電子顯微鏡投影照片分類三維構型 Grouping 3D Structure Conformations using Network Analysis on 2D CryoEM Projection Images |
Authors: | Hung-Yi Wu 吳泓毅 |
Advisor: | 王奕翔(I-Hsiang Wang) |
Co-Advisor: | 杜憶萍(I-Ping Tu) |
Keyword: | Cryo-EM, Heterogeneity problem, Fourier-slice theorem, Common line distance, Community detection |
Publication Year: | 2020 |
Degree: | Master's |
Abstract: | Cryogenic electron microscopy (cryo-EM) is one of the most promising techniques for determining the structures of macromolecular protein complexes at near-atomic resolution. The 2017 Nobel Prize in Chemistry was awarded to Jacques Dubochet, Joachim Frank, and Richard Henderson for their contributions to developing the technology. Nevertheless, open challenges remain, and this work addresses the heterogeneity problem inherent in cryo-EM data sets. Single-particle analysis originally assumes that all projection images come from the same molecule in the same conformation, but some molecules adopt several conformational states in solution even after purification, which violates that assumption. Traditional approaches address this problem at the 3D level: they require the 3D orientations of the projection images, or a consensus 3D structure, before the 3D variability can be analyzed. These approaches therefore degrade when large conformational differences make the 3D information hard to estimate, and they are exposed to 3D alignment errors that may affect the accuracy of the analysis. We propose two methods that use only 2D information to group 3D conformations. The first method, intended for data with large conformational differences, applies two-stage dimension reduction to denoise the images and then constructs a graph from the pairwise correlations of the denoised images. Community detection algorithms are applied to this graph to group similar images, and averaging each group yields averaged images with a higher signal-to-noise ratio (SNR). A second graph is then constructed from the common line distances among the averaged images, community detection is run on it, and each detected community is taken as a conformation. The second method omits the averaging step and works directly with pairwise common line distances among projection images. First, we construct a graph from the pairwise common line distances among a small fraction of the images and run community detection algorithms to partition them into communities. Second, we assign the remaining images to their nearest community based on common line distances, which also reduces the number of common line distances that must be computed. We test both methods on two synthetic heterogeneous data sets, one of which is a benchmark data set for the heterogeneity problem and the other the first data set containing multiple conformational states, and we discuss the results and the influence of each parameter. |
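The two steps of the second method in the abstract (graph and communities on a small fraction of the images, then assignment of the remaining images) can be illustrated with a toy sketch. This is not the thesis implementation, and everything in it is an assumption made for illustration: scalar values stand in for projection images, absolute difference stands in for the common line distance, connected components of a thresholded graph stand in for a real community detection algorithm, and "nearest community" is taken to mean smallest mean distance to the community's members. The names `threshold_graph`, `connected_components`, `assign_rest`, and the threshold `tau` are hypothetical.

```python
from collections import defaultdict


def threshold_graph(ids, dist, tau):
    """Build adjacency sets linking every pair whose distance is below tau."""
    adj = {i: set() for i in ids}
    for pos, a in enumerate(ids):
        for b in ids[pos + 1:]:
            if dist(a, b) < tau:
                adj[a].add(b)
                adj[b].add(a)
    return adj


def connected_components(adj):
    """Stand-in for community detection: one label per connected component."""
    label = {}
    next_label = 0
    for start in adj:
        if start in label:
            continue
        label[start] = next_label
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbor in adj[node]:
                if neighbor not in label:
                    label[neighbor] = next_label
                    stack.append(neighbor)
        next_label += 1
    return label


def assign_rest(rest, label, dist):
    """Assign each remaining item to the community with the smallest mean distance."""
    members = defaultdict(list)
    for item, lab in label.items():
        members[lab].append(item)

    def mean_distance(item, lab):
        return sum(dist(item, m) for m in members[lab]) / len(members[lab])

    return {r: min(members, key=lambda lab: mean_distance(r, lab)) for r in rest}


# Toy data: scalar "images" (assumption); two well-separated groups.
values = {0: 0.0, 1: 0.2, 2: 0.1, 3: 5.0, 4: 5.3, 5: 0.15, 6: 4.9}


def dist(a, b):
    """Absolute difference, standing in for the common line distance."""
    return abs(values[a] - values[b])


seed_ids = [0, 1, 2, 3, 4]  # the "small fraction" of images
rest_ids = [5, 6]           # images assigned afterwards

seed_labels = connected_components(threshold_graph(seed_ids, dist, tau=1.0))
rest_labels = assign_rest(rest_ids, seed_labels, dist)
print(seed_labels, rest_labels)
```

In this sketch, pairwise distances are computed for all pairs only within `seed_ids`; each remaining item is compared against the seed items alone, which mirrors the reduction in distance computations that the abstract mentions.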
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/56339 |
DOI: | 10.6342/NTU202001905 |
Fulltext Rights: | Paid license |
Appears in Collections: | Data Science Degree Program |
Files in This Item:
File | Access | Size | Format |
---|---|---|---|
U0001-2707202014232800.pdf | Restricted Access | 1.98 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.