Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/24253
Full metadata record
DC Field: Value (Language)
dc.contributor.advisor: Hung Chen (陳宏)
dc.contributor.author: Chia-Tung Chiang (en)
dc.contributor.author: 江家彤 (zh_TW)
dc.date.accessioned: 2021-06-08T05:19:48Z
dc.date.copyright: 2011-08-03
dc.date.issued: 2011
dc.date.submitted: 2011-07-29
dc.identifier.citation:
[1] Abraham, C., Cornillon, P. A., Matzner-Løber, E., and Molinari, N. (2003). Unsupervised curve clustering using B-splines. Scandinavian Journal of Statistics, 30, 581-595.
[2] Ball, G. H. and Hall, D. J. (1967). A clustering technique for summarizing multivariate data. Behavioral Science, 12, 153-155.
[3] Bunea, F., Ivanescu, A. E., and Wegkamp, M. (2011). Adaptive inference for the mean of a Gaussian process in functional data. Journal of the Royal Statistical Society, Ser. B, 73, Part 3.
[4] Chiou, J.-M. and Li, P.-L. (2007). Functional clustering and identifying substructures of longitudinal data. Journal of the Royal Statistical Society, Ser. B, 69, 679-699.
[5] Chiou, J.-M. and Li, P.-L. (2008). Correlation-based functional clustering via subspace projection. Journal of the American Statistical Association, 103, 1684-1692.
[6] Costanzo, D., Preda, C., and Saporta, G. (2006). Anticipated prediction in discriminant analysis on functional data for binary response. COMPSTAT'06, 17th Symposium on Computational Statistics, Rome, 2006, 223-235.
[7] Elad, M. (2010). Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing. Springer.
[8] Flury, B. (1993). Estimation of principal points. Applied Statistics, 42, 139-151.
[9] Flury, B. (1990). Principal points. Biometrika, 77, 33-41.
[10] Golub, G. H. and Van Loan, C. F. (1983). Matrix Computations. Johns Hopkins University Press, Baltimore, MD.
[11] Jain, A. K. (2010). Data clustering: 50 years beyond k-means. Pattern Recognition Letters, 31, 651-656.
[12] James, G. M. and Sugar, C. A. (2003). Clustering for sparsely sampled functional data. Journal of the American Statistical Association, 98, 397.
[13] Lloyd, S. P. (1982). Least squares quantization in PCM (unpublished Bell Laboratories technical note, 1957). IEEE Transactions on Information Theory, 28, 129-137.
[14] MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations. In Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, 281-297.
[15] Peng, J. and Müller, H.-G. (2008). Distance-based clustering of sparsely observed stochastic processes with applications to online auctions. Annals of Applied Statistics, 2, 1056-1077.
[16] R Development Core Team (2009). R: A Language and Environment for Statistical Computing. Vienna, Austria.
[17] Ramsay, J. O. and Li, X. (1998). Curve registration. Journal of the Royal Statistical Society, Ser. B, 60, 351-363.
[18] Ramsay, J. O. and Ramsey, J. B. (2002). Functional data analysis of the dynamics of the monthly index of non-durable goods production. Journal of Econometrics, 107, 327-344.
[19] Ramsay, J. O. and Silverman, B. W. (1997). Functional Data Analysis. Springer-Verlag, New York.
[20] Ramsay, J. O. and Silverman, B. W. (2002). Applied Functional Data Analysis: Methods and Case Studies. Springer-Verlag, New York.
[21] Stein, C. M. (1981). Estimation of the mean of a multivariate normal distribution. Annals of Statistics, 9, 1135-1151.
[22] Tarpey, T. and Kinateder, K. K. J. (2003). Clustering functional data. Journal of Classification, 20, 93-114.
[23] Ward, J. H. (1963). Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, 58, 236-244.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/24253
dc.description.abstract: Cluster analysis aims to partition data into groups that differ markedly from one another while the members within each group are highly similar; it is one of the key data-mining tools for analyzing high-dimensional data and large databases. Once the data have been clustered, the relationship between the members of each group and the variables of interest can be explored more easily. Before cluster analysis is applied to high-dimensional data, the dimension of the data is usually reduced first, and carrying out this dimension reduction from different viewpoints may lead to different conclusions.
This thesis studies the problem of clustering the subjects of functional data. In the literature (Abraham et al., 2003), the dimension of the data is reduced from the viewpoint of the mean function, and the reduced data are then clustered with the k-means method.
Peng and Müller (2008), under the assumption that all curves share the same mean function, use the distribution of finite-dimensional functional principal component scores to explore the clustering of the data. However, whether the dimension reduction starts from the mean function or from the covariance function, the resulting clusters reflect the characteristics of the mean functions.
This phenomenon motivates us to develop a theoretical analysis of the effectiveness of these two approaches. In this thesis we show that, under certain conditions, starting from the covariance function reduces the effectiveness of the clustering. Chiou and Li (2007) proposed a clustering algorithm based on iterative reclassification, whose initial clustering stage uses the distribution of finite-dimensional functional principal component scores to explore an initial clustering of the mean structure. Based on our analysis, we recommend that the initial clustering be explored from the viewpoint of the mean function. (zh_TW)
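The functional principal component scores referred to in both abstracts have the standard form used in the functional data analysis literature; as a reminder of the notation (not quoted from the thesis), each curve admits the Karhunen-Loève expansion

$$X_i(t) = \mu(t) + \sum_{k \ge 1} \xi_{ik}\,\phi_k(t), \qquad \xi_{ik} = \int \bigl(X_i(t) - \mu(t)\bigr)\,\phi_k(t)\,dt,$$

where $\mu$ is the common mean function, the $\phi_k$ are the eigenfunctions of the covariance function $G(s,t) = \mathrm{Cov}\bigl(X(s), X(t)\bigr)$, and the scores $\xi_{ik}$ are uncorrelated with variances equal to the eigenvalues of $G$. Feature extraction "on the mean function" clusters summaries of each observed curve itself (for example its B-spline coefficients), whereas feature extraction "on the covariance function" clusters the leading score vectors $(\xi_{i1}, \dots, \xi_{iK})$.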
dc.description.abstract: Organizing functional data into sensible groupings is one of the most fundamental modes of understanding and learning the underlying mechanism generating functional data. Cluster analysis is often employed to search for homogeneous subgroups of individuals in a data set.
Abraham et al. (2003, Scandinavian Journal of Statistics) start with feature extraction on the mean function and use the k-means clustering procedure to determine the clusters. Peng and Müller (2008, Annals of Applied Statistics) assume a common mean function for all units and start with feature extraction on the covariance function.
However, the clusters found by the k-means clustering procedure can be explained through the characteristics of the mean function of each unit. This motivates a theoretical study comparing the utilities of these two approaches in the setting of densely observed functional data.
We present only the case in which the number of clusters is two, and we analyze the loss of efficiency incurred by feature extraction on the covariance function. Chiou and Li (2007, Journal of the Royal Statistical Society, Series B) proposed an iterative functional clustering algorithm that applies the method of Peng and Müller in the initial clustering stage.
We advocate using the mean function in the initial stage, and an analysis is provided to support this recommendation. (en)
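As a concrete illustration of the comparison made in this abstract, the following Python sketch clusters simulated curves twice with k-means: once on B-spline coefficients fitted to each curve (mean-function features, as in Abraham et al., 2003) and once on leading functional principal component scores computed from the sample covariance (in the spirit of Peng and Müller, 2008). The simulation set-up, basis choice, number of retained components, and library calls (numpy, scipy, scikit-learn) are illustrative assumptions, not the thesis code.

# Minimal sketch contrasting the two feature-extraction routes on simulated
# curves observed on a dense common grid; all names and settings are illustrative.
import numpy as np
from scipy.interpolate import BSpline
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 101)          # dense common observation grid
n_per_group = 50

# Two groups whose mean functions differ
mean1 = np.sin(2 * np.pi * t)
mean2 = np.sin(2 * np.pi * t) + 0.8 * t
curves = np.vstack([
    mean1 + 0.3 * rng.standard_normal((n_per_group, t.size)),
    mean2 + 0.3 * rng.standard_normal((n_per_group, t.size)),
])
true_labels = np.repeat([0, 1], n_per_group)

# Route 1 (mean-function features): fit a cubic B-spline to each curve and
# cluster the coefficient vectors with k-means.
knots = np.concatenate(([0.0] * 4, np.linspace(0, 1, 8)[1:-1], [1.0] * 4))
n_basis = len(knots) - 4
design = BSpline(knots, np.eye(n_basis), 3)(t)      # (len(t), n_basis) basis matrix
coefs, *_ = np.linalg.lstsq(design, curves.T, rcond=None)
km_mean = KMeans(n_clusters=2, n_init=10, random_state=0).fit(coefs.T)

# Route 2 (covariance-function features): cluster the leading functional
# principal component scores, i.e. projections of the centered curves on
# eigenvectors of the sample covariance matrix.
centered = curves - curves.mean(axis=0)
cov = centered.T @ centered / curves.shape[0]
_, eigvecs = np.linalg.eigh(cov)                     # ascending eigenvalue order
scores = centered @ eigvecs[:, ::-1][:, :3]          # leading three FPC scores
km_fpca = KMeans(n_clusters=2, n_init=10, random_state=0).fit(scores)

# k-means labels are arbitrary, so report agreement up to label switching.
for name, km in [("mean-function features", km_mean), ("FPCA scores", km_fpca)]:
    agree = max(np.mean(km.labels_ == true_labels), np.mean(km.labels_ != true_labels))
    print(f"{name}: agreement with the true grouping = {agree:.2f}")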
dc.description.provenance: Made available in DSpace on 2021-06-08T05:19:48Z (GMT). No. of bitstreams: 1; ntu-100-R98221035-1.pdf: 1610698 bytes, checksum: 1aed472028c4176c68043573833edb18 (MD5). Previous issue date: 2011 (en)
dc.description.tableofcontents:
Acknowledgements (誌謝) i
Abstract (Chinese) ii
Abstract iii
Contents iv
List of Figures v
List of Tables vi
1 Introduction 1
2 Review on Functional Clustering Method Coupled with the k-means Algorithm 2
3 Inference of FPCA Approach and Main Results 5
  3.1 Inference of FPCA Approach 5
  3.2 Main Results 6
4 Numerical Results 20
  4.1 Simulation Set-up 21
  4.2 Limitation on FPCA Approach 23
  4.3 FPCA Approach Can Be a Useful Alternative 33
  4.4 Global versus Local Basis 36
  4.5 Limitation on FPCA Approach when G1(s,t) ≠ G2(s,t) 39
  4.6 Data Application: Kneading Data 42
5 Discussion and Concluding Remarks 43
dc.language.iso: en
dc.subject: functional principal component analysis (zh_TW)
dc.subject: k-means clustering (zh_TW)
dc.subject: k-means (en)
dc.subject: functional principal component (en)
dc.title: A study of k-means clustering of functional data using functional principal component scores and mean curves (zh_TW)
dc.title: Study effectiveness of k-means clustering of functional data: functional principal component scores feature and mean curve feature (en)
dc.type: Thesis
dc.date.schoolyear: 99-2
dc.description.degree: Master's (碩士)
dc.contributor.oralexamcommittee: Chin-Tsang Chiang (江金倉), Su-Yun Huang (陳素雲), Jeng-Min Chiou (丘政民)
dc.subject.keyword: functional principal component analysis, k-means clustering (zh_TW)
dc.subject.keyword: functional principal component, k-means (en)
dc.relation.page: 46
dc.rights.note: Not authorized for public access (未授權)
dc.date.accepted: 2011-07-29
dc.contributor.author-college: College of Science (理學院)
dc.contributor.author-dept: Graduate Institute of Mathematics (數學研究所)
Appears in collections: Department of Mathematics

Files in this item:
File: ntu-100-1.pdf | Size: 1.57 MB | Format: Adobe PDF | Access: Restricted (not authorized for public access)