Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/42538
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 陳銘憲
dc.contributor.author | Yi-Hong Chu | en
dc.contributor.author | 朱怡虹 | zh_TW
dc.date.accessioned | 2021-06-15T01:15:45Z | -
dc.date.available | 2009-08-18
dc.date.copyright | 2009-08-18
dc.date.issued | 2009
dc.date.submitted | 2009-07-28
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/42538 | -
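dc.description.abstract | Data clustering is a very important technique in the field of data mining. Many
clustering algorithms discover clusters with a density-based approach: they regard clusters as
high-density regions in the space. However, we have identified two important problems that have not yet
been adequately solved: how to discover clusters hidden in time, and how to discover clusters hidden in
space. We examine these two problems in turn and design corresponding clustering techniques.
First, we study time-aware clustering techniques. We find that most previous clustering algorithms look
for clusters over all of the data. However, the characteristics of data change over time, and some
clusters exist only within certain time intervals; when clusters are mined over the entire data set,
these clusters hidden in time intervals are not discovered. It is therefore very important to let users
specify different time intervals in which to mine clusters. In view of this, we formulate the temporal
cluster query problem, which lets users look for clusters in different time intervals. However, because
the intervals to be mined cannot be known in advance, previous algorithms can only be run after the user
issues a query, which is very inefficient and unsuitable for a real-time interactive query environment.
We therefore propose a novel framework, QEC, to execute temporal cluster queries efficiently.
Next, we study space-aware clustering techniques. For high-dimensional data, research in the literature
has turned to finding clusters hidden in subspaces. However, we find that previous subspace clustering
algorithms produce a large number of subspace clusters that suffer from severe information redundancy.
Moreover, because some data in a lower-dimensional cluster are not contained in any higher-dimensional
cluster, directly deleting lower-dimensional clusters that largely overlap with higher-dimensional
clusters may lose the information about data that exist in the deleted clusters but are not contained in
any higher-dimensional cluster. To solve this problem, we propose the NORSC algorithm, which
automatically finds a concise set of subspace clusters while maintaining a given degree of data coverage.
Finally, we further propose a new subspace clustering model. We find that previous subspace clustering
algorithms cannot achieve good quality simultaneously for clusters in subspaces of different
dimensionality, mainly because they overlook an important phenomenon: clusters in subspaces of different
dimensionality have different densities. We therefore design a new subspace clustering model that uses
different density parameters to find clusters in subspaces of different dimensionality. However,
clusters defined under this new model no longer satisfy the monotonicity property, so the Apriori-like
techniques that previous algorithms built on that property cannot be applied directly. In view of this,
we further design the DENCOS algorithm to efficiently mine the clusters defined under the new model.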
zh_TW
dc.description.abstract | Data clustering is recognized as an important and valuable technique in the data
mining field. Most clustering work adopts a density-based approach, in which clusters are regarded as
high-density regions of the space. In this dissertation, we explore two important but not yet well-solved
topics in data clustering, namely discovering time-aware clusters and discovering space-aware clusters,
and devise innovative clustering techniques to address them.
We first devise time-aware clustering techniques. Most previous clustering work treats all data as one
large segment and executes the clustering task over the entire database. However, the characteristics of
the data may change over time. Some dense regions (clusters) may exist only in certain time intervals and
will not be discovered when all data records are taken into account. Discovering clusters over different
time intervals is therefore very important for users seeking the interesting patterns hidden in the data.
In view of this, we explore a novel problem, called the temporal cluster query, which addresses cluster
discovery in constrained time intervals: users can specify varying time intervals in which to discover
clusters. Since the queried time intervals are unknown in advance, the direct extension of previous
clustering work would be to delay cluster discovery until the user queries the data set, which is
inefficient for an interactive query environment. We therefore also devise an innovative framework,
called QEC, to execute temporal cluster queries efficiently.
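The effect described in this paragraph can be illustrated with a minimal sketch. It is not the QEC framework; it is the naive query-time baseline that the paragraph calls inefficient, applied to hypothetical data. The function name `dense_cells`, the grid representation, and the relative density threshold `min_fraction` are all assumptions made for illustration.

```python
from collections import Counter


def dense_cells(records, t_start, t_end, cell_width, min_fraction):
    """Grid cells holding at least min_fraction of the records in [t_start, t_end]."""
    # Keep only the records whose timestamp falls inside the queried interval.
    selected = [(x, y) for (t, x, y) in records if t_start <= t <= t_end]
    if not selected:
        return set()
    # Map each point to its grid cell and count points per cell.
    counts = Counter(
        (int(x // cell_width), int(y // cell_width)) for x, y in selected
    )
    return {cell for cell, n in counts.items() if n / len(selected) >= min_fraction}


# Hypothetical records (timestamp, x, y): a burst in one cell during t = 0..9,
# then points spread over ten other cells during t = 10..99.
records = [(t, 0.5, 0.5) for t in range(10)]
records += [(t, (t % 10) + 0.5, 5.5) for t in range(10, 100)]

whole = dense_cells(records, 0, 99, cell_width=1.0, min_fraction=0.5)
early = dense_cells(records, 0, 9, cell_width=1.0, min_fraction=0.5)
print(whole)  # no cell reaches half of all 100 records
print(early)  # the burst cell holds every record in the interval
```

Because the whole data set must be rescanned for every queried interval, this baseline shows why precomputed summaries are attractive for interactive use.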
Next, we devise space-aware clustering techniques. For high-dimensional data, research in the literature
has turned to discovering clusters hidden in subspaces. However, we find that previous subspace
clustering algorithms discover a large number of subspace clusters, and the clustering result contains
substantial information redundancy. In addition, since some data points in a lower-dimensional cluster
may not be members of any higher-dimensional cluster, directly removing a cluster that overlaps heavily
with higher-dimensional clusters may lose the information about those data points that are contained in
this cluster but cannot be found in any higher-dimensional cluster. To remedy this, we devise the NORSC
algorithm to automatically discover a succinct collection of subspace clusters while maintaining the
required degree of data coverage.
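The coverage problem in this paragraph can be made concrete with a small sketch. It does not reproduce NORSC; it only shows, on hypothetical clusters, how a lower-dimensional cluster can overlap heavily with a higher-dimensional one and still hold points that would be lost if it were deleted. Representing a cluster as a `(dimensions, point ids)` pair and comparing clusters by dimension count are assumptions made for illustration.

```python
def covered_by_higher(cluster, clusters):
    """Points of `cluster` that also appear in some cluster with more dimensions."""
    dims, points = cluster
    covered = set()
    for other_dims, other_points in clusters:
        if len(other_dims) > len(dims):
            covered |= points & other_points
    return covered


# Hypothetical clusters: (subspace dimensions, ids of member points).
low = (frozenset({"a"}), {1, 2, 3, 4, 5})
high = (frozenset({"a", "b"}), {1, 2, 3, 4})

covered = covered_by_higher(low, [low, high])
overlap_ratio = len(covered) / len(low[1])  # share of `low` repeated in `high`
lost_if_removed = low[1] - covered          # points no higher cluster contains

print(overlap_ratio)
print(lost_if_removed)
```

Here 80% of the one-dimensional cluster is repeated in the two-dimensional cluster, yet deleting it would drop point 5 from the result entirely, which is the loss of coverage the paragraph warns about.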
Furthermore, we devise a novel subspace clustering model. We find that previous algorithms have
difficulty discovering high-quality clusters in all subspaces because they do not consider a critical
problem: densities vary across subspace cardinalities. We therefore devise a new subspace clustering
model in which different density thresholds are used to discover the dense regions (clusters) in
different subspace cardinalities. However, since the monotonicity property of dense regions no longer
holds in this model, the Apriori-like generate-and-test scheme adopted by most previous work to constrain
the search for dense regions is infeasible. To meet this challenge, we also devise the DENCOS algorithm
to efficiently discover the clusters defined by this novel clustering model.
en
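The loss of monotonicity mentioned in the last paragraph of the abstract can be seen in a toy example. The sketch below is not DENCOS; it only shows, under assumed grid units and hypothetical thresholds, that an Apriori-style filter built from dense 1-D units can miss a 2-D unit once each subspace cardinality has its own density threshold. The names `dense_units` and `apriori_candidates_2d` are illustrative.

```python
from collections import Counter


def dense_units(points, dims, threshold, width=1.0):
    """Grid units in the subspace `dims` that contain at least `threshold` points."""
    counts = Counter(tuple(int(p[d] // width) for d in dims) for p in points)
    return {unit for unit, n in counts.items() if n >= threshold}


def apriori_candidates_2d(points, threshold_1d, width=1.0):
    """2-D units whose two 1-D projections are both dense (Apriori-style filter)."""
    dense_x = dense_units(points, (0,), threshold_1d, width)
    dense_y = dense_units(points, (1,), threshold_1d, width)
    return {(x[0], y[0]) for x in dense_x for y in dense_y}


# Hypothetical 2-D points: three fall in unit (0, 0), the rest are scattered.
points = [(0.2, 0.3), (0.4, 0.1), (0.6, 0.7), (1.5, 2.5), (2.5, 1.5), (3.5, 3.5)]

# One threshold for every cardinality: the dense 2-D unit survives the filter.
same_threshold = dense_units(points, (0, 1), 3) <= apriori_candidates_2d(points, 3)

# Assumed per-cardinality thresholds: 4 points for 1-D units, 3 for 2-D units.
dense_2d = dense_units(points, (0, 1), 3)
missed = dense_2d - apriori_candidates_2d(points, 4)

print(same_threshold)
print(missed)
```

With a single threshold, every dense 2-D unit projects onto dense 1-D units, so pruning is safe. With the stricter 1-D threshold, unit (0, 0) is dense in two dimensions but neither of its projections is, so a generate-and-test search seeded from 1-D units never considers it.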
dc.description.provenance | Made available in DSpace on 2021-06-15T01:15:45Z (GMT). No. of bitstreams: 1
ntu-98-F91921022-1.pdf: 1001864 bytes, checksum: 14c2b43499ca2964e4117b58a2842fb6 (MD5)
Previous issue date: 2009
en
dc.description.tableofcontents |
1 Introduction
  1.1 Motivation and Overview of the Dissertation
    1.1.1 Discovering Clusters in Constrained Time Intervals
    1.1.2 Reducing Redundancy in Subspace Clustering
    1.1.3 Discovering Density Conscious Subspace Clusters
  1.2 Organization of the Dissertation
2 Discovering Clusters in Constrained Time Intervals
  2.1 Introduction
  2.2 Related Work
  2.3 Temporal Cluster Query
    2.3.1 Problem Description
    2.3.2 Overview of the QEC Framework
  2.4 Offline Maintaining Phase
    2.4.1 Uniform Region
    2.4.2 Algorithm for Constructing the UR-tree
  2.5 Online Query Processing Phase
    2.5.1 Algorithm for Combining the UR-trees
    2.5.2 Execute the Temporal Cluster Query
    2.5.3 Analysis of the Time Complexity
  2.6 Experiments
    2.6.1 Quality of UR-tree
    2.6.2 Accuracy of QEC
    2.6.3 Time Efficiency of QEC
  2.7 Summary
3 Reducing Redundancy in Subspace Clustering
  3.1 Introduction
    3.1.1 Subspace Clustering in High-Dimensional Data
    3.1.2 Motivation of this Chapter
  3.2 Related work
  3.3 Preliminary
    3.3.1 Redundant Cluster
    3.3.2 Identification of Redundant Cluster
  3.4 The NORSC Algorithm
    3.4.1 Mining Maximal Dense Unit
    3.4.2 Mining Non-Redundant Cluster
    3.4.3 Discussion on NORSC
  3.5 Experiments
    3.5.1 Study on Data Coverage Problem
    3.5.2 Accuracy of NORSC
    3.5.3 Time Efficiency
    3.5.4 Space Efficiency
    3.5.5 Discuss on Redundancy Threshold
  3.6 Summary
4 Discovering Density Conscious Subspace Clusters
  4.1 Introduction
  4.2 Related Work
  4.3 Density Conscious Subspace Clustering
    4.3.1 Preliminary
  4.4 The DENCOS Algorithm
    4.4.1 Preprocessing Phase
    4.4.2 Discovering Phase
  4.5 Experimental Evaluation
    4.5.1 Dataset
    4.5.2 Algorithm Accuracy
    4.5.3 Algorithm Efficiency
  4.6 Summary
5 Conclusions
dc.language.iso | zh-TW
dc.subject | 時間 | zh_TW
dc.subject | 資料叢集 | zh_TW
dc.subject | 資料探勘 | zh_TW
dc.subject | 子空間資料叢集 | zh_TW
dc.subject | 空間 | zh_TW
dc.subject | Subspace Clustering | en
dc.subject | Space | en
dc.subject | Time | en
dc.subject | Data Clustering | en
dc.subject | Data Mining | en
dc.title | 具時間和空間感知之資料叢集技術 | zh_TW
dc.title | Time-Aware and Space-Aware Data Clustering Techniques | en
dc.type | Thesis
dc.date.schoolyear | 97-2
dc.description.degree | Doctoral
dc.contributor.oralexamcommittee | 陳孟彰, 陳伶志, 陳信希, 楊東麟, 李強, 呂永和
dc.subject.keyword | 資料探勘, 資料叢集, 時間, 空間, 子空間資料叢集 | zh_TW
dc.subject.keyword | Data Mining, Data Clustering, Time, Space, Subspace Clustering | en
dc.relation.page | 113
dc.rights.note | Paid license
dc.date.accepted | 2009-07-28
dc.contributor.author-college | 電機資訊學院 | zh_TW
dc.contributor.author-dept | 電機工程學研究所 | zh_TW
Appears in collections: Department of Electrical Engineering (電機工程學系)

Files in this item:
File | Size | Format
ntu-98-1.pdf (not authorized for public access) | 978.38 kB | Adobe PDF
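Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.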
