NTU Theses and Dissertations Repository
Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93036
Full metadata record (DC field / value / language)
dc.contributor.advisor (zh_TW): 張明中
dc.contributor.advisor (en): Ming-Chung Chang
dc.contributor.author (zh_TW): 郭晉良
dc.contributor.author (en): Chin-Liang Kuo
dc.date.accessioned: 2024-07-12T16:23:12Z
dc.date.available: 2024-07-13
dc.date.copyright: 2024-07-12
dc.date.issued: 2024
dc.date.submitted: 2024-07-08
dc.identifier.citation:
T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu. An efficient k-means clustering algorithm: analysis and implementation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(7):881–892, 2002.
Omer Sagi and Lior Rokach. Ensemble learning: A survey. WIREs Data Mining and Knowledge Discovery, 8(4):e1249, 2018.
Oludare Isaac Abiodun, Aman Jantan, Abiodun Esther Omolara, Kemi Victoria Dada, Nachaat AbdElatif Mohamed, and Humaira Arshad. State-of-the-art in artificial neural network applications: A survey. Heliyon, 4(11):e00938, 2018.
Wolfgang Härdle. Applied nonparametric regression. Number 19. Cambridge University Press, 1990.
James N. Morgan and John A. Sonquist. Problems in the analysis of survey data, and a proposal. Journal of the American Statistical Association, 58(302):415–434, 1963.
Badr Hssina, Abdelkarim Merbouha, Hanane Ezzikouri, and Mohammed Erritali. A comparative study of decision tree ID3 and C4.5. International Journal of Advanced Computer Science and Applications (IJACSA), Special Issue on Advances in Vehicular Ad Hoc Networking and Applications 2014, 4(2), 2014.
L. Breiman, J. Friedman, C. J. Stone, and R. A. Olshen. Classification and Regression Trees. Taylor & Francis, 1984.
F. Esposito, D. Malerba, G. Semeraro, and J. Kay. A comparative analysis of methods for pruning decision trees. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(5):476–491, 1997.
V. Roshan Joseph and Simon Mak. Supervised compression of big data. Statistical Analysis and Data Mining: The ASA Data Science Journal, 14(3):217–229, 2021.
Partitioning Estimates. In: A Distribution-Free Theory of Nonparametric Regression, pages 52–69. Springer New York, New York, NY, 2002.
J. MacQueen. Some methods for classification and analysis of multivariate observations. Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, 1:281–297, 1967.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93036
dc.description.abstract (zh_TW): 本文研究了一種基於分割法的無母數迴歸技術,旨在提升迴歸問題的預測精度。本文首先介紹了使用機器學習處理分類及迴歸問題的基本概念,再針對處理迴歸問題的常見方法進行探討,最後聚焦在本文所使用的無母數迴歸方法。在本文的核心研究中,提出了一種新的演算法 PE-Kmeans,該演算法在非監督式學習中的 K-means 演算法的基礎上進行改進,形成二階段的分群方法。第一階段在輸出空間進行 K-means 分群,第二階段則在每個母群中進行輸入空間的再次 K-means 分群,這種方法充分考慮了輸出變量的信息,使得它成為一個監督式學習模型,可以用來處理迴歸問題。本文以 Supervised Compression 及著名的 Regression Tree 作為比較模型,前項方法通過選擇性以輸入空間或輸出空間作為分割中心點,逐步將輸入空間分割成不規則的 Voronoi region 子區域,後項方法則是透過二元分類將輸入空間分割為長方形。通過對模擬資料和真實世界資料的實驗,本文驗證了前述三種方法在不同情境下的性能,實驗結果表明,PE-Kmeans 在處理相對不平滑的函數及真實世界資料時,能夠更有效地進行預測。
dc.description.abstract (en): This study investigates a non-parametric regression technique based on partitioning, aimed at improving prediction accuracy in regression problems. The thesis first introduces the basic concepts of using machine learning for classification and regression, then reviews common methods for regression, and finally focuses on the non-parametric regression method employed in this study. At its core, a new algorithm called PE-Kmeans is proposed. The algorithm builds on the K-means algorithm from unsupervised learning to form a two-stage clustering method: in the first stage, K-means clustering is performed in the output space; in the second stage, K-means clustering is performed again in the input space within each parent cluster. Because this procedure incorporates information from the output variable, it becomes a supervised learning model suitable for regression problems. The study compares the proposed method with Supervised Compression and the well-known Regression Tree. The former selectively uses points in the input space or the output space as partition centers, gradually dividing the input space into irregular Voronoi sub-regions; the latter partitions the input space into rectangles through recursive binary splits. Through experiments on simulated and real-world data, the study evaluates the performance of the three methods under different scenarios. The results show that PE-Kmeans predicts more effectively on relatively non-smooth functions and on real-world data.
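The abstract above only describes PE-Kmeans in words, so the following Python sketch illustrates what such a two-stage partitioning estimate could look like. It is a minimal illustration rather than the thesis's actual implementation: the class name PEKMeans, the hyperparameters k1 and k2, the use of scikit-learn's KMeans, and the nearest-centre, cell-mean prediction rule are all assumptions made here for concreteness.

import numpy as np
from sklearn.cluster import KMeans

class PEKMeans:
    """Illustrative two-stage partitioning estimate (assumed details; see note above).

    Stage 1: K-means on the responses y, giving k1 parent clusters.
    Stage 2: K-means on the inputs X within each parent cluster.
    Prediction: assign x to the nearest final input-space centre and return
    the mean response of the training points in that cell.
    """

    def __init__(self, k1=4, k2=10, random_state=0):
        self.k1 = k1                      # output-space clusters (stage 1)
        self.k2 = k2                      # input-space clusters per parent (stage 2)
        self.random_state = random_state

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
        stage1 = KMeans(n_clusters=self.k1, n_init=10,
                        random_state=self.random_state).fit(y.reshape(-1, 1))
        centres, means = [], []
        for parent in range(self.k1):
            Xg, yg = X[stage1.labels_ == parent], y[stage1.labels_ == parent]
            k2 = min(self.k2, len(Xg))    # guard against small parent clusters
            stage2 = KMeans(n_clusters=k2, n_init=10,
                            random_state=self.random_state).fit(Xg)
            for sub in range(k2):
                centres.append(stage2.cluster_centers_[sub])
                means.append(yg[stage2.labels_ == sub].mean())  # cell mean of y
        self.centres_, self.means_ = np.vstack(centres), np.array(means)
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # distance from each query point to every final centre; take the nearest
        d = np.linalg.norm(X[:, None, :] - self.centres_[None, :, :], axis=2)
        return self.means_[d.argmin(axis=1)]

if __name__ == "__main__":
    # Toy check on the 2-D Michalewicz function, one of the simulation settings
    # listed in the table of contents below.
    rng = np.random.default_rng(0)
    X = rng.uniform(0.0, np.pi, size=(2000, 2))
    y = -(np.sin(X[:, 0]) * np.sin(1 * X[:, 0] ** 2 / np.pi) ** 20
          + np.sin(X[:, 1]) * np.sin(2 * X[:, 1] ** 2 / np.pi) ** 20)
    model = PEKMeans(k1=4, k2=10).fit(X[:1500], y[:1500])
    mse = np.mean((model.predict(X[1500:]) - y[1500:]) ** 2)
    print(f"held-out MSE: {mse:.4f}")

The cell-mean rule used here is the standard partitioning-estimate predictor; the thesis may define the cells, the number of clusters per stage, or the prediction rule differently, and a regression-tree baseline such as sklearn.tree.DecisionTreeRegressor could be fitted on the same split for comparison.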
dc.description.provenance (en): Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2024-07-12T16:23:12Z. No. of bitstreams: 0
dc.description.provenance (en): Made available in DSpace on 2024-07-12T16:23:12Z (GMT). No. of bitstreams: 0
dc.description.tableofcontents:
Verification Letter from the Oral Examination Committee   i
Acknowledgements   ii
摘要 (Chinese Abstract)   iii
Abstract   iv
Contents   v
List of Figures   viii
List of Tables   x
Denotation   xi
Chapter 1 Introduction   1
1.1 Supervised Learning   1
1.2 Common Methods for Regression Problems   2
1.2.1 Linear Models   2
1.2.2 Ensemble Learning   4
1.2.3 Neural Networks   4
1.3 Non-parametric Regression   5
1.4 Research Objectives   6
1.5 Thesis Organization   6
Chapter 2 Literature Review   7
2.1 Regression Tree   7
2.2 Supervised Compression   8
2.3 Partitioning Estimate   9
Chapter 3 Methodology   10
3.1 Models   10
3.1.1 Partitioning Estimate via Supervised Compression   10
3.1.2 Partitioning Estimate via PE-Kmeans   11
3.1.3 Model Hyperparameter Settings   12
3.2 Comparison Metrics   13
3.2.1 Prediction Error   13
3.2.2 Between-Cluster and Within-Cluster Residual Sum of Squares   13
3.2.3 Running Time   14
3.2.4 Visualization   14
Chapter 4 Simulated Data Analysis   15
4.1 Dataset Settings   15
4.2 Simulations   15
4.2.1 Two-Dimensional Michalewicz Function   15
4.2.1.1 Visualization   21
4.2.2 Dropwave Function   26
4.2.2.1 Visualization   31
4.2.3 OTL Circuit Function   36
4.2.4 Piston Function   41
4.2.5 Borehole Function   44
4.2.6 Summary of Function Simulation Experiments   46
Chapter 5 Real-World Data Analysis   48
5.1 Dataset Settings   49
5.2 Prediction Error   49
5.3 Within-Cluster and Between-Cluster Variation Analysis   50
5.3.1 Within-Cluster Variation   50
5.3.2 Between-Cluster Variation   50
5.4 Running Time   51
Chapter 6 Conclusions and Future Work   52
References   54
dc.language.iso: zh_TW
dc.subject (zh_TW): 機器學習
dc.subject (zh_TW): 無母數迴歸
dc.subject (zh_TW): 監督式學習
dc.subject (zh_TW): 分割法
dc.subject (zh_TW): 資料壓縮
dc.subject (zh_TW): 迴歸樹
dc.subject (zh_TW): 群集分析
dc.subject (en): Segmentation Method
dc.subject (en): Non-parametric Regression
dc.subject (en): Supervised Learning
dc.subject (en): Cluster Analysis
dc.subject (en): Regression Tree
dc.subject (en): Data Compression
dc.subject (en): Machine Learning
dc.title (zh_TW): 基於分割法的無母數迴歸
dc.title (en): Non-parametric Regression Using Partitioning Methods
dc.type: Thesis
dc.date.schoolyear: 112-2
dc.description.degree: 碩士 (Master's)
dc.contributor.coadvisor (zh_TW): 楊鈞澔
dc.contributor.coadvisor (en): Chun-Hao Yang
dc.contributor.oralexamcommittee (zh_TW): 紀建名;黃學涵
dc.contributor.oralexamcommittee (en): Chien-Ming Chi; Hsueh-Han Huang
dc.subject.keyword (zh_TW): 機器學習, 無母數迴歸, 監督式學習, 分割法, 資料壓縮, 迴歸樹, 群集分析
dc.subject.keyword (en): Machine Learning, Non-parametric Regression, Supervised Learning, Segmentation Method, Data Compression, Regression Tree, Cluster Analysis
dc.relation.page: 55
dc.identifier.doi: 10.6342/NTU202401551
dc.rights.note: Authorization granted (open access worldwide)
dc.date.accepted: 2024-07-09
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science)
dc.contributor.author-dept: 資料科學學位學程 (Data Science Degree Program)
Appears in Collections: 資料科學學位學程 (Data Science Degree Program)

Files in this item:
ntu-112-2.pdf (1.43 MB, Adobe PDF)


All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
