Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/19584

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 鄭士康 | |
| dc.contributor.author | Yen-Chieh Wang | en |
| dc.contributor.author | 王彥傑 | zh_TW |
| dc.date.accessioned | 2021-06-08T02:06:52Z | - |
| dc.date.copyright | 2016-03-08 | |
| dc.date.issued | 2016 | |
| dc.date.submitted | 2016-02-02 | |
| dc.identifier.citation | [1] Team predictions for the 2015 season, http://ppt.cc/SQaL [2] B. James, The Bill James Baseball Abstracts, 1977. [3] J. Albert, J. Bennett, Curve Ball: Baseball, Statistics, and the Role of Chance in the Game, Copernicus Books, 1st ed., 2001. [4] G. Chandler, G. Stevens, “An Exploratory Study of Minor League Baseball Statistics,” Journal of Quantitative Analysis in Sports, vol. 8, issue 4, 2012. [5] G. Gartheeban, J. Guttag, “A data-driven method for in-game decision making in MLB: when to pull a starting pitcher,” Knowledge Discovery and Data Mining, 2013, pp. 973-979. [6] T. W. Redelius, “Did the Best Team Win? Analysis of the 2010 Major League Baseball Postseason Using Monte Carlo Simulation,” Journal of Quantitative Analysis in Sports, vol. 8, issue 1, 2012. [7] R. A. Johnson, D. W. Wichern, Applied Multivariate Statistical Analysis, Pearson, 6th ed., 2007. [8] J. Ross Quinlan, C4.5: Programs For Machine Learning, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1st ed., 1993. [9] L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, Classification and Regression Trees, Chapman and Hall/CRC, 1st ed., 1984. [10] G. V. Kass, “An Exploratory Technique for Investigating Large Quantities of Categorical Data,” Journal of the Royal Statistical Society. Series C (Applied Statistics), vol. 29, no. 2, 1980, pp. 119-127. [11] V. Vapnik et al., “Support-vector networks,” Machine Learning, vol. 20, issue 3, 1995, pp. 273-297. [12] Baseball-Reference.com, http://www.baseball-reference.com/ [13] Factor analysis – MATLAB factoran, http://www.mathworks.com/help/stats/factoran.html [14] Recursive Partitioning and Regression Trees, https://stat.ethz.ch/R-manual/R-devel/library/rpart/html/rpart.html [15] C.-C. Chang, C.-J. Lin, “LIBSVM: A library for support vector machines,” ACM Transactions on Intelligent Systems and Technology (TIST), vol. 2, issue 3, 2011. | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/19584 | - |
| dc.description.abstract | 棒球界的最高殿堂--美國職棒大聯盟 (MLB) 聚集了全世界頂尖的棒球選手,一向是最受全世界的棒球迷矚目的焦點,全聯盟30支球隊都希望強化自己球隊的戰力,一求打進十月份的季後賽,甚至是拿下最後的世界大賽冠軍。然而每年能打進季後賽的球隊,在其團隊數據上有何種特質,一直都是球團、球迷們所關心的。
本論文先介紹基本的棒球數據以及MLB季後賽相關制度,接著以MLB啟用三分區制度的1995年起至2015年,這期間每支球隊例行賽的團隊各項總數據,以及各年度所有球隊進入季後賽與否,分別以因素分析 (Factor Analysis)、決策樹 (Decision Tree)、以及支持向量機 (Support Vector Machine),探究能進季後賽的球隊的在團隊數據表現有什麼特質是其他沒有打進季後賽球隊所沒有的;並由這三種方法所得出的結果來預測:新的球季開打後,有這些特質的球隊是否能打進該年度的季後賽。 | zh_TW |
| dc.description.abstract | Major League Baseball (MLB) brings together top baseball players from around the world and is the most popular professional baseball league, with fans worldwide. Every season, the 30 MLB teams strengthen their rosters to qualify for the postseason in October and, ultimately, to win the World Series. Fans and teams alike want to know which attributes carry a team into the postseason.
In this thesis, we first introduce basic baseball statistics and the history of the MLB postseason system. We then apply factor analysis, decision trees, and support vector machines to identify the attributes that characterize postseason teams. The analyses use each team's statistics from the 1995 to 2015 seasons, together with whether the team made a postseason appearance. The results show that the prediction accuracy of these methods reaches at least 70%. Fans can use the analysis in this thesis to predict which teams will make a postseason appearance in a new season. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-08T02:06:52Z (GMT). No. of bitstreams: 1 ntu-105-R99942124-1.pdf: 3600862 bytes, checksum: 06de5e9cce3c960b5764fa1828eec581 (MD5) Previous issue date: 2016 | en |
| dc.description.tableofcontents | Acknowledgements; Chinese Abstract; ABSTRACT; CONTENTS; LIST OF FIGURES; LIST OF TABLES; Chapter 1 Introduction; 1.1 Motivation; 1.2 Literature Survey; 1.3 Contribution; 1.4 Organization of Thesis; Chapter 2 Background Knowledge of Baseball Statistics; 2.1 Batting Statistics; 2.1.1 Batting Average; 2.1.2 On-Base Percentage; 2.1.3 Slugging Percentage; 2.2 Base-Stealing Statistics; 2.3 Pitching Statistics; 2.3.1 Earned Run Average; 2.3.2 Fielding Independent Pitching; 2.3.3 Walks plus Hits per Inning Pitched; 2.3.4 Strikeout-to-Walk Ratio; 2.4 Defense Statistics; 2.4.1 Putouts, Assists and Errors; 2.4.2 Fielding Percentage; 2.5 History of MLB Postseason System; 2.5.1 1903-1968: One Round; 2.5.2 1969-1993: Two Rounds; 2.5.3 1994-2011: Three Rounds; 2.5.4 2012-present: Wildcard Game; Chapter 3 Statistical Methods and Machine Learning of Classification; 3.1 Factor Analysis; 3.1.1 The Orthogonal Factor Model; 3.1.2 Methods of Estimation; 3.1.3 Factor Rotation; 3.2 Decision Tree; 3.2.1 Introduction; 3.2.2 Algorithms of Making the Rules; 3.3 Support Vector Machine (SVM); 3.3.1 Introduction; 3.3.2 The Primal Problem of SVM; Chapter 4 Prediction of Postseason Appearance; 4.1 Factor Analysis – Selection of the Attributes; 4.1.1 Factor Analysis on Basic Statistics; 4.1.2 Factor Analysis on Derived Statistics; 4.1.3 Selection of the Attributes; 4.2 Decision Tree – Postseason Teams’ Attributes; 4.2.1 Adjustments on the Statistics; 4.2.2 Decision Trees with Various Combination of Attributes; 4.3 Support Vector Machine – Prediction from the Previous Seasons; 4.3.1 Selection of Training Data and Testing Data; 4.3.2 Seasonal Prediction by the Previous Seasons; Chapter 5 Results and Discussions; 5.1 Accuracy of the Decision Tree; 5.2 Accuracy of the SVM; Chapter 6 Conclusions; References | |
| dc.language.iso | en | |
| dc.title | 以統計分析和機器學習預測美國職棒大聯盟季後賽資格 | zh_TW |
| dc.title | Prediction of Postseason Appearance in Major League Baseball by Statistical Analysis and Machine Learning | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 104-1 | |
| dc.description.degree | Master's | |
| dc.contributor.oralexamcommittee | 陳銘憲,盧俊成 | |
| dc.subject.keyword | 統計分析,機器學習,美國職棒大聯盟 | zh_TW |
| dc.subject.keyword | Statistical Analysis, Machine Learning, Major League Baseball | en |
| dc.relation.page | 36 | |
| dc.rights.note | Not authorized | |
| dc.date.accepted | 2016-02-02 | |
| dc.contributor.author-college | 電機資訊學院 | zh_TW |
| dc.contributor.author-dept | 電信工程學研究所 | zh_TW |
| Appears in Collections: | 電信工程學研究所 | |
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-105-1.pdf (Restricted Access) | 3.52 MB | Adobe PDF | |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
