Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/19584
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 鄭士康
dc.contributor.author | Yen-Chieh Wang | en
dc.contributor.author | 王彥傑 | zh_TW
dc.date.accessioned | 2021-06-08T02:06:52Z | -
dc.date.copyright | 2016-03-08
dc.date.issued | 2016
dc.date.submitted | 2016-02-02
dc.identifier.citation | [1] Team predictions for the 2015 season, http://ppt.cc/SQaL
[2] B. James, The Bill James Baseball Abstracts, 1977.
[3] J. Albert, J. Bennett, Curve Ball: Baseball, Statistics, and the Role of Chance in the Game, Copernicus Books, 1st ed., 2001.
[4] G. Chandler, G. Stevens, “An Exploratory Study of Minor League Baseball Statistics,” Journal of Quantitative Analysis in Sports, vol. 8, no. 4, 2012.
[5] G. Gartheeban, J. Guttag, “A data-driven method for in-game decision making in MLB: when to pull a starting pitcher,” Knowledge Discovery and Data Mining, 2013, pp. 973-979.
[6] T. W. Rudelius, “Did the Best Team Win? Analysis of the 2010 Major League Baseball Postseason Using Monte Carlo Simulation,” Journal of Quantitative Analysis in Sports, vol. 8, no. 1, 2012.
[7] R. A. Johnson, D. W. Wichern, Applied Multivariate Statistical Analysis, Pearson, 6th ed., 2007.
[8] J. R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1st ed., 1993.
[9] L. Breiman, J. H. Friedman, R. A. Olshen, C. J. Stone, Classification and Regression Trees, Chapman and Hall/CRC, 1st ed., 1984.
[10] G. V. Kass, “An Exploratory Technique for Investigating Large Quantities of Categorical Data,” Journal of the Royal Statistical Society. Series C (Applied Statistics), vol. 29, no. 2, 1980, pp. 119-127.
[11] C. Cortes, V. Vapnik, “Support-vector networks,” Machine Learning, vol. 20, no. 3, 1995, pp. 273-297.
[12] Baseball-Reference.com, http://www.baseball-reference.com/
[13] Factor analysis – MATLAB factoran, http://www.mathworks.com/help/stats/factoran.html
[14] Recursive Partitioning and Regression Trees, https://stat.ethz.ch/R-manual/R-devel/library/rpart/html/rpart.html
[15] C.-C. Chang, C.-J. Lin, “LIBSVM: A library for support vector machines,” ACM Transactions on Intelligent Systems and Technology (TIST), vol. 2, no. 3, 2011.
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/19584 | -
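dc.description.abstract | Major League Baseball (MLB), the highest level of professional baseball, brings together the world's top players and has long been the focus of baseball fans worldwide. All 30 teams in the league seek to strengthen their rosters in order to reach the postseason in October, and ultimately to win the World Series. What characteristics the team statistics of postseason teams share each year has long been of interest to clubs and fans alike.
This thesis first introduces basic baseball statistics and the MLB postseason system. It then takes each team's regular-season team totals from 1995, when MLB began using the three-division format, through 2015, together with whether each team reached the postseason in each year, and applies factor analysis, decision trees, and support vector machines (SVM) to examine which characteristics in team statistics distinguish postseason teams from teams that missed the postseason. The results of the three methods are then used to predict whether teams with these characteristics will reach the postseason once a new season begins. | zh_TW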
dc.description.abstract | Major League Baseball (MLB) gathers the top baseball players from around the world. It is the most popular professional baseball league, with fans worldwide. Every season, the 30 MLB teams strengthen their rosters in order to qualify for the postseason in October, and all of them hope to win the World Series. Baseball fans and teams would like to know what attributes take a team to the postseason.
In this thesis, we first introduce baseball statistics and the history of the MLB postseason system. We then apply factor analysis, decision trees, and support vector machines to identify the attributes that postseason teams share. The analyses use each team's statistics from the 1995 to 2015 seasons, together with whether the team made a postseason appearance. The results show that the prediction accuracy of these methods reaches at least 70%. Fans can use the analysis in this thesis to predict which teams will make a postseason appearance in a new baseball season. | en
dc.description.provenance | Made available in DSpace on 2021-06-08T02:06:52Z (GMT). No. of bitstreams: 1
ntu-105-R99942124-1.pdf: 3600862 bytes, checksum: 06de5e9cce3c960b5764fa1828eec581 (MD5)
Previous issue date: 2016 | en
dc.description.tableofcontents | Acknowledgements i
Chinese Abstract ii
ABSTRACT iii
CONTENTS iv
LIST OF FIGURES vi
LIST OF TABLES vii
Chapter 1 Introduction 1
1.1 Motivation 1
1.2 Literature Survey 2
1.3 Contribution 2
1.4 Organization of Thesis 3
Chapter 2 Background Knowledge of Baseball Statistics 4
2.1 Batting Statistics 4
2.1.1 Batting Average 4
2.1.2 On-Base Percentage 4
2.1.3 Slugging Percentage 5
2.2 Base-Stealing Statistics 5
2.3 Pitching Statistics 6
2.3.1 Earned Run Average 6
2.3.2 Fielding Independent Pitching 6
2.3.3 Walks plus Hits per Inning Pitched 7
2.3.4 Strikeout-to-Walk Ratio 7
2.4 Defense Statistics 7
2.4.1 Putouts, Assists and Errors 7
2.4.2 Fielding Percentage 8
2.5 History of MLB Postseason System 8
2.5.1 1903-1968: One Round 8
2.5.2 1969-1993: Two Rounds 9
2.5.3 1994-2011: Three Rounds 9
2.5.4 2012-present: Wildcard Game 9
Chapter 3 Statistical Methods and Machine Learning of Classification 11
3.1 Factor Analysis 11
3.1.1 The Orthogonal Factor Model 11
3.1.2 Methods of Estimation 13
3.1.3 Factor Rotation 14
3.2 Decision Tree 14
3.2.1 Introduction 14
3.2.2 Algorithms of Making the Rules 15
3.3 Support Vector Machine (SVM) 16
3.3.1 Introduction 16
3.3.2 The Primal Problem of SVM 17
Chapter 4 Prediction of Postseason Appearance 19
4.1 Factor Analysis – Selection of the Attributes 19
4.1.1 Factor Analysis on Basic Statistics 20
4.1.2 Factor Analysis on Derived Statistics 22
4.1.3 Selection of the Attributes 23
4.2 Decision Tree – Postseason Teams’ Attributes 23
4.2.1 Adjustments on the Statistics 23
4.2.2 Decision Trees with Various Combination of Attributes 24
4.3 Support Vector Machine – Prediction from the Previous Seasons 29
4.3.1 Selection of Training Data and Testing Data 29
4.3.2 Seasonal Prediction by the Previous Seasons 30
Chapter 5 Results and Discussions 31
5.1 Accuracy of the Decision Tree 31
5.2 Accuracy of the SVM 32
Chapter 6 Conclusions 34
References 35
dc.language.iso | en
dc.title | 以統計分析和機器學習預測美國職棒大聯盟季後賽資格 | zh_TW
dc.title | Prediction of Postseason Appearance in Major League Baseball by Statistical Analysis and Machine Learning | en
dc.type | Thesis
dc.date.schoolyear | 104-1
dc.description.degree | Master (碩士)
dc.contributor.oralexamcommittee | 陳銘憲, 盧俊成
dc.subject.keyword | 統計分析, 機器學習, 美國職棒大聯盟 | zh_TW
dc.subject.keyword | Statistical Analysis, Machine Learning, Major League Baseball | en
dc.relation.page | 36
dc.rights.note | Not authorized (未授權)
dc.date.accepted | 2016-02-02
dc.contributor.author-college | 電機資訊學院 (College of Electrical Engineering and Computer Science) | zh_TW
dc.contributor.author-dept | 電信工程學研究所 (Graduate Institute of Communication Engineering) | zh_TW
Appears in Collections: 電信工程學研究所 (Graduate Institute of Communication Engineering)

Files in This Item:
File | Size | Format
ntu-105-1.pdf (Restricted Access) | 3.52 MB | Adobe PDF
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.