Skip navigation

DSpace

機構典藏 DSpace 系統致力於保存各式數位資料(如:文字、圖片、PDF)並使其易於取用。

點此認識 DSpace
DSpace logo
English
中文
  • 瀏覽論文
    • 校院系所
    • 出版年
    • 作者
    • 標題
    • 關鍵字
    • 指導教授
  • 搜尋 TDR
  • 授權 Q&A
    • 我的頁面
    • 接受 E-mail 通知
    • 編輯個人資料
  1. NTU Theses and Dissertations Repository
  2. 電機資訊學院
  3. 資訊工程學系
請用此 Handle URI 來引用此文件: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/28334
完整後設資料紀錄
DC 欄位值語言
dc.contributor.advisor劉長遠(Chen-Yuan Liou)
dc.contributor.authorShing-Bing Wuen
dc.contributor.author吳仕斌zh_TW
dc.date.accessioned2021-06-13T00:05:26Z-
dc.date.available2009-07-31
dc.date.copyright2007-07-31
dc.date.issued2007
dc.date.submitted2007-07-30
dc.identifier.citation[1] A. G. Barto, R. S., and C. W. Anderson. Neuronlike adaptive elementsthat can solve di¢ cult learning control problems. IEEE Transactionson Systems, Man, and Cybernetics, SMC-13(5):834–846, 2002.
[2] R. Bellman. Dynamic Programming. Princeton. Princeton University Press, 1957. NJ.
[3] D. Bergmark. Collection synthesis. In JCDL ’02: Proceedings of the 2nd ACM/IEEE-CS joint conference on Digital libraries, pages 253–262, New York, NY, USA, 2002. ACM Press.
[4] D. P. Bertsekas. Dynamic Programming: Deterministic and Stochastic Models. Englewood Cli¤s. Prentice-Hall, 1987. NJ.
[5] J. A. Boyan. Least-squares temporal-di¤erence. Maching Learning, 49(2-3):233–246, 2002.
[6] L. Breiman. Bagging predictors. Machine Learning, 24(2):123–140, 1996.
[7] L. Breiman, J. H. Friedman, R. A. Olsen, and C. J. Stone. Classi…cation And Regression Trees. Wadsworth International (California), 1984.
[8] Sergey Brin and Lawrence Page. The anatomy of a large-scale hyper-textual web search engine. In WWW7: Proceedings of the seventh inter-national conference on World Wide Web 7, pages 107–117, Amsterdam, The Netherlands, The Netherlands, 1998. Elsevier Science Publishers B. V.
[9] S. Chakrabarti, Martin van den Berg, and B. Dom. Focused crawling:a new approach to topic-speci…c resource discovery. Submitted to the World Wide Web Conference, jan 1999.
[10] A. Y. Ng D. Bagnell, S. Kakade and J. Schneider. Policy search by dynamic programming. In Neural Information Processing Systems, 2003.
[11] Dmitry Davidov and Shaul Markovitch. Multiple-goal heuristic search. Journal of Arti…cial Intelligence Research, 26:417–451, 2006.
[12] J. Dean and M. R. Henzinger. Finding related pages in the world wide web. In WWW ’99: Proceeding of the eighth international conference on World Wide Web, pages 1467–1479, New York, NY, USA, 1999. Elsevier North-Holland, Inc.
[13] M. Diligenti, F. Coetzee, S. Lawrence, C. L. Giles, and M. Gori. Focused crawling using context graphs. In VLDB ’00: Proceedings of the 26th International Conference on Very Large Data Bases, pages 527–534, San Francisco, CA, USA, 2000. Morgan Kaufmann Publishers Inc.
[14] D. Ernst, P. Geurts, , and L. Wehenkel. Iteratively extending time hori-
zon reinforcement learning. In N. Lavra, L. Gamberger, and L. Todor-ovski, editors, Proceedings of the 14th European Conference on Machine Learning, Dubrovnik,Croatia, September 2003. Springer-Verlag Heidel-berg.
[15] D. Ernst, P. Geurts, and L. Wehenkel. Tree-based batch mode reinforcement learning. Jouranl of Machine Learning Research, 6:503–556, 2005.
[16] P. Geurts, D. Ernst, and L Wehenkel. Extremely randomized trees. Machine Learning, 63(1):3–42, 2006.
[17] A. M. Grigoriadis and G. Paliouras. Focused crawling using temporaldifference-learning. In SETN, pages 142–153, 2004.
[18] L. P. Kaelbling, M. L. Littman, and A. W. Moore. Reinforcement learning: A survey. Journal of Arti…cial Intelligence Research, 4:237–285,1996.
[19] M. Kitsuregawa, M. Toyoda, and I. Pramudiono. Web community mining and web log mining: Commodity cluster based execution. In Australasian Database Conference, 2002.
[20] J. M. Kleinberg. Authoritative sources in a hyperlinked environment. J. ACM, 46(5):604–632, 1999.
[21] Jon M. Kleinberg. Authoritative sources in a hyperlinked environment. J. ACM, 46(5):604–632, 1999.
[22] M. G. Lagoudakis and R. Parr. Least-squares policy iteration. Jouranl of Machine Learning Research, 4:1107–1149, 2003.
[23] D. Ormoneit and S. Sen. Kernel-based reinforcement learning. Machine Learning, 49(2-3):161–178, 2002.
[24] M. Puterman. Markov decision processes : Discrete stochastic dynamic programming. John Wiley & Sons, New York, 1994.
[25] J. Qin and H. Chen. Using genetic algorithm in building domain-specific collections: An experiment in the nanotechnology domain. In HICSS’05: Proceedings of the Proceedings of the 38th Annual Hawaii International Conference on System Sciences (HICSS’05) - Track 4, page 102.2, Washington, DC, USA, 2005. IEEE Computer Society.
[26] Jason D. M. Rennie and A. McCallum. Using reinforcement learning to spider the web efficiently. In ICML ’99: Proceedings of the Six-teenth International Conference on Machine Learning, pages 335-343, San Francisco, CA, USA, 1999. Morgan Kaufmann Publishers Inc.
[27] R. S. Sutton and A. G. Barto. Reinforcement learning, an Introduction. MIT Press, 1998.
[28] C. J. C. H. Watkins. Learning from delayed rewards, 1989.
[29] C. J. C. H. Watkins and P. Dayan. Q-learning. Machine Learning, 8(3):279–292, 1992.
[30] R. J.Williams and L. C. Baird III. Analysis of some incremental variants of policy iteraion: First steps toward understanding actor-critic learning systme. College of computer science, Tech. rep. NU-CCS-93-11, North-eastern University, Boston, 1993. MA.
[31] R. J.Williams and L. C. Baird III. Tight performance bounds on greedy policies based on imperfect value functions. College of computer science, Tech. rep. NU-CCS-93-14, Northeastern University, Boston, 1993. MA.
[32] Jaeyoung Yang, Jinbeom Kang, and Joongmin Choi. A focused crawler with document segmentation. In IDEAL, pages 94–101, 2005.
dc.identifier.urihttp://tdr.lib.ntu.edu.tw/jspui/handle/123456789/28334-
dc.description.abstract如何從環境獲得資訊一直是增強式學習的主要議題。在過去Q-learning以表格來示Q-函式 (Q-函式定義了狀態-行為的空間,並用來找到最佳的策). 但是當狀態和行為的空間是連續的或數量很大時,Q-函式即無法再以表格來表示。在樹狀批次模式的增強式學習,我們以四-元素組合所建立的整體樹來表示Q-函式。在過去的一些相關研究,解決以四-元素組合近似Q-函式的supervised learning問題。我們將研究以extremely randomized tree 演算法來建立focused crawler.
增強式學習有許多相關的應用,例如最佳化控制、game playing和web crawler。 Web crawler是一種瀏覽網際網路的程式,主要負責複製網頁,可提供即時的資料。因為有相當大數量的頁面需要被下載、備份和處理,我們需要良好的策略來完成。 聚焦爬蟲主要負責搜尋一特定主題的頁面。在我們的研究,以 樹狀批次模式增強式學習 來建立聚焦爬蟲,並且也和其它的方法進行效能的比較,並試著找到其優、缺點。
本論文取材自國科會計劃NSC 94-2213-E-002-105,題目與解答為指導教授所授,程式製作及網站設計為學生完成。
zh_TW
dc.description.abstractThe main issue of reinforcement learning is to find an optimal control policy by getting information from environment. In the past, Q-learning used tabular form to represent the Q-function (Q-function is defined on the state-action space and can be derived to find the optimal policy). When the state or action space is continuous or very large, the Q-function can not be represented in tabular form. In tree-based batch mode reinforcement learning, we can approximate the Q-function based on the ensemble trees which are constructed by a set of four-tuples (xt, ut, rt, xt+1) where xt is the current state, ut is the next state, rt is the instantaneous reward and xt+1 is the next state. In the past, we could find some related work that try to approximate the Q-function from a set of four-tuples by solving the supervised learning problem. We study how to use the extremely randomized tree algorithm and reinforcement learning to build the focused crawler.
There are many application of reinforcement learning, including some optimal control topic, game playing and web crawler (also know as Web crawler or Web robot) crawler is a program which browses the World Wide Web. The main work of the Web crawler is crating a copy of the visited page which will be processed by search engine. Because there are a huge number of pages needs to choose and download in a given time, we must have a policy that states which page we need. Focused crawler aims to search the relevant document on a given specific topic. In this paper, we constructed and built focused crawler by tree-based batch mode reinforcement learning and we also compared the performance with other methods. We aim to practice tree-based batch mode reinforcement and find its strength and shortcoming from experiment. We used the dataset of World Wide knowledge-based project (WebKB project) and also provide the analysis of the experiment result.
This work was partially supported by National Science Council, ROC under contract number NSC 94-2213-E-002-105
Keywords: tree-based batch mode reinforcement learning, ensemble tree algorithm, supervised learning, Q-learning, optimal control, Web crawler, focused crawler.
en
dc.description.provenanceMade available in DSpace on 2021-06-13T00:05:26Z (GMT). No. of bitstreams: 1
ntu-96-R94922100-1.pdf: 956899 bytes, checksum: 547945446350e3306ecc6db77c5db6f6 (MD5)
Previous issue date: 2007
en
dc.description.tableofcontents口試委員會審定書…………………………………………………… i
前言…………………………………………………………………… ii
中文摘要……………………………………………………………… iv
英文摘要……………………………………………………………… v
1. Introduction………………………………………………………. 1
1.1 Related Work ………………………………………………... 5
2. Reinforcement Learning ………………………………………… 10
2.1 Learning Model …………………………………………….. 10
2.2 Related Work ……………………………………………….. 15
2.3 Supervised Learning Based Algorithm …………………… 18
2.4 Fitted Q Iteration Algorithm ……………………………….. 21
3. Extra-Trees Algorithm ………………………………………….. 26
4. Experimental Setup ……………………………………………... 30
5. Results …………………………………………………………... 35
6. Conclusions and Future Work …………………………………. 45
7. Appendix A ……………………………………………………… 52
8. Appendix B ……………………………………………………… 52
dc.language.isoen
dc.subjectensemble treezh_TW
dc.subject增強式學習zh_TW
dc.subject樹狀批次模式增強式學習zh_TW
dc.subjectsupervised learningzh_TW
dc.subjectQ-learningzh_TW
dc.subjectweb crawlerzh_TW
dc.subject聚焦爬蟲zh_TW
dc.subjectfocused crawleren
dc.subjectensemble tree algorithmen
dc.subjecttree-based batch mode reinforcement learningen
dc.subjectsupervised learningen
dc.subjectQ-learningen
dc.subjectoptimal controlen
dc.subjectWeb crawleren
dc.title以樹狀批次模式的增強式學習建立聚焦爬蟲zh_TW
dc.titleBuilding Focused Crawler by Using the Tree-based Batch Mode Reinforcement Learningen
dc.typeThesis
dc.date.schoolyear95-2
dc.description.degree碩士
dc.contributor.oralexamcommittee吳建銘(Jiann-Ming Wu),王榮華(Jung-Hua Wang),徐宏民(Winston H. Hsu)
dc.subject.keyword增強式學習,樹狀批次模式增強式學習,ensemble tree,supervised learning,Q-learning,web crawler,聚焦爬蟲,zh_TW
dc.subject.keywordtree-based batch mode reinforcement learning,ensemble tree algorithm,supervised learning,Q-learning,optimal control,Web crawler,focused crawler,en
dc.relation.page54
dc.rights.note有償授權
dc.date.accepted2007-07-30
dc.contributor.author-college電機資訊學院zh_TW
dc.contributor.author-dept資訊工程學研究所zh_TW
顯示於系所單位:資訊工程學系

文件中的檔案:
檔案 大小格式 
ntu-96-1.pdf
  未授權公開取用
934.47 kBAdobe PDF
顯示文件簡單紀錄


系統中的文件,除了特別指名其著作權條款之外,均受到著作權保護,並且保留所有的權利。

社群連結
聯絡資訊
10617臺北市大安區羅斯福路四段1號
No.1 Sec.4, Roosevelt Rd., Taipei, Taiwan, R.O.C. 106
Tel: (02)33662353
Email: ntuetds@ntu.edu.tw
意見箱
相關連結
館藏目錄
國內圖書館整合查詢 MetaCat
臺大學術典藏 NTU Scholars
臺大圖書館數位典藏館
本站聲明
© NTU Library All Rights Reserved