以樹狀批次模式的增強式學習建立聚焦爬蟲

Shing-Bing Wu; 吳仕斌

請用此 Handle URI 來引用此文件： http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/28334

完整後設資料紀錄

DC 欄位	值	語言
dc.contributor.advisor	劉長遠(Chen-Yuan Liou)
dc.contributor.author	Shing-Bing Wu	en
dc.contributor.author	吳仕斌	zh_TW
dc.date.accessioned	2021-06-13T00:05:26Z	-
dc.date.available	2009-07-31
dc.date.copyright	2007-07-31
dc.date.issued	2007
dc.date.submitted	2007-07-30
dc.identifier.citation	[1] A. G. Barto, R. S., and C. W. Anderson. Neuronlike adaptive elementsthat can solve di¢ cult learning control problems. IEEE Transactionson Systems, Man, and Cybernetics, SMC-13(5):834–846, 2002. [2] R. Bellman. Dynamic Programming. Princeton. Princeton University Press, 1957. NJ. [3] D. Bergmark. Collection synthesis. In JCDL ’02: Proceedings of the 2nd ACM/IEEE-CS joint conference on Digital libraries, pages 253–262, New York, NY, USA, 2002. ACM Press. [4] D. P. Bertsekas. Dynamic Programming: Deterministic and Stochastic Models. Englewood Cli¤s. Prentice-Hall, 1987. NJ. [5] J. A. Boyan. Least-squares temporal-di¤erence. Maching Learning, 49(2-3):233–246, 2002. [6] L. Breiman. Bagging predictors. Machine Learning, 24(2):123–140, 1996. [7] L. Breiman, J. H. Friedman, R. A. Olsen, and C. J. Stone. Classi…cation And Regression Trees. Wadsworth International (California), 1984. [8] Sergey Brin and Lawrence Page. The anatomy of a large-scale hyper-textual web search engine. In WWW7: Proceedings of the seventh inter-national conference on World Wide Web 7, pages 107–117, Amsterdam, The Netherlands, The Netherlands, 1998. Elsevier Science Publishers B. V. [9] S. Chakrabarti, Martin van den Berg, and B. Dom. Focused crawling:a new approach to topic-speci…c resource discovery. Submitted to the World Wide Web Conference, jan 1999. [10] A. Y. Ng D. Bagnell, S. Kakade and J. Schneider. Policy search by dynamic programming. In Neural Information Processing Systems, 2003. [11] Dmitry Davidov and Shaul Markovitch. Multiple-goal heuristic search. Journal of Arti…cial Intelligence Research, 26:417–451, 2006. [12] J. Dean and M. R. Henzinger. Finding related pages in the world wide web. In WWW ’99: Proceeding of the eighth international conference on World Wide Web, pages 1467–1479, New York, NY, USA, 1999. Elsevier North-Holland, Inc. [13] M. Diligenti, F. Coetzee, S. Lawrence, C. L. Giles, and M. Gori. Focused crawling using context graphs. In VLDB ’00: Proceedings of the 26th International Conference on Very Large Data Bases, pages 527–534, San Francisco, CA, USA, 2000. Morgan Kaufmann Publishers Inc. [14] D. Ernst, P. Geurts, , and L. Wehenkel. Iteratively extending time hori- zon reinforcement learning. In N. Lavra, L. Gamberger, and L. Todor-ovski, editors, Proceedings of the 14th European Conference on Machine Learning, Dubrovnik,Croatia, September 2003. Springer-Verlag Heidel-berg. [15] D. Ernst, P. Geurts, and L. Wehenkel. Tree-based batch mode reinforcement learning. Jouranl of Machine Learning Research, 6:503–556, 2005. [16] P. Geurts, D. Ernst, and L Wehenkel. Extremely randomized trees. Machine Learning, 63(1):3–42, 2006. [17] A. M. Grigoriadis and G. Paliouras. Focused crawling using temporaldifference-learning. In SETN, pages 142–153, 2004. [18] L. P. Kaelbling, M. L. Littman, and A. W. Moore. Reinforcement learning: A survey. Journal of Arti…cial Intelligence Research, 4:237–285,1996. [19] M. Kitsuregawa, M. Toyoda, and I. Pramudiono. Web community mining and web log mining: Commodity cluster based execution. In Australasian Database Conference, 2002. [20] J. M. Kleinberg. Authoritative sources in a hyperlinked environment. J. ACM, 46(5):604–632, 1999. [21] Jon M. Kleinberg. Authoritative sources in a hyperlinked environment. J. ACM, 46(5):604–632, 1999. [22] M. G. Lagoudakis and R. Parr. Least-squares policy iteration. Jouranl of Machine Learning Research, 4:1107–1149, 2003. [23] D. Ormoneit and S. Sen. Kernel-based reinforcement learning. Machine Learning, 49(2-3):161–178, 2002. [24] M. Puterman. Markov decision processes : Discrete stochastic dynamic programming. John Wiley & Sons, New York, 1994. [25] J. Qin and H. Chen. Using genetic algorithm in building domain-specific collections: An experiment in the nanotechnology domain. In HICSS’05: Proceedings of the Proceedings of the 38th Annual Hawaii International Conference on System Sciences (HICSS’05) - Track 4, page 102.2, Washington, DC, USA, 2005. IEEE Computer Society. [26] Jason D. M. Rennie and A. McCallum. Using reinforcement learning to spider the web efficiently. In ICML ’99: Proceedings of the Six-teenth International Conference on Machine Learning, pages 335-343, San Francisco, CA, USA, 1999. Morgan Kaufmann Publishers Inc. [27] R. S. Sutton and A. G. Barto. Reinforcement learning, an Introduction. MIT Press, 1998. [28] C. J. C. H. Watkins. Learning from delayed rewards, 1989. [29] C. J. C. H. Watkins and P. Dayan. Q-learning. Machine Learning, 8(3):279–292, 1992. [30] R. J.Williams and L. C. Baird III. Analysis of some incremental variants of policy iteraion: First steps toward understanding actor-critic learning systme. College of computer science, Tech. rep. NU-CCS-93-11, North-eastern University, Boston, 1993. MA. [31] R. J.Williams and L. C. Baird III. Tight performance bounds on greedy policies based on imperfect value functions. College of computer science, Tech. rep. NU-CCS-93-14, Northeastern University, Boston, 1993. MA. [32] Jaeyoung Yang, Jinbeom Kang, and Joongmin Choi. A focused crawler with document segmentation. In IDEAL, pages 94–101, 2005.
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/28334	-
dc.description.abstract	如何從環境獲得資訊一直是增強式學習的主要議題。在過去Q-learning以表格來示Q-函式 (Q-函式定義了狀態-行為的空間，並用來找到最佳的策). 但是當狀態和行為的空間是連續的或數量很大時，Q-函式即無法再以表格來表示。在樹狀批次模式的增強式學習，我們以四-元素組合所建立的整體樹來表示Q-函式。在過去的一些相關研究，解決以四-元素組合近似Q-函式的supervised learning問題。我們將研究以extremely randomized tree 演算法來建立focused crawler. 增強式學習有許多相關的應用，例如最佳化控制、game playing和web crawler。 Web crawler是一種瀏覽網際網路的程式，主要負責複製網頁，可提供即時的資料。因為有相當大數量的頁面需要被下載、備份和處理，我們需要良好的策略來完成。聚焦爬蟲主要負責搜尋一特定主題的頁面。在我們的研究，以樹狀批次模式增強式學習來建立聚焦爬蟲，並且也和其它的方法進行效能的比較，並試著找到其優、缺點。本論文取材自國科會計劃NSC 94-2213-E-002-105，題目與解答為指導教授所授，程式製作及網站設計為學生完成。	zh_TW
dc.description.abstract	The main issue of reinforcement learning is to find an optimal control policy by getting information from environment. In the past, Q-learning used tabular form to represent the Q-function (Q-function is defined on the state-action space and can be derived to find the optimal policy). When the state or action space is continuous or very large, the Q-function can not be represented in tabular form. In tree-based batch mode reinforcement learning, we can approximate the Q-function based on the ensemble trees which are constructed by a set of four-tuples (xt, ut, rt, xt+1) where xt is the current state, ut is the next state, rt is the instantaneous reward and xt+1 is the next state. In the past, we could find some related work that try to approximate the Q-function from a set of four-tuples by solving the supervised learning problem. We study how to use the extremely randomized tree algorithm and reinforcement learning to build the focused crawler. There are many application of reinforcement learning, including some optimal control topic, game playing and web crawler (also know as Web crawler or Web robot) crawler is a program which browses the World Wide Web. The main work of the Web crawler is crating a copy of the visited page which will be processed by search engine. Because there are a huge number of pages needs to choose and download in a given time, we must have a policy that states which page we need. Focused crawler aims to search the relevant document on a given specific topic. In this paper, we constructed and built focused crawler by tree-based batch mode reinforcement learning and we also compared the performance with other methods. We aim to practice tree-based batch mode reinforcement and find its strength and shortcoming from experiment. We used the dataset of World Wide knowledge-based project (WebKB project) and also provide the analysis of the experiment result. This work was partially supported by National Science Council, ROC under contract number NSC 94-2213-E-002-105 Keywords: tree-based batch mode reinforcement learning, ensemble tree algorithm, supervised learning, Q-learning, optimal control, Web crawler, focused crawler.	en
dc.description.provenance	Made available in DSpace on 2021-06-13T00:05:26Z (GMT). No. of bitstreams: 1 ntu-96-R94922100-1.pdf: 956899 bytes, checksum: 547945446350e3306ecc6db77c5db6f6 (MD5) Previous issue date: 2007	en
dc.description.tableofcontents	口試委員會審定書…………………………………………………… i 前言…………………………………………………………………… ii 中文摘要……………………………………………………………… iv 英文摘要……………………………………………………………… v 1. Introduction………………………………………………………. 1 1.1 Related Work ………………………………………………... 5 2. Reinforcement Learning ………………………………………… 10 2.1 Learning Model …………………………………………….. 10 2.2 Related Work ……………………………………………….. 15 2.3 Supervised Learning Based Algorithm …………………… 18 2.4 Fitted Q Iteration Algorithm ……………………………….. 21 3. Extra-Trees Algorithm ………………………………………….. 26 4. Experimental Setup ……………………………………………... 30 5. Results …………………………………………………………... 35 6. Conclusions and Future Work …………………………………. 45 7. Appendix A ……………………………………………………… 52 8. Appendix B ……………………………………………………… 52
dc.language.iso	en
dc.subject	ensemble tree	zh_TW
dc.subject	增強式學習	zh_TW
dc.subject	樹狀批次模式增強式學習	zh_TW
dc.subject	supervised learning	zh_TW
dc.subject	Q-learning	zh_TW
dc.subject	web crawler	zh_TW
dc.subject	聚焦爬蟲	zh_TW
dc.subject	focused crawler	en
dc.subject	ensemble tree algorithm	en
dc.subject	tree-based batch mode reinforcement learning	en
dc.subject	supervised learning	en
dc.subject	Q-learning	en
dc.subject	optimal control	en
dc.subject	Web crawler	en
dc.title	以樹狀批次模式的增強式學習建立聚焦爬蟲	zh_TW
dc.title	Building Focused Crawler by Using the Tree-based Batch Mode Reinforcement Learning	en
dc.type	Thesis
dc.date.schoolyear	95-2
dc.description.degree	碩士
dc.contributor.oralexamcommittee	吳建銘(Jiann-Ming Wu),王榮華(Jung-Hua Wang),徐宏民(Winston H. Hsu)
dc.subject.keyword	增強式學習,樹狀批次模式增強式學習,ensemble tree,supervised learning,Q-learning,web crawler,聚焦爬蟲,	zh_TW
dc.subject.keyword	tree-based batch mode reinforcement learning,ensemble tree algorithm,supervised learning,Q-learning,optimal control,Web crawler,focused crawler,	en
dc.relation.page	54
dc.rights.note	有償授權
dc.date.accepted	2007-07-30
dc.contributor.author-college	電機資訊學院	zh_TW
dc.contributor.author-dept	資訊工程學研究所	zh_TW
顯示於系所單位：	資訊工程學系

文件中的檔案：

檔案	大小	格式
ntu-96-1.pdf 未授權公開取用	934.47 kB	Adobe PDF

顯示文件簡單紀錄

系統中的文件，除了特別指名其著作權條款之外，均受到著作權保護，並且保留所有的權利。