Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/37692
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 林智仁(Chih-Jen Lin) | |
dc.contributor.author | Hsiang-Jui Wang | en |
dc.contributor.author | 王湘叡 | zh_TW |
dc.date.accessioned | 2021-06-13T15:38:56Z | - |
dc.date.available | 2008-07-21 | |
dc.date.copyright | 2008-07-21 | |
dc.date.issued | 2008 | |
dc.date.submitted | 2008-07-09 | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/37692 | - |
dc.description.abstract | 近年來,很多領域興起將序列的資料標上標籤。條件隨機場則是一種常用來解此類問題的方法,但其封閉形式的海森矩陣並不易導出。這困難致使一些使用二次微分資訊的最佳化方法不適用,如牛頓法。自動微分則是一種技巧,可以用來計算一個函數的導數值而無梯度函數。並且,藉由自動微分來計算海森矩陣與向量之乘積只需梯度函數而無需海森矩陣。本篇論文先說明自動微分的背景知識。然後結合截斷牛頓法及自動微分,並用之於解決條件隨機場。 | zh_TW |
dc.description.abstract | In recent years, the task of labeling sequential data has arisen in many fields. Conditional random fields are a popular model for this type of problem, but a closed form of their Hessian matrix is not easy to derive. As a result, optimization methods that use second-order information, such as Hessian-vector products, may not be suitable. Automatic differentiation is a technique for evaluating the derivatives of a function without requiring its gradient function. Moreover, computing Hessian-vector products by automatic differentiation requires only the gradient function, not the Hessian matrix. This thesis first reviews the background of automatic differentiation. It then combines truncated Newton methods with automatic differentiation and applies them to conditional random fields. | en |
dc.description.provenance | Made available in DSpace on 2021-06-13T15:38:56Z (GMT). No. of bitstreams: 1 ntu-97-R95922073-1.pdf: 967065 bytes, checksum: cd2e9623f85c3da2bc95cb647a169997 (MD5) Previous issue date: 2008 | en |
dc.description.tableofcontents | Oral Examination Committee Certification; Abstract (in Chinese); ABSTRACT; LIST OF FIGURES; LIST OF TABLES; CHAPTER I. Introduction; CHAPTER II. Automatic differentiation (2.1 The Fundamental of the Automatic Differentiation; 2.2 Implementation of Differentiation; 2.3 Analysis; 2.4 Evaluating Hessian-vector Products); CHAPTER III. Implementation of TRON with Automatic Differentiation (3.1 Implementation Details; 3.2 Real-World Data Sets: 3.2.1 a9a, 3.2.2 RCV1-V2, 3.2.3 news20, 3.2.4 real-sim; 3.3 Experiments); CHAPTER IV. Conditional Random Fields (4.1 Named-Entity Recognition; 4.2 Hidden Markov Models; 4.3 Conditional Random Fields; 4.4 Experiments: 4.4.1 Data Set, 4.4.2 Features, 4.4.3 Evaluation, 4.4.4 Experiment Settings, 4.4.5 Results); CHAPTER V. Conclusions; BIBLIOGRAPHY | |
dc.language.iso | en | |
dc.title | 應用自動微分及截斷牛頓法於條件隨機場 | zh_TW |
dc.title | Applying Automatic Differentiation and Truncated Newton Methods to Conditional Random Fields | en |
dc.type | Thesis | |
dc.date.schoolyear | 96-2 | |
dc.description.degree | Master's (碩士) | |
dc.contributor.oralexamcommittee | 李育杰(Yuh-Jye Lee),鮑興國(Hsing-Kuo Kenneth Pao) | |
dc.subject.keyword | 自動微分,共軛梯度法,截斷牛頓法,最大熵值法,條件隨機場, | zh_TW |
dc.subject.keyword | automatic differentiation, conjugate gradient methods, truncated Newton methods, maximum entropy, conditional random fields | en |
dc.relation.page | 42 | |
dc.rights.note | Paid licensing (有償授權) | |
dc.date.accepted | 2008-07-09 | |
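dc.contributor.author-college | College of Electrical Engineering and Computer Science (電機資訊學院) | zh_TW |
dc.contributor.author-dept | Graduate Institute of Computer Science and Information Engineering (資訊工程學研究所) | zh_TW |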
Appears in Collections: | Department of Computer Science and Information Engineering (資訊工程學系) |
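
The abstract's point that Hessian-vector products need only the gradient function can be illustrated with forward-mode automatic differentiation over dual numbers. The following is a minimal sketch under stated assumptions, not the thesis's implementation: the `Dual` class, the toy objective behind `grad_f`, and the helper `hessian_vector_product` are hypothetical names introduced here for illustration only.

```python
import math


class Dual:
    """Number val + eps*e with e**2 == 0; eps carries a directional derivative."""

    def __init__(self, val, eps=0.0):
        self.val = val
        self.eps = eps

    @staticmethod
    def lift(x):
        # Wrap plain numbers so arithmetic can mix floats and Dual values.
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.eps + other.eps)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule for the derivative part.
        return Dual(self.val * other.val,
                    self.val * other.eps + self.eps * other.val)

    __rmul__ = __mul__


def dexp(x):
    """exp that also propagates the derivative part."""
    x = Dual.lift(x)
    e = math.exp(x.val)
    return Dual(e, e * x.eps)


def grad_f(x):
    """Hand-written gradient of the toy objective f(x) = x0^2 * x1 + exp(x1)."""
    x0, x1 = x
    return [2 * x0 * x1, x0 * x0 + dexp(x1)]


def hessian_vector_product(grad, x, v):
    """Return H(x) v by differentiating the gradient code along direction v.

    Only the gradient function is evaluated; the Hessian is never formed.
    """
    seeded = [Dual(xi, vi) for xi, vi in zip(x, v)]
    return [Dual.lift(g).eps for g in grad(seeded)]


hv = hessian_vector_product(grad_f, [1.0, 0.0], [1.0, 1.0])
print(hv)  # [2.0, 3.0]
```

For this toy objective the Hessian at (1, 0) is [[0, 2], [2, 1]], so its product with v = (1, 1) is (2, 3), which matches the sketch. In a truncated Newton method, a routine of this kind would supply the products requested by the conjugate gradient inner loop.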
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-97-1.pdf (not currently authorized for public access) | 944.4 kB | Adobe PDF |
Items in the system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.