Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/7087
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Yun-Nung Chen (陳縕儂) | |
dc.contributor.author | Ting-Rui Chiang | en |
dc.contributor.author | 江廷睿 | zh_TW |
dc.date.accessioned | 2021-05-17T15:59:31Z | - |
dc.date.available | 2020-03-11 | |
dc.date.available | 2021-05-17T15:59:31Z | - |
dc.date.copyright | 2020-03-11 | |
dc.date.issued | 2020 | |
dc.date.submitted | 2020-03-04 | |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/7087 | - |
dc.description.abstract | This thesis investigates the natural language understanding capability of current deep learning techniques. Two kinds of understanding are studied: first, a model's ability to understand and reason over mathematical problems; second, its ability to understand conversations. Humans possess both abilities, yet neither has been thoroughly investigated with deep learning in the natural language processing community.
To investigate models' ability to understand and reason over mathematical problems, the first part of this thesis focuses on the task of solving math word problems. Inspired by how humans solve such problems, this thesis designs a framework that lets the machine generate equations according to the semantics of symbols. The results confirm that exploiting symbol semantics indeed strengthens the machine's reasoning ability. The second part investigates how well previously proposed conversational question answering models understand conversations, focusing on two well-known datasets, QuAC and CoQA. This part proposes a series of experiments that examine models' understanding ability and uncovers several potential problems; hopefully these contributions will benefit future research in this direction. Through these two lines of study, this thesis scrutinizes the deficiencies of existing models' language understanding ability. For math word problem solving, more rigorous ways of validating models' generalization ability are still needed; for conversation understanding, how to design a model that truly understands conversations remains an open problem. Both are directions that future research can continue to explore. | zh_TW |
dc.description.abstract | This work investigates the natural language understanding capability of current deep learning models. Two understanding capabilities are examined: first, language understanding and arithmetic reasoning for math word problems; second, conversation understanding. Humans possess both capabilities, but they have not been well explored with deep learning in the natural language processing field.
To investigate the arithmetic reasoning capability of deep learning models, the first part of this work focuses on the task of solving math word problems. Motivated by the human solving process, this work proposes a framework that allows the model to generate math expressions by manipulating symbols based on their semantics. The results demonstrate its effectiveness in improving arithmetic reasoning capability. The second part investigates the conversation understanding capability of previously proposed models, focusing on two renowned datasets, QuAC and CoQA. This part proposes a series of experiments that serve as a diagnostic tool for models' conversation understanding, and it uncovers several potential hazards. Through these two investigations, this work scrutinizes the shortcomings of current models. For math word problem solving, further effort is still required to validate the generalizability of current models; for conversation understanding, how to design a model that truly understands conversations remains an unsolved problem. All of these are prospective directions for future research. | en |
dc.description.provenance | Made available in DSpace on 2021-05-17T15:59:31Z (GMT). No. of bitstreams: 1 ntu-109-R07922052-1.pdf: 1046071 bytes, checksum: 2c3cac4e9f19c4ee7bd14934b3fdb857 (MD5) Previous issue date: 2020 | en |
dc.description.tableofcontents | Acknowledgements
摘要 (Chinese Abstract)
Abstract
1 Introduction
1.1 Motivation
1.2 Problem Description
1.3 Main Contributions
1.4 Thesis Structure
2 Background
2.1 Recurrent Neural Models
2.1.1 Recurrent Neural Network (RNN)
2.1.2 Long-Short Term Memory (LSTM)
2.1.3 Bi-Directional Long-Short Term Memory (BiLSTM)
2.2 Transformer Model
2.3 Learning Objectives
2.3.1 Maximum Likelihood for Seq2Seq Learning
2.3.2 Maximum Likelihood for Answer Span Selection
3 Related Work
3.1 Question Answering
3.2 Arithmetic Reasoning
3.3 Conversation Comprehension
4 Math Word Problem Solving
4.1 Encoder
4.2 Decoder
4.3 Decoding State Features
4.3.1 Stack Action Selector
4.3.2 Stack Actions
4.3.3 Operand Selector
4.3.4 Semantic Transformer
4.4 Training
4.5 Inference
4.6 Experiments
4.6.1 Settings
4.6.2 Results
4.6.3 Ablation Test
4.7 Qualitative Analysis
4.7.1 Constant Embedding Analysis
4.7.2 Decoding Process Visualization
4.7.3 Error Analysis
4.7.4 Discussion
5 Conversation Modeling
5.1 Models
5.1.1 FlowQA
5.1.2 BERT
5.1.3 SDNet
5.2 How Well the Performance Reflects Content Comprehension?
5.2.1 Experimental Settings
5.2.2 Discussion
5.3 Do Models Understand Conversation Content?
5.3.1 Repeat Attack
5.3.2 Predict without Previous Answer Text
5.3.3 Predict without Previous Answer Position
5.3.4 Implication of Above Experiments
5.4 Dataset and Model Analysis
5.4.1 Discussion
6 Conclusion and Future Work
Bibliography | |
dc.language.iso | en | |
dc.title | 數理推演與對話中的自然語言理解之研究 | zh_TW |
dc.title | Investigating Language Understanding in Arithmetic Reasoning and Conversation Modeling | en |
dc.type | Thesis | |
dc.date.schoolyear | 108-2 | |
dc.description.degree | Master | |
dc.contributor.oralexamcommittee | Lun-Wei Ku (古倫維), Ting-Hao Huang (黃挺豪) | |
dc.subject.keyword | language understanding, conversational question answering, math word problem solving | zh_TW |
dc.subject.keyword | language understanding, conversational question answering, math word problem solving | en |
dc.relation.page | 59 | |
dc.identifier.doi | 10.6342/NTU202000671 | |
dc.rights.note | Authorized for release (open access worldwide) | |
dc.date.accepted | 2020-03-05 | |
dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW |
dc.contributor.author-dept | Graduate Institute of Computer Science and Information Engineering | zh_TW |
Appears in Collections: | Department of Computer Science and Information Engineering
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-109-1.pdf | 1.02 MB | Adobe PDF | View/Open |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.