Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/81002
Full metadata record
DC Field | Value | Language
dc.contributor.advisor | 陳信希 (Hsin-Hsi Chen)
dc.contributor.author | Natsuki Yamashita | en
dc.contributor.author | 山下夏輝 | zh_TW
dc.date.accessioned | 2022-11-24T03:25:40Z | -
dc.date.available | 2021-11-05
dc.date.available | 2022-11-24T03:25:40Z | -
dc.date.copyright | 2021-11-05
dc.date.issued | 2021
dc.date.submitted | 2021-10-14
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/81002 | -
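dc.description.abstract | With the rapid development of machine translation technology, it is becoming increasingly common to use machine translation to communicate with people from different cultural backgrounds. In cross-cultural communication, the higher a speaker's language proficiency, the more likely the speaker is to be judged as having a bad character or lacking social manners when, in a particular situation, the speaker's language use violates the pragmatic rules of the listener's conversational culture. This negative effect of cultural difference on interpersonal relationships may become more serious and more frequent in the future, because even current machine translation systems have language ability equal to or better than that of foreign-language learners. Machine translators therefore need to perform culture-aware translation, that is, translation that uses language appropriately for the listener's conversational culture. In this study, we first confirm that language use differs across cultures depending on the situation and the interpersonal relationship. Then, to reduce the problems caused by cultural differences through culture-aware machine translation, we create a culture-aware Chinese-Japanese parallel dialogue corpus annotated with interpersonal relation labels and three situation labels: apology, request, and thanks. We perform a statistical analysis of the corpus. In addition, we use a sequence-to-sequence model to perform situation classification and culture-aware machine translation on the corpus. We verify that interpersonal relations contribute to situation classification for Japanese and to culture-aware Chinese-to-Japanese machine translation. For situation classification in Chinese and culture-aware Japanese-to-Chinese machine translation, the results are also consistent with the view that the pragmatic rules governing how Chinese language use varies with contextual information are relatively vague. These results are important because they show that the model can achieve culture-awareness in a conversational culture by incorporating interpersonal relations and contextual information. | zh_TW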
dc.description.provenance | Made available in DSpace on 2022-11-24T03:25:40Z (GMT). No. of bitstreams: 1. U0001-3108202119115000.pdf: 1081286 bytes, checksum: 91782880fa6e5213bb753967dc523b5a (MD5). Previous issue date: 2021 | en
dc.description.tableofcontents |
  Oral Defense Committee Certification
  Acknowledgements
  Abstract (Chinese)
  Abstract
  Contents
  List of Figures
  List of Tables
  1 Introduction
    1.1 Background
    1.2 Motivation
    1.3 Research Goal
    1.4 Contribution
  2 Literature Review
    2.1 Cultural Difference in Empirical Studies
      2.1.1 Situation
      2.1.2 Interpersonal Relation
      2.1.3 Negative Pragmatic Transfer
    2.2 Related Tasks and Data in Natural Language Processing
  3 Dataset
    3.1 Base Corpora
    3.2 Annotation
      3.2.1 Relation Label
      3.2.2 Situation Label
    3.3 Creation of Parallel Texts
      3.3.1 Machine-translated Text
      3.3.2 Human-translated Text
        3.3.2.1 Translator
        3.3.2.2 Texts to be Human-Translated
        3.3.2.3 Information Given to the Translators
    3.4 Findings on Our Dataset
      3.4.1 Difference Between Machine and Human-translated Utterances
      3.4.2 Tendency of Cultural Difference of Language Use
        3.4.2.1 Method
        3.4.2.2 Data Procedure
        3.4.2.3 Analysis for the Labeled Data
  4 Situation Classification
    4.1 Method
      4.1.1 Model
      4.1.2 Data
      4.1.3 Prefix Option
      4.1.4 Measurement
      4.1.5 Comparison of Scores
    4.2 Experiment Setup
    4.3 Results
    4.4 Discussion
      4.4.1 The Model in Confusion
      4.4.2 Difference between Culture in Mainland of China and Taiwan
      4.4.3 Utterance Length
  5 Culture-Aware Machine Translation
    5.1 Method
      5.1.1 Model
      5.1.2 Data
      5.1.3 Prefix Option
      5.1.4 Measurement
    5.2 Experiment Setup
    5.3 Results
    5.4 Discussion
      5.4.1 Confusion by Relation Label
      5.4.2 Well Culture-Aware Translation
      5.4.3 Not Well Translation
  6 Conclusion
    6.1 Contributions
    6.2 Future Work
  References
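dc.language.iso | en
dc.subject | 機器翻譯 (machine translation) | zh_TW
dc.subject | 跨文化溝通 (cross-cultural communication) | zh_TW
dc.subject | 語用 (pragmatics) | zh_TW
dc.subject | 文化差異 (cultural difference) | zh_TW
dc.subject | 語料集 (dataset) | zh_TW
dc.subject | pragmatic | en
dc.subject | dataset | en
dc.subject | machine translation | en
dc.subject | culture-aware | en
dc.subject | cross-cultural communication | en
dc.subject | cultural difference | en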
dc.title | 中文多方對話語料集和日語日常會話語料集在表達歉意、請求和感謝上的文化差異及其在機器翻譯上的應用 | zh_TW
dc.title | "Application of Cultural Differences in Expressing Apologies, Requests and Thanks in Multi-Party Dialogue and Everyday Japanese Conversation Datasets for Machine Translation" | en
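dc.date.schoolyear | 109-2
dc.description.degree | 碩士 (Master's)
dc.contributor.oralexamcommittee | 陳光華 (Hsin-Tsai Liu), 郭俊桔 (Chih-Yang Tseng), 黃瀚萱
dc.subject.keyword | 語料集 (dataset), 機器翻譯 (machine translation), 文化差異 (cultural difference), 跨文化溝通 (cross-cultural communication), 語用 (pragmatics) | zh_TW
dc.subject.keyword | dataset, machine translation, culture-aware, cross-cultural communication, cultural difference, pragmatic | en
dc.relation.page | 67
dc.identifier.doi | 10.6342/NTU202102914
dc.rights.note | Authorization granted (access restricted to campus)
dc.date.accepted | 2021-10-15
dc.contributor.author-college | 電機資訊學院 (College of Electrical Engineering and Computer Science) | zh_TW
dc.contributor.author-dept | 資訊工程學研究所 (Graduate Institute of Computer Science and Information Engineering) | zh_TW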
Appears in collections: 資訊工程學系 (Department of Computer Science and Information Engineering)

Files in this item:
File | Size | Format
U0001-3108202119115000.pdf (access restricted to NTU campus IP addresses; off campus, please use the VPN service) | 1.06 MB | Adobe PDF


Unless their copyright terms are otherwise specified, all items in the system are protected by copyright, with all rights reserved.
