A Human Computation Approach to English Translation of Internet Lingo (群眾運算機制於翻譯網路方言之研究)

Ming-Tung Hong; 洪明彤

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/55665

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	許永真(Jane Yung-Jen Hsu)
dc.contributor.author	Ming-Tung Hong	en
dc.contributor.author	洪明彤	zh_TW
dc.date.accessioned	2021-06-16T04:15:55Z	-
dc.date.available	2014-09-18
dc.date.copyright	2014-08-25
dc.date.issued	2014
dc.date.submitted	2014-08-20
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/55665	-
dc.description.abstract	Lingo is an emerging language on the Internet. Understanding the meaning of lingo can help analyze web content and understand the cultures of online communities. However, providing standardized definitions remains difficult because lingo changes continuously. We propose Tranzzl!n9o, a crossword puzzle game that engages crowds to translate Internet lingo. In our game, players provide explanations for lingo in parallel and iteratively verify the explanations from other players. We evaluated our design on Amazon Mechanical Turk with 45 qualified workers, who generated 138 explanations from 20 puzzles. Results show that we achieved 77.06% precision and 85.71% recall for collecting explanations of lingo. For explanations that received at least two agreements, we achieved 90.57% precision and 48.98% recall. Moreover, crowd-sourced explanations are informative: they not only explain the lingo itself but also describe its usage. Follow-up questionnaires show that over 60% of players liked our game and would like to play it again; among weekly players, 75% said so. By keeping our lingo dictionary updated, we hope to help address out-of-vocabulary issues in language processing, provide an annotated corpus of lingo for machine learning, and help Internet users better understand lingo.	en
dc.description.provenance	Made available in DSpace on 2021-06-16T04:15:55Z (GMT). No. of bitstreams: 1 ntu-103-R01922115-1.pdf: 2308333 bytes, checksum: 797ce38db3e8deac7a83e8cbf0b44018 (MD5) Previous issue date: 2014	en
dc.description.tableofcontents	Contents: Oral Examination Committee Certification; Acknowledgements; Abstract; 1 Introduction; 1.1 Motivation; 1.2 Translating Internet Lingo; 1.3 Problem Definition; 2 Related Work; 2.1 Internet Lingo Dictionary Lookup; 2.2 Machine Computation in Text Normalization; 2.2.1 Spelling Correction Approach; 2.2.2 Machine Translation Approach; 2.2.3 Automatic Speech Recognition Approach; 2.3 Human Computation in Word Games; 2.3.1 Jinx for Generating Word Sense Disambiguation Dataset; 2.3.2 Guesstiment for Sentiment Annotation; 3 Internet Lingo Translating System; 3.1 System Design Incentive; 3.2 Internet Lingo Extraction; 3.2.1 Language Detection; 3.2.2 Rule-based Filtering; 3.3 Dictionary Setup and Question Pool; 3.4 Human Computation: Word Game; 3.4.1 Puzzle Generator; 3.4.2 Game Workflow; 3.5 Human Computation: Design Features; 3.5.1 Player Qualification; 3.5.2 Two-stage Agreement; 4 Experiments and Evaluation; 4.1 System Deployment; 4.2 Data Profile; 4.2.1 Tweets Data Statistics; 4.2.2 Crossword Puzzle Setup; 4.3 Experiments; 4.3.1 Lesson Learned from Pilot Study; 4.3.2 Discussion of Collected Explanations; 4.4 Evaluation; 4.4.1 Ground Truth Collection; 4.4.2 Evaluation Metrics; 4.4.3 Evaluation: Lingofy Task; 4.4.4 Evaluation: Unlingofy Task; 4.4.5 Evaluation: Engaging in Tranzz!n9o; 5 Conclusions and Future work; 5.1 Conclusions; 5.2 Limitations; 5.3 Future Work; Bibliography
dc.language.iso	en
dc.subject	群眾運算 (Human Computation)	zh_TW
dc.subject	網路方言 (Internet Lingo)	zh_TW
dc.subject	Human Computation	en
dc.subject	Internet Lingo	en
dc.title	群眾運算機制於翻譯網路方言之研究	zh_TW
dc.title	A Human Computation Approach to English Translation of Internet Lingo	en
dc.type	Thesis
dc.date.schoolyear	102-2
dc.description.degree	Master's (碩士)
dc.contributor.oralexamcommittee	林守德(Shou-De Lin),陳昇瑋(Sheng-Wei Chen),蔡宗翰(Tzong-Han Tsai),林光龍
dc.subject.keyword	群眾運算 (Human Computation), 網路方言 (Internet Lingo)	zh_TW
dc.subject.keyword	Human Computation, Internet Lingo	en
dc.relation.page	43
dc.rights.note	Paid license (有償授權)
dc.date.accepted	2014-08-20
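dc.contributor.author-college	電機資訊學院 (College of Electrical Engineering and Computer Science)	zh_TW
dc.contributor.author-dept	資訊工程學研究所 (Graduate Institute of Computer Science and Information Engineering)	zh_TW
Appears in Collections:	Department of Computer Science and Information Engineering (資訊工程學系)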

Files in This Item:

File	Size	Format
ntu-103-1.pdf (not authorized for public access)	2.25 MB	Adobe PDF

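Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.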