Cross-Language Encyclopedia Article Linking

Yu-Chun Wang; 王昱鈞

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/52983

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	項潔(Jieh Hsiang)
dc.contributor.author	Yu-Chun Wang	en
dc.contributor.author	王昱鈞	zh_TW
dc.date.accessioned	2021-06-15T16:37:33Z	-
dc.date.available	2018-08-16
dc.date.copyright	2015-08-16
dc.date.issued	2015
dc.date.submitted	2015-08-12
dc.identifier.uri	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/52983	-
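dc.description.abstract	Online encyclopedias such as Wikipedia have become one of the most important content services on the internet. Linking articles in different languages across online encyclopedias is an important problem in building and integrating multilingual knowledge bases. Much prior work has focused on creating cross-language links between the different language editions of Wikipedia; however, the number of articles Wikipedia covers varies significantly from language to language. To address this, linking the articles of several major monolingual online encyclopedias in different languages to build a cross-language online encyclopedia has become an important research topic. In this thesis, we define the research problem of cross-language encyclopedia article linking and propose an SVM-based linking method that uses a bilingual topic model and related translation content as features to link corresponding articles in English Wikipedia and Chinese Baidu Baike. To verify the effectiveness of the proposed method, we collected a number of corresponding articles from Chinese Baidu Baike and English Wikipedia and built several experimental datasets from them. The experimental results show that the proposed method achieves a mean reciprocal rank (MRR) of 0.8252, which is 0.1745 (+26.82%) higher than the baseline system, indicating that the method is quite effective for English-Chinese cross-language encyclopedia article linking. Our method does not depend heavily on language-specific characteristics and can be readily extended to link online encyclopedia articles in other languages.	zh_TW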
dc.description.abstract	Online encyclopedias, like Wikipedia, are among the most widely used internet services in the world. Though Wikipedia has many language editions, their coverage is imbalanced relative to the number of speakers of each language, both online and offline. Furthermore, large alternative online encyclopedias exist for some languages, such as the Chinese Baidu Baike. We could improve access to the knowledge in these various sources by integrating multiple online encyclopedias into large multilingual knowledge bases. The main task in such a project is creating links between articles in different encyclopedias in different languages. Most research to date has focused on linking articles in the different language editions of Wikipedia, and little work has been done on linking encyclopedias on other platforms. In this thesis, we develop a method for cross-language encyclopedia article linking (CLEAL) between encyclopedias on different platforms, English Wikipedia and Chinese Baidu Baike. We use an SVM model with bilingual topic model and translation features to link articles between these two encyclopedias. To evaluate our approach, we compile datasets from Baidu Baike articles and their corresponding English Wikipedia articles. The evaluation results show that our approach achieves an MRR of 0.8252, outperforming the baseline system by 0.1745 (+26.82%). Our method does not depend heavily on specific platform formats or linguistic characteristics, so it could be easily extended to generate cross-language article links among other online encyclopedias in other languages and on other platforms.	en
dc.description.provenance	Made available in DSpace on 2021-06-15T16:37:33Z (GMT). No. of bitstreams: 1 ntu-104-D97922023-1.pdf: 3203616 bytes, checksum: ed7f58ff41a3e82d47a22944472c1e83 (MD5) Previous issue date: 2015	en
dc.description.tableofcontents	Oral Examination Committee Certification
Acknowledgements
Chinese Abstract
English Abstract
1 Introduction
2 Problem Statement
  2.1 Articles in Online Encyclopedias
  2.2 Formal Definition of CLEAL
  2.3 Difference Between CLEAL and Document Alignment
  2.4 Necessity of CLEAL
3 Related Work
  3.1 Named Entity Translation
    3.1.1 Automatic Extraction of Bilingual Translations
    3.1.2 Parenthetical Translation Mining
  3.2 Machine Transliteration
    3.2.1 Phoneme-Based Machine Transliteration Methods
    3.2.2 Grapheme-Based Machine Transliteration Methods
    3.2.3 Hybrid Machine Transliteration Methods
  3.3 Cross-language Encyclopedia Article Linking
    3.3.1 Entity Linking
    3.3.2 Cross-Language Entity Linking
    3.3.3 Cross-Language Link Discovery
4 Named Entity Translation
  4.1 Web-based Parenthetical Translation Extraction
    4.1.1 Parenthetical Translation Extraction Approach
    4.1.2 Dataset
    4.1.3 Experiment Design
    4.1.4 Evaluation Metrics
    4.1.5 Result
  4.2 Machine Transliteration
    4.2.1 Preprocessing
    4.2.2 Many-to-Many Alignment
    4.2.3 DirecTL+ Training
    4.2.4 Dataset
    4.2.5 Results
  4.3 Transliteration Phonetic Similarity
    4.3.1 Transliteration Identification Method in Chinese
    4.3.2 Phonetic similarity
5 Cross-Language Encyclopedia Article Linking
  5.1 Candidate Selection
  5.2 Candidate Ranking
    5.2.1 Title Matching and Title Similarity Feature (Baseline)
    5.2.2 Bilingual Topic Model Similarity Feature (BTMS)
    5.2.3 Mixed-language Topic Model Similarity Feature (MTMS)
    5.2.4 Hypernym Translation Feature (HT)
    5.2.5 English Title Occurrence Feature (ETO)
    5.2.6 Infobox Similarity Feature (IS)
    5.2.7 Transliteration Phonetic Similarity (PS)
  5.3 NIL Candidate Threshold
6 Evaluation and Discussion
  6.1 Evaluation
    6.1.1 Evaluation Set
    6.1.2 Comparison to Other Methods
    6.1.3 Evaluation Metrics
    6.1.4 Evaluation Results
  6.2 Effectiveness of Translation Methods
    6.2.1 Parenthetical Translation
    6.2.2 Machine Transliteration
  6.3 Comparison Between Bilingual LDA and Mixed-Language LDA
  6.4 Effectiveness of the Bilingual LDA (BTMS Feature)
  6.5 Performance Analysis of Each Feature
  6.6 Difficult Cases in Baidu Baike
  6.7 False Positive Errors in the Third Dataset
  6.8 Extension to Other Language Pairs (Japanese and Korean)
7 Conclusion
References
dc.language.iso	en
dc.subject	cross-language	zh_TW
dc.subject	online encyclopedia article linking	zh_TW
dc.subject	topic model	zh_TW
dc.subject	parenthetical translation	zh_TW
dc.subject	online encyclopedia article linking	en
dc.subject	cross-language	en
dc.subject	topic model	en
dc.subject	parenthetical translation	en
dc.title	Cross-Language Encyclopedia Article Linking	zh_TW
dc.title	Cross-Language Encyclopedia Article Linking	en
dc.type	Thesis
dc.date.schoolyear	103-2
dc.description.degree	Doctoral
dc.contributor.coadvisor	蔡宗翰(Richard Tzong-Han Tsai)
dc.contributor.oralexamcommittee	陳信希(Hsin-Hsi Chen),鄭卜壬(Pu-Jen Cheng),高照明(Zhao-Ming Gao)
dc.subject.keyword	online encyclopedia article linking,cross-language,topic model,parenthetical translation	zh_TW
dc.subject.keyword	online encyclopedia article linking,cross-language,topic model,parenthetical translation	en
dc.relation.page	114
dc.rights.note	Paid license
dc.date.accepted	2015-08-12
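dc.contributor.author-college	College of Electrical Engineering and Computer Science	zh_TW
dc.contributor.author-dept	Graduate Institute of Computer Science and Information Engineering	zh_TW
Appears in Collections:	Department of Computer Science and Information Engineering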

Files in This Item:

File	Size	Format
ntu-104-1.pdf (not authorized for public access)	3.13 MB	Adobe PDF

Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are otherwise specified.