Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/62222

Full metadata record

| DC field | Value | Language |
|---|---|---|
| dc.contributor.advisor | 鄭卜壬(Pu-Jen Cheng) | |
| dc.contributor.author | Shih-Ying Chen | en |
| dc.contributor.author | 陳世穎 | zh_TW |
| dc.date.accessioned | 2021-06-16T13:34:45Z | - |
| dc.date.available | 2014-07-26 | |
| dc.date.copyright | 2013-07-26 | |
| dc.date.issued | 2013 | |
| dc.date.submitted | 2013-07-18 | |
| dc.identifier.citation | [1] Introducing the knowledge graph: things, not strings. http://googleblog.blogspot.tw/2012/05/introducing-knowledge-graph-things-not.html. [2] D. E. Avison and H. U. Shah. The information systems development life cycle: a first course in information systems. McGraw-Hill Companies, 1997. [3] M. Bautin and S. Skiena. Concordance-based entity-oriented search. Web Intelli. and Agent Sys., 7(4), Dec. 2009. [4] O. Ben-Yitzhak, N. Golbandi, N. Har’El, R. Lempel, A. Neumann, S. Ofek-Koifman, D. Sheinwald, E. Shekita, B. Sznajder, and S. Yogev. Beyond basic faceted search. In Proc. of WSDM, 2008. [5] A. L. Berger and V. O. Mittal. Ocelot: a system for summarizing web pages. In Proc. of SIGIR, 2000. [6] P. Bhaskar and S. Bandyopadhyay. A query focused multi document automatic summarization. In Proc. of PACLIC, 2010. [7] W. Dakka and P. G. Ipeirotis. Automatic extraction of useful facet hierarchies from text databases. In Proc. of ICDE, 2008. [8] Z. Dou, S. Hu, Y. Luo, R. Song, and J.-R. Wen. Finding dimensions for queries. In Proc. of CIKM, 2011. [9] G. Erkan and D. R. Radev. LexRank: graph-based lexical centrality as salience in text summarization. J. Artif. Int. Res., 22(1), Dec. 2004. [10] D. Gillick and B. Favre. A scalable global model for summarization. In Proc. of ILP Workshop, 2009. [11] M. A. Hearst. Clustering versus faceted categories for information exploration. Commun. ACM, 49(4), 2006. [12] Y. Ko, H. An, and J. Seo. Pseudo-relevance feedback and statistical query expansion for web snippet generation. Inf. Process. Lett., 109(1), 2008. [13] J. Kupiec, J. Pedersen, and F. Chen. A trainable document summarizer. In Proc. of SIGIR, 1995. [14] C.-Y. Lin. ROUGE: A package for automatic evaluation of summaries. In Proc. of ACL workshop, 2004. [15] X. Ling, Q. Mei, C. Zhai, and B. Schatz. Mining multi-faceted overviews of arbitrary topics in a text collection. In Proc. of SIGKDD, 2008. [16] D. M. McDonald and H. Chen. Summary in context: Searching versus browsing. ACM Trans. Inf. Syst., 24(1), Jan. 2006. [17] R. McDonald. A study of global inference algorithms in multidocument summarization. In Proc. of ECIR, 2007. [18] J.-P. Ng, P. Bysani, Z. Lin, M.-Y. Kan, and C.-L. Tan. Exploiting category-specific information for multi-document summarization. In Proc. of COLING, 2012. [19] D. R. Radev, W. Fan, and Z. Zhang. WebInEssence: a personalized web-based multi-document summarization and recommendation system. In NAACL 2001 Workshop on Automatic Summarization, 2001. [20] D. R. Radev, H. Jing, M. Styś, and D. Tam. Centroid-based summarization of multiple documents. Inf. Process. Manage., 40(6), Nov. 2004. [21] C. Sauper and R. Barzilay. Automatically generating Wikipedia articles: a structure-aware approach. In Proc. of ACL and IJCNLP of the AFNLP, 2009. [22] W. Song, Q. Yu, Z. Xu, T. Liu, S. Li, and J.-R. Wen. Multi-aspect query summarization by composite query. In Proc. of SIGIR, 2012. [23] E. Stoica, M. A. Hearst, and M. Richardson. Automating creation of hierarchical faceted metadata structures. In Proc. of NAACL HLT, 2007. [24] A. Tombros and M. Sanderson. Advantages of query biased summaries in information retrieval. In Proc. of SIGIR, 1998. [25] R. Varadarajan. A system for query-specific document summarization. In Proc. of CIKM, 2006. [26] X. Wang and C. Zhai. Learn from web search logs to organize search results. In Proc. of SIGIR, 2007. [27] M. White, T. Korelsky, C. Cardie, V. Ng, D. Pierce, and K. Wagstaff. Multidocument summarization via information extraction. In Proc. of HLT, 2001. [28] H.-J. Zeng, Q.-C. He, Z. Chen, W.-Y. Ma, and J. Ma. Learning to cluster web search results. In Proc. of SIGIR, 2004. | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/62222 | - |
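| dc.description.abstract | In this thesis, we study how to improve the snippets shown under the results on a search-result page so that together they serve as an overview of the query, giving users a general understanding of the query before they click on any result. For a query whose category is known, we organize the snippets using the semantics covered by the category and the attributes closely related to it. For each category, we extract its closely related attributes from questions on a community question-answering website, and we extract context information for each attribute from the answers to those questions. Snippet generation relies mainly on three factors: how much information about the query is covered, how well the category semantics are covered, and how well the category attributes are covered. To optimize these three factors simultaneously, we formulate the problem as an integer linear program. Experimental results show that, in conveying an overview of the query, the generated snippets improve considerably over a conventional search-result page and several basic summarization algorithms. | zh_TW |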
| dc.description.abstract | Previous work on snippet generation focuses mainly on how to produce one snippet for an individual search result. This paper aims to generate a comprehensive overview for an entity query in the search-result page. We assume each entity has its own category, whose attributes are regarded as the unique characteristics that users might be interested in when searching for the entity. Given an entity as query (e.g., enterogastritis) and its category (e.g., disease), we want to organize the snippets that contain its attributes (e.g., symptoms and diagnoses) so that users can learn the useful information about the query directly from the generated snippets without downloading documents. First, we extract the attributes of a category from a community-based question-answering (CQA) website. Next, the snippets are generated according to several factors, including how central a sentence is to the meanings of the query, its category, and the corresponding attributes, and how well the snippets diversify the attributes. Finally, an Integer Linear Programming (ILP) model is adopted to find an optimal sentence set as the snippet. The experiments are conducted on 100 common disease queries. Experimental results demonstrate the effectiveness and efficiency of the proposed approach compared to an existing search engine and several summarization baselines. | en |
| dc.description.provenance | Made available in DSpace on 2021-06-16T13:34:45Z (GMT). No. of bitstreams: 1 ntu-102-R00922026-1.pdf: 616063 bytes, checksum: a00424b8ad7d5a04c2a897da03acbddf (MD5) Previous issue date: 2013 | en |
| dc.description.tableofcontents | Oral Examination Committee Certification; Acknowledgements; Abstract (Chinese); Abstract; 1 Introduction; 2 Related Works; 2.1 Document Summarization; 2.2 Facet Generation and Summarization; 3 Problem Formulation; 3.1 Problem Specification; 4 System Overview; 5 Off-line Information Extraction Stage; 5.1 Attribute Extraction; 5.2 Building Category Vector; 6 On-line Query Processing Stage; 6.1 Scoring Function S_important(s); 6.1.1 Scoring function S_query(s); 6.1.2 Scoring function S_category(s); 6.1.3 Scoring function S_attr(s); 6.2 Merging Scoring Functions; 6.3 Integer Linear Programming Model; 6.4 Sentence Compression; 7 Experiments; 7.1 Data Set; 7.2 Baselines; 7.3 Evaluation; 7.4 Performance Comparison; 7.4.1 ROUGE Performance; 7.4.2 Time Performance; 7.5 Parameter Settings; 7.5.1 The impact of α, β, γ in S_important; 7.5.2 Number of used attributes; 7.5.3 Limit on number of attribute presence; 8 Conclusions and Future Work; Bibliography | |
| dc.language.iso | en | |
| dc.subject | Search-result summary | zh_TW |
| dc.subject | Search-Result Summarization | en |
| dc.title | 使用類別資訊產生作為查詢詞概觀的網站片段敘述 | zh_TW |
| dc.title | Generating Snippets as Query Overview with Category Information | en |
| dc.type | Thesis | |
| dc.date.schoolyear | 101-2 | |
| dc.description.degree | Master's | |
| dc.contributor.oralexamcommittee | 陳信希(HSIN-HSI CHEN),盧文祥(WEN-HSIANG LU),張嘉惠(Chia-Hui Chang) | |
| dc.subject.keyword | Search-result summary | zh_TW |
| dc.subject.keyword | Search-Result Summarization | en |
| dc.relation.page | 35 | |
| dc.rights.note | Paid licensing | |
| dc.date.accepted | 2013-07-18 | |
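| dc.contributor.author-college | College of Electrical Engineering and Computer Science | zh_TW |
| dc.contributor.author-dept | Graduate Institute of Computer Science and Information Engineering | zh_TW |
| Appears in collections: | Department of Computer Science and Information Engineering | |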
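
The sentence-selection step described in the abstract can be illustrated with a small sketch. It assumes, going only by the table of contents, that a sentence's importance is a weighted sum of its query, category, and attribute scores (weights α, β, γ) and that the number of selected sentences per attribute is capped. The field names, weights, scores, word budget, and the exhaustive search that stands in for a real ILP solver are all hypothetical and are not taken from the thesis.

```python
from itertools import combinations

def importance(s, alpha=0.4, beta=0.3, gamma=0.3):
    """Weighted sum of the three per-sentence scores (weights are illustrative)."""
    return alpha * s["query"] + beta * s["category"] + gamma * s["attr"]

def select_snippet(sentences, max_words, max_per_attribute):
    """Pick the subset of sentences with the highest total importance.

    This enumerates every subset, so it only suits a handful of candidate
    sentences; an ILP solver would handle the same 0/1 problem at scale.
    """
    best_score, best_subset = 0.0, ()
    indices = range(len(sentences))
    for k in range(1, len(sentences) + 1):
        for subset in combinations(indices, k):
            chosen = [sentences[i] for i in subset]
            # Constraint 1: the snippet must fit in the word budget.
            if sum(s["words"] for s in chosen) > max_words:
                continue
            # Constraint 2: limit how often one attribute may appear.
            counts = {}
            for s in chosen:
                counts[s["attribute"]] = counts.get(s["attribute"], 0) + 1
            if any(c > max_per_attribute for c in counts.values()):
                continue
            score = sum(importance(s) for s in chosen)
            if score > best_score:
                best_score, best_subset = score, subset
    return list(best_subset)

# Hypothetical candidate sentences for a disease query.
sentences = [
    {"words": 10, "attribute": "symptoms", "query": 0.9, "category": 0.8, "attr": 0.9},
    {"words": 12, "attribute": "symptoms", "query": 0.8, "category": 0.7, "attr": 0.8},
    {"words": 9, "attribute": "diagnoses", "query": 0.5, "category": 0.6, "attr": 0.7},
    {"words": 15, "attribute": "treatment", "query": 0.4, "category": 0.5, "attr": 0.6},
]

print(select_snippet(sentences, max_words=25, max_per_attribute=1))  # → [0, 2]
```

With the cap set to one sentence per attribute, the second "symptoms" sentence is skipped in favour of a "diagnoses" sentence, which is the kind of attribute diversification the abstract mentions.
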
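Files in this item:
| File | Size | Format | |
|---|---|---|---|
| ntu-102-1.pdf (not authorized for public access) | 601.62 kB | Adobe PDF | |

Unless their copyright terms are specifically indicated, all items in the system are protected by copyright, with all rights reserved.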
