Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/62222
Full metadata record
DC field: value [language]
dc.contributor.advisor: 鄭卜壬 (Pu-Jen Cheng)
dc.contributor.author: Shih-Ying Chen [en]
dc.contributor.author: 陳世穎 [zh_TW]
dc.date.accessioned: 2021-06-16T13:34:45Z
dc.date.available: 2014-07-26
dc.date.copyright: 2013-07-26
dc.date.issued: 2013
dc.date.submitted: 2013-07-18
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/62222
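dc.description.abstract: In this thesis, we study how to improve the snippets shown beneath the results on a search-result page so that, taken together, they serve as an overview of the query and give users a general understanding of it before they click on any result. For a query whose category is known, we organize the snippets using the semantics covered by the category and the attributes closely related to it. For each category, we extract these attributes from the questions on a community question-answering website, and we extract the context information of each attribute from the answers to those questions. Snippet generation depends on three factors: how much information about the query is covered, how well the semantics of the category are covered, and how well the attributes of the category are covered. To optimize the three factors simultaneously, we model the problem with integer linear programming. Experimental results show that the generated snippets convey an overview of the query considerably better than a conventional search-result page and several basic summarization algorithms. [translated from zh_TW]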
dc.description.abstract: Previous work on snippet generation focuses mainly on producing one snippet for each individual search result. This paper instead aims to generate a comprehensive overview of an entity query on the search-result page. We assume that each entity belongs to a category whose attributes are the characteristics users are likely to be interested in when searching for the entity. Given an entity as the query (e.g., enterogastritis) and its category (e.g., disease), we organize the snippets that contain its attributes (e.g., symptoms and diagnoses) so that users can learn the useful information about the query directly from the generated snippets without downloading documents. First, we extract the attributes of a category from a community-based question-answering (CQA) website. Next, snippets are generated according to several factors: how central a sentence is to the meaning of the query, its category, and the corresponding attributes, and how well the snippets diversify the attributes. Finally, an Integer Linear Programming (ILP) model is used to find an optimal sentence set as the snippet. Experiments are conducted on 100 common disease queries. The results demonstrate the effectiveness and efficiency of the proposed approach compared with an existing search engine and several summarization baselines. [en]
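The sentence-selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. Everything in it is an assumption made for illustration: the weighted-sum form of the importance score with weights alpha, beta, gamma, the word budget, the per-attribute cap, the helper names importance and select_snippet, and all score values. An exhaustive search over subsets stands in for an ILP solver; it solves the same kind of 0-1 selection problem but is only practical for a handful of sentences.

```python
from itertools import combinations

def importance(s, alpha=0.4, beta=0.3, gamma=0.3):
    # Assumed form: weighted sum of the query, category, and attribute scores.
    return alpha * s["query"] + beta * s["category"] + gamma * s["attr"]

def select_snippet(sentences, max_words, max_per_attr=1):
    """Pick the subset with the highest total importance, subject to a word
    budget and a cap on how many chosen sentences may cover one attribute.
    Brute force over all subsets; a stand-in for an ILP solver."""
    best_score, best_set = 0.0, ()
    n = len(sentences)
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            chosen = [sentences[i] for i in subset]
            # Constraint 1: total length must fit the snippet budget.
            if sum(len(s["text"].split()) for s in chosen) > max_words:
                continue
            # Constraint 2: limit repeated coverage of the same attribute.
            counts = {}
            for s in chosen:
                for a in s["attrs"]:
                    counts[a] = counts.get(a, 0) + 1
            if any(c > max_per_attr for c in counts.values()):
                continue
            score = sum(importance(s) for s in chosen)
            if score > best_score:
                best_score, best_set = score, subset
    return list(best_set), best_score

# Hypothetical candidate sentences with made-up scores in [0, 1].
sentences = [
    {"text": "Enterogastritis causes nausea and vomiting.",
     "attrs": {"symptoms"}, "query": 0.9, "category": 0.6, "attr": 0.8},
    {"text": "Common symptoms include diarrhea and fever.",
     "attrs": {"symptoms"}, "query": 0.5, "category": 0.6, "attr": 0.9},
    {"text": "Doctors diagnose it with a stool test.",
     "attrs": {"diagnoses"}, "query": 0.6, "category": 0.7, "attr": 0.7},
    {"text": "The clinic opens at nine.",
     "attrs": set(), "query": 0.1, "category": 0.1, "attr": 0.0},
]

chosen, score = select_snippet(sentences, max_words=12)
print(chosen, round(score, 2))
```

With these made-up inputs, the cap of one sentence per attribute keeps the two "symptoms" sentences from being selected together, so the selection pairs one symptom sentence with the diagnosis sentence. This is how a constraint of this kind can encourage attribute diversity.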
dc.description.provenance: Made available in DSpace on 2021-06-16T13:34:45Z (GMT). No. of bitstreams: 1. ntu-102-R00922026-1.pdf: 616063 bytes, checksum: a00424b8ad7d5a04c2a897da03acbddf (MD5). Previous issue date: 2013 [en]
dc.description.tableofcontents:
Oral Examination Committee Certification
Acknowledgements
Abstract (Chinese)
Abstract
1 Introduction
2 Related Works
2.1 Document Summarization
2.2 Facet Generation and Summarization
3 Problem Formulation
3.1 Problem Specification
4 System Overview
5 Off-line Information Extraction Stage
5.1 Attribute Extraction
5.2 Building Category Vector
6 On-line Query Processing Stage
6.1 Scoring Function S_important(s)
6.1.1 Scoring Function S_query(s)
6.1.2 Scoring Function S_category(s)
6.1.3 Scoring Function S_attr(s)
6.2 Merging Scoring Functions
6.3 Integer Linear Programming Model
6.4 Sentence Compression
7 Experiments
7.1 Data Set
7.2 Baselines
7.3 Evaluation
7.4 Performance Comparison
7.4.1 ROUGE Performance
7.4.2 Time Performance
7.5 Parameter Settings
7.5.1 The Impact of α, β, γ in S_important
7.5.2 Number of Used Attributes
7.5.3 Limit on Number of Attribute Presence
8 Conclusions and Future Work
Bibliography
dc.language.iso: en
dc.subject: 搜尋結果概要 [zh_TW]
dc.subject: Search-Result Summarization [en]
dc.title: 使用類別資訊產生作為查詢詞概觀的網站片段敘述 [zh_TW]
dc.title: Generating Snippets as Query Overview with Category Information [en]
dc.type: Thesis
dc.date.schoolyear: 101-2
dc.description.degree: Master's
dc.contributor.oralexamcommittee: 陳信希 (Hsin-Hsi Chen), 盧文祥 (Wen-Hsiang Lu), 張嘉惠 (Chia-Hui Chang)
dc.subject.keyword: 搜尋結果概要 [zh_TW]
dc.subject.keyword: Search-Result Summarization [en]
dc.relation.page: 35
dc.rights.note: Paid license
dc.date.accepted: 2013-07-18
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science) [zh_TW]
dc.contributor.author-dept: 資訊工程學研究所 (Graduate Institute of Computer Science and Information Engineering) [zh_TW]
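Appears in collections: 資訊工程學系 (Department of Computer Science and Information Engineering)

Files in this item:
File: ntu-102-1.pdf (restricted, not authorized for public access)
Size: 601.62 kB
Format: Adobe PDF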