Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/48387
| Title: | 自動化標籤擴充及預測於社會媒體檢索之應用 Automatic Tag Expansion and Prediction for Social Media Retrieval |
| Authors: | Ming-Hung Hsu 許名宏 |
| Advisor: | 陳信希(Hsin-Hsi Chen) |
| Keyword: | Social Annotation, Social Bookmarking, Social Media, Social Tagging, Tag Expansion, Tag Prediction, Web Search |
| Publication Year: | 2011 |
| Degree: | Doctoral |
| Abstract: | Social annotation has become a popular way for Web users to organize the resources they need. The rapidly growing volume of annotations has attracted extensive research on tagging behavior and on the applicability of social tags. Although social annotation has many potential applications, the cold-start problem limits its usefulness, and the effect is especially severe for the new resources that constantly appear on the Web.
Because Web pages are the major medium in social bookmarking systems, this dissertation presents a comprehensive study of automatic tag prediction and tag expansion for Web pages. We first explore the stabilizing phenomenon of tag usage in a social bookmarking system. We then propose a two-stage tag prediction approach that is both efficient and effective; when a few user annotations are available, the approach can use those tags to further improve prediction quality. In the first stage, content-based ranking (CBR), candidate tags are selected and ranked to generate an initial tag list. In the second stage, random-walk re-ranking (RWR), a random-walk model uses tag co-occurrence information to re-rank the initial list, which conceptually generalizes the top tags of the initial ranking. The experimental results show that the algorithm effectively proposes appropriate tags for target Web pages.
In addition to the two-stage approach to tag prediction, we present a framework, named OBR, that incorporates tag prediction into general Web search effectively and robustly. Experimental results on the Open Directory Project (ODP) dataset show that the OBR framework significantly enhances BM25, a classical retrieval model. To retrieve images, another popular type of social media, we introduce a novel knowledge-based approach that expands the tags or text descriptions of target images. The approach uses the entity-entity spatial relations defined in ConceptNet, currently the largest commonsense knowledge base. The experimental results show that the proposed expansion approach can enhance text-based image retrieval. To explore potential directions for advancing present tag prediction and expansion methods, we also investigate a framework for dynamic estimation of tag-tag correlation. Although the framework is only a preliminary attempt to detect changes in users' interests over time, the experimental results indicate that it slightly improves the performance of tag expansion. Its heavy computational cost, however, remains a target for future improvement. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/48387 |
| Fulltext Rights: | Licensed for a fee |
| Appears in Collections: | Department of Computer Science and Information Engineering |
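
The abstract describes a second stage in which a random-walk model uses tag co-occurrence to re-rank an initial tag list. The record does not give the dissertation's actual formulation, so the following is only a minimal sketch of one common variant, a random walk with restart. The function name `rerank_with_random_walk`, the `restart` and `iterations` parameters, and the toy data are all hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def rerank_with_random_walk(initial_scores, cooccurrence, restart=0.5, iterations=50):
    """Re-rank tags with a random walk with restart over a co-occurrence graph.

    Illustrative assumption, not the dissertation's model.
    initial_scores: dict mapping tag -> non-negative first-stage score.
    cooccurrence: dict mapping (tag_a, tag_b) -> co-occurrence count,
                  treated as an undirected edge weight.
    Returns a list of (tag, score) pairs sorted by descending score.
    """
    # Build a symmetric weighted adjacency map from the co-occurrence counts.
    neighbors = defaultdict(dict)
    for (a, b), count in cooccurrence.items():
        if a == b or count <= 0:
            continue
        neighbors[a][b] = neighbors[a].get(b, 0.0) + count
        neighbors[b][a] = neighbors[b].get(a, 0.0) + count

    tags = set(initial_scores) | set(neighbors)
    total = sum(initial_scores.values())
    if total <= 0:
        raise ValueError("initial_scores must contain a positive score")

    # The normalized first-stage scores act as the restart distribution.
    prior = {t: initial_scores.get(t, 0.0) / total for t in tags}

    scores = dict(prior)
    for _ in range(iterations):
        # Every step returns a `restart` share of the mass to the prior.
        updated = {t: restart * prior[t] for t in tags}
        for t in tags:
            out = neighbors.get(t)
            if not out:
                # A tag with no edges sends its remaining mass back to the prior.
                for u in tags:
                    updated[u] += (1 - restart) * scores[t] * prior[u]
                continue
            weight_sum = sum(out.values())
            for u, w in out.items():
                # Move mass to neighbors in proportion to co-occurrence weight.
                updated[u] += (1 - restart) * scores[t] * w / weight_sum
        scores = updated

    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Toy data: "programming" is absent from the first-stage list but
    # co-occurs with both initial tags, so the walk gives it a score.
    first_stage = {"python": 0.6, "tutorial": 0.4}
    counts = {
        ("python", "programming"): 5,
        ("tutorial", "programming"): 3,
        ("python", "tutorial"): 2,
    }
    for tag, score in rerank_with_random_walk(first_stage, counts):
        print(f"{tag}\t{score:.3f}")
```

In this sketch a tag that the first stage missed can still receive a score through its co-occurring neighbors, which is one way to read the abstract's remark that re-ranking generalizes the top tags of the initial ranking.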
Files in This Item:
| File | Size | Format | |
|---|---|---|---|
| ntu-100-1.pdf (Restricted Access) | 784.74 kB | Adobe PDF | |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
