Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/6398
Title: 利用語彙、句法以及語義資訊偵測網路抄襲
Online Plagiarized Detection Through Exploiting Lexical, Syntactic, and Semantic Information
Authors: Wan-Yu Lin
林琬瑜
Advisor: 林守德(Shou-De Lin)
Keywords: Plagiarism Detection, Lexical, Syntactic, Semantic
Publication Year: 2013
Degree: Master's
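Abstract: Many traditional plagiarism detection systems focus only on the lexical statistics of a document; at most they also consider syntactic structure or use WordNet to capture semantic information, and most of them work offline. Our system instead integrates a search engine and draws on structural features at three levels: lexical, syntactic, and semantic. From each pair of suspicious sentences it extracts the lexical duplication rate, reordering rate, and continuity; the part-of-speech (POS) and phrase tags of the words in each sentence; and the latent topics assigned by Latent Dirichlet Allocation (LDA), which stand in for the semantic information the sentences may carry. These six plagiarism detection models are combined, and their predictions are merged with a weighting method of our own design, yielding an online system that detects web plagiarism automatically. Experimental results show that, for both English and Chinese documents, the system successfully detects a considerable number of likely plagiarism sources, and that it outperforms several current state-of-the-art algorithms.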
In this paper, we introduce a framework that identifies sentence-level and document-level online plagiarism by exploiting lexical, syntactic, and semantic features, including n-gram duplication, word reordering and alignment, POS and phrase tags, and the semantic similarity of sentences. We also enhance plagiarism detection by establishing an ensemble framework that combines the prediction scores of the individual models. Experiments performed on English and Chinese corpora demonstrate that our system not only finds a considerable number of real-world online plagiarism cases but also outperforms several state-of-the-art algorithms.
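As a rough illustration of two ideas named in the abstract, n-gram duplication between a suspicious sentence and a candidate source, and a weighted combination of per-model scores, here is a minimal Python sketch. It is not the thesis's implementation: the function names, the choice of bigrams, the multiset-overlap definition, and the normalized weighted average are all assumptions made for this example, and the thesis's own weighting method is not described in this record.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_duplication(suspect, source, n=2):
    """Fraction of the suspect's n-grams that also occur in the source.

    Hypothetical stand-in for a lexical duplication feature; counts are
    compared as multisets so repeated n-grams are not over-credited.
    """
    suspect_counts = Counter(ngrams(suspect, n))
    source_counts = Counter(ngrams(source, n))
    total = sum(suspect_counts.values())
    if total == 0:
        return 0.0
    shared = sum((suspect_counts & source_counts).values())
    return shared / total


def weighted_ensemble(scores, weights):
    """Combine per-model scores with a normalized weighted average.

    Assumed combination rule for illustration only.
    """
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


if __name__ == "__main__":
    suspect = "the quick brown fox".split()
    source = "the quick brown dog".split()
    lexical = ngram_duplication(suspect, source)
    # The semantic score and both weights below are made-up placeholders.
    combined = weighted_ensemble(
        {"lexical": lexical, "semantic": 0.5},
        {"lexical": 2.0, "semantic": 1.0},
    )
    print(lexical, combined)
```

In a fuller system, each of the six models mentioned in the abstract would contribute its own score to the dictionary passed to `weighted_ensemble`, with weights tuned on labeled data.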
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/6398
Fulltext Rights: Authorization granted (open access worldwide)
Appears in Collections: Graduate Institute of Networking and Multimedia

Files in This Item:
File: ntu-102-1.pdf
Size: 1.36 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
