Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93566
Title: 圖對比學習的自適應資料擴增架構
Self-Adaptive Data Augmentation Framework for Graph Contrastive Learning
Authors: 柯冠宇
Kuan-Yu Ko
Advisor: 郭斯彥
Sy-Yen Kuo
Keyword: machine learning, self-supervised learning, graph neural networks, graph contrastive learning, data augmentation
Publication Year: 2024
Degree: Master's
Abstract: Graph contrastive learning (GCL) has emerged as a prominent self-supervised learning method. Its efficacy often hinges on generating positive samples through data augmentation. Unfortunately, applying data augmentation to graphs is not intuitive: inappropriate augmentation methods may destroy the graph structure, leading to poor model performance. Developing an augmentation method that preserves the semantics of the graph, or alternatively a GCL method that does not rely on augmentation, is therefore a significant challenge in this domain.
In this thesis, we propose a novel framework that is compatible with any data augmentation method and adapts on its own: it excludes augmented data whose graph structure has been destroyed, producing a new dataset that combines the original data with the augmented data whose semantics were preserved. Specifically, we feed a batch of original data and the corresponding augmented data into a trained model, compute the L2 norm between the representations of the two batches, and collect the graphs with the smaller L2 norms into a new dataset. This is motivated by the observation that, for a trained model, graphs of the same class whose structure was not corrupted by augmentation yield representations that lie close together in the latent space. We then train a new model on this refined dataset and show that it not only outperforms a model trained on the original dataset, but also achieves accuracy competitive with or better than state-of-the-art methods.
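To make the selection step concrete, below is a minimal sketch of the representation-distance filter described in the abstract. It assumes a PyTorch-style trained GNN encoder that maps a batch of graphs to a (batch_size, dim) tensor of representations; the function name filter_augmented_graphs, the keep_ratio parameter, and the fixed-ratio top-k selection rule are illustrative assumptions, not the exact procedure of the thesis.

    import torch

    def filter_augmented_graphs(encoder, original_batch, augmented_batch, keep_ratio=0.5):
        """Keep augmented graphs whose representations stay close to the originals.

        encoder         -- trained GNN mapping a batch of graphs to a (B, d) tensor
        original_batch  -- batched original graphs
        augmented_batch -- the same graphs after data augmentation
        keep_ratio      -- fraction of augmented graphs to keep (hypothetical knob)
        """
        encoder.eval()
        with torch.no_grad():
            z_orig = encoder(original_batch)   # (B, d) representations of originals
            z_aug = encoder(augmented_batch)   # (B, d) representations of augmentations

        # Per-graph L2 norm between original and augmented representations.
        dist = torch.norm(z_orig - z_aug, p=2, dim=1)  # shape (B,)

        # A small distance suggests the augmentation preserved the graph's
        # semantics, so keep the graphs whose representations moved the least.
        k = max(1, int(keep_ratio * dist.numel()))
        return torch.topk(dist, k, largest=False).indices

The refined dataset would then be the union of the original graphs and the augmented graphs selected by the returned indices, on which a new model is trained from scratch.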
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/93566
DOI: 10.6342/NTU202401171
Fulltext Rights: Not authorized
Appears in Collections: Graduate Institute of Electronics Engineering

Files in This Item:
File: ntu-112-2.pdf (Restricted Access)
Size: 633.67 kB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
