Editing Relationship Knowledge Without Retraining: A Case Study on Domain-Specific VQA (FIT-RSRC)

許家銓; Chia-Chuan Hsu

Please use this Handle URI to cite this document: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101907

Title:	Editing Relationship Knowledge Without Retraining: A Case Study on Domain-Specific VQA (FIT-RSRC)
Author:	許家銓 Chia-Chuan Hsu
Advisor:	莊永裕 Yung-Yu Chuang
Keywords:	Model Editing, Multimodal Learning, Remote Sensing, Visual Question Answering, Cross-Modal Transferability
Year of publication:	2026
Degree:	Master's
Abstract:	This thesis studies relationship knowledge editing without retraining for domain-specific remote-sensing visual question answering (VQA). Using attention-ratio diagnostics on a curated STAR subset, we identify a grounding–reasoning decoupling: even when a model attends to the correct regions, it may still produce biased relationship labels. We then recast relationship reasoning as a text-only scenario task and apply ROME-style localized updates, which reveals conjugate interference and a strong sensitivity of stability to the order of multiple edits. Finally, transfer tests indicate that language-side semantic edits do not reliably carry over to multimodal VQA inference, highlighting a key limit of cross-modal generalization.
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/101907
DOI:	10.6342/NTU202600393
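Full-text authorization:	Not authorized
Electronic full-text release date:	N/A
Appears in collections:	Department of Computer Science and Information Engineering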

Files in this item:

File	Size	Format
ntu-114-1.pdf (not authorized for public access)	1.73 MB	Adobe PDF

Items in this system are protected by copyright, with all rights reserved, unless their copyright terms are explicitly stated otherwise.