NTU Theses and Dissertations Repository

Please use this Handle URI to cite this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/80744
Full metadata record
dc.contributor.advisor: 徐宏民 (Winston H. Hsu)
dc.contributor.author: Yan-Ru Wang [en]
dc.contributor.author: 王彥茹 [zh_TW]
dc.date.accessioned: 2022-11-24T03:14:58Z
dc.date.available: 2021-11-06
dc.date.available: 2022-11-24T03:14:58Z
dc.date.copyright: 2021-11-06
dc.date.issued: 2021
dc.date.submitted: 2021-10-14
dc.identifier.citation:
S. Agethen, H.-C. Lee, and W. H. Hsu. Anticipation of human actions with pose-based fine-grained representations. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2019.
J. Carreira and A. Zisserman. Quo vadis, action recognition? A new model and the Kinetics dataset. In CVPR, pages 4724–4733, 2017.
A. Chadha, G. Arora, and N. Kaloty. iPerceive: Applying common-sense reasoning to multi-modal dense video captioning and video question answering. In 2021 IEEE Winter Conference on Applications of Computer Vision (WACV), pages 1–13, 2021.
G. Chen, J. Li, J. Lu, and J. Zhou. Human trajectory prediction via counterfactual analysis. In ICCV, 2021.
D. Epstein, B. Chen, and C. Vondrick. Oops! Predicting unintentional action in video. In CVPR, June 2020.
Y. A. Farha and J. Gall. Uncertainty-aware anticipation of activities. In 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), pages 1197–1204, 2019.
Y. A. Farha, A. Richard, and J. Gall. When will you do what? Anticipating temporal occurrences of activities. In CVPR, pages 5343–5352, 2018.
C. Feichtenhofer, H. Fan, J. Malik, and K. He. SlowFast networks for video recognition. In ICCV, October 2019.
A. Furnari, S. Battiato, and G. M. Farinella. Leveraging uncertainty to rethink loss functions and evaluation measures for egocentric action anticipation. In ECCV Workshops, September 2018.
A. Furnari and G. M. Farinella. What would you expect? Anticipating egocentric actions with rolling-unrolling LSTMs and modality attention. In ICCV, 2019.
H. Gammulle, S. Denman, S. Sridharan, and C. Fookes. Forecasting future action sequences with neural memory networks. In BMVC, 2019.
H. Gammulle, S. Denman, S. Sridharan, and C. Fookes. Predicting the future: A jointly learnt model for action anticipation. In ICCV, October 2019.
J. Gao, Z. Yang, and R. Nevatia. RED: Reinforced encoder-decoder networks for action anticipation. In BMVC, 2017.
R. Girdhar and K. Grauman. Anticipative Video Transformer. In ICCV, 2021.
Q. Ke, M. Fritz, and B. Schiele. Time-conditioned action anticipation in one shot. In CVPR, June 2019.
H. Kuehne, A. B. Arslan, and T. Serre. The language of actions: Recovering the syntax and semantics of goal-directed human activities. In CVPR, 2014.
T. Lan, T.-C. Chen, and S. Savarese. A hierarchical representation for future action prediction. In ECCV, pages 689–704, 2014.
C. Li, S. H. Chan, and Y.-T. Chen. Who make drivers stop? Towards driver-centric risk assessment: Risk object identification via causal inference. In 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 10711–10718, 2020.
M. Liu, S. Tang, Y. Li, and J. M. Rehg. Forecasting human-object interaction: Joint prediction of motor attention and actions in first person video. In ECCV, 2020.
T. Mahmud, M. Hasan, and A. K. Roy-Chowdhury. Joint prediction of activity labels and starting times in untrimmed videos. In ICCV, pages 5784–5793, 2017.
A. Miech, I. Laptev, J. Sivic, H. Wang, L. Torresani, and D. Tran. Leveraging the present to anticipate the future in videos. In CVPR Workshops, June 2019.
R. Morais, V. Le, T. Tran, and S. Venkatesh. Learning to abstract and predict human actions. In BMVC, 2020.
G. Nan, R. Qiao, Y. Xiao, J. Liu, S. Leng, H. Zhang, and W. Lu. Interventional video grounding with dual contrastive learning. In CVPR, 2021.
L. Neumann, A. Zisserman, and A. Vedaldi. Future event prediction: If and when. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pages 2935–2943, 2019.
Y. Ng and B. Fernando. Forecasting future action sequences with attention: A new approach to weakly supervised action forecasting. IEEE Transactions on Image Processing, 2020.
J. Pearl, M. Glymour, and N. P. Jewell. Causal Inference in Statistics: A Primer. John Wiley & Sons, 2016.
J. Pearl and D. Mackenzie. The Book of Why: The New Science of Cause and Effect. Basic Books, 2018.
R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra. Grad-CAM: Visual explanations from deep networks via gradient-based localization. In ICCV, pages 618–626, 2017.
F. Sener, D. Singhania, and A. Yao. Temporal aggregate representations for long-range video understanding. In ECCV, pages 154–171. Springer, 2020.
F. Sener and A. Yao. Zero-shot anticipation for instructional activities. In ICCV, pages 862–871, 2019.
C. Sun, A. Shrivastava, C. Vondrick, R. Sukthankar, K. Murphy, and C. Schmid. Relational action forecasting. In CVPR, June 2019.
D. Surís, R. Liu, and C. Vondrick. Learning the predictability of the future. In CVPR, 2021.
C. Vondrick, H. Pirsiavash, and A. Torralba. Anticipating visual representations from unlabeled video. In CVPR, pages 98–106, June 2016.
T. Wang, J. Huang, H. Zhang, and Q. Sun. Visual commonsense R-CNN. In CVPR, pages 10760–10770, 2020.
Y. Wu, L. Zhu, X. Wang, Y. Yang, and F. Wu. Learning to anticipate egocentric actions by imagination. IEEE Transactions on Image Processing, 30:1143–1152, 2021.
K. Xu, J. Ba, R. Kiros, K. Cho, A. C. Courville, R. Salakhutdinov, R. S. Zemel, and Y. Bengio. Show, attend and tell: Neural image caption generation with visual attention. In ICML, 2015.
X. Yang, F. Feng, W. Ji, M. Wang, and T.-S. Chua. Deconfounded video moment retrieval with causal intervention. In SIGIR '21, 2021.
dc.identifier.uri: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/80744
dc.description.abstract: Action anticipation asks a model to infer the upcoming action from an observed video clip, a capability that matters for many intelligent applications such as autonomous driving and assistive robots. Most current approaches build anticipation models on the features extracted by action recognition models. However, we find that when the anticipation task is learned directly from an action recognition model, the network relies too heavily on the action currently in progress and ignores other important cues in the scene, such as which objects are present. According to the causal theory proposed by Judea Pearl, inferring the causal relation between input and output purely from passively observed correlations can be misled by confounders. We therefore actively intervene in the model's original learning process: before making a prediction, the model must first weigh the likelihood of every possible action, which reduces its over-reliance on the ongoing action at the expense of other information in the video. Experimental results show that our proposed comprehenser mitigates this problem and can be applied on top of different action recognition architectures, in every case yielding better performance. (A notational sketch of this intervention follows the metadata record below.) [zh_TW]
dc.description.provenance: Made available in DSpace on 2022-11-24T03:14:58Z (GMT). No. of bitstreams: 1. U0001-1410202111421700.pdf: 2655556 bytes, checksum: 36a85a87de3d7cd244c133186c046bf4 (MD5). Previous issue date: 2021. [en]
dc.description.tableofcontents:
Verification Letter from the Oral Examination Committee
Acknowledgements
摘要 (Chinese Abstract)
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
Chapter 2 Related Work
2.1 Action Anticipation
2.2 Causality in Vision
Chapter 3 Approach
3.1 Problem Definition
3.2 Causal Intervention
3.3 Comprehenser Module
Chapter 4 Experiments
4.1 Dataset
4.2 Implementation Details
Chapter 5 Results
5.1 Quantitative Analysis
5.2 Qualitative Analysis
Chapter 6 Conclusion
References
Appendix A — Breakfast Full Results
dc.language.iso: en
dc.subject: 因果推斷 (Causal Inference) [zh_TW]
dc.subject: 動作預測 (Action Anticipation) [zh_TW]
dc.subject: Causal Intervention [en]
dc.subject: Action Anticipation [en]
dc.title: 去除動作預測中的混雜因素 (Deconfounded Action Anticipation) [zh_TW]
dc.title: Deconfounded Action Anticipation [en]
dc.date.schoolyear: 109-2
dc.description.degree: 碩士 (Master)
dc.contributor.oralexamcommittee: 陳文進 (Hsin-Tsai Liu), 葉梅珍 (Chih-Yang Tseng), 陳奕廷, 余能豪
dc.subject.keyword: 動作預測 (Action Anticipation), 因果推斷 (Causal Inference) [zh_TW]
dc.subject.keyword: Action Anticipation, Causal Intervention [en]
dc.relation.page: 24
dc.identifier.doi: 10.6342/NTU202103717
dc.rights.note: 同意授權(限校園內公開) (Authorization granted; access restricted to campus)
dc.date.accepted: 2021-10-15
dc.contributor.author-college: 電機資訊學院 (College of Electrical Engineering and Computer Science) [zh_TW]
dc.contributor.author-dept: 資訊網路與多媒體研究所 (Graduate Institute of Networking and Multimedia) [zh_TW]
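
The causal intervention summarized in the abstract follows Pearl's do-calculus. As a minimal notational sketch, assuming the thesis applies the standard backdoor adjustment (the record itself does not reproduce the formula), intervention replaces the passively observed conditional P(Y | X) with

P(Y | do(X)) = \sum_z P(Y | X, z) P(z)

where X is the observed video, Y is the anticipated future action, and z ranges over the confounder, here plausibly the category of the ongoing action. Averaging over z with its prior P(z), rather than with its correlation-driven posterior, blocks the backdoor path through the confounder and forces the model to weigh every candidate action before predicting.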
Appears in Collections: 資訊網路與多媒體研究所 (Graduate Institute of Networking and Multimedia)

Files in This Item:
File: U0001-1410202111421700.pdf
Size: 2.59 MB
Format: Adobe PDF
Access: Restricted to NTU campus IP addresses (off-campus users should connect through the library VPN service)