Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/80744

Full metadata record
| DC 欄位 | 值 | 語言 |
|---|---|---|
| dc.contributor.advisor | 徐宏民(Winston H. Hsu) | |
| dc.contributor.author | Yan-Ru Wang | en |
| dc.contributor.author | 王彥茹 | zh_TW |
| dc.date.accessioned | 2022-11-24T03:14:58Z | - |
| dc.date.available | 2021-11-06 | |
| dc.date.available | 2022-11-24T03:14:58Z | - |
| dc.date.copyright | 2021-11-06 | |
| dc.date.issued | 2021 | |
| dc.date.submitted | 2021-10-14 | |
| dc.identifier.citation | S. Agethen, H.-C. Lee, and W. H. Hsu. Anticipation of human actions with pose-based fine-grained representations. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2019. J. Carreira and A. Zisserman. Quo vadis, action recognition? A new model and the Kinetics dataset. In CVPR, pages 4724–4733, 2017. A. Chadha, G. Arora, and N. Kaloty. iPerceive: Applying common-sense reasoning to multi-modal dense video captioning and video question answering. In 2021 IEEE Winter Conference on Applications of Computer Vision (WACV), pages 1–13, 2021. G. Chen, J. Li, J. Lu, and J. Zhou. Human trajectory prediction via counterfactual analysis. In ICCV, 2021. D. Epstein, B. Chen, and C. Vondrick. Oops! Predicting unintentional action in video. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020. Y. A. Farha and J. Gall. Uncertainty-aware anticipation of activities. In 2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW), pages 1197–1204, 2019. Y. A. Farha, A. Richard, and J. Gall. When will you do what? Anticipating temporal occurrences of activities. In CVPR, pages 5343–5352, 2018. C. Feichtenhofer, H. Fan, J. Malik, and K. He. SlowFast networks for video recognition. In ICCV, October 2019. A. Furnari, S. Battiato, and G. M. Farinella. Leveraging uncertainty to rethink loss functions and evaluation measures for egocentric action anticipation. In ECCV Workshops, September 2018. A. Furnari and G. M. Farinella. What would you expect? Anticipating egocentric actions with rolling-unrolling LSTMs and modality attention. In ICCV, 2019. H. Gammulle, S. Denman, S. Sridharan, and C. Fookes. Forecasting future action sequences with neural memory networks. In BMVC, 2019. H. Gammulle, S. Denman, S. Sridharan, and C. Fookes. Predicting the future: A jointly learnt model for action anticipation. In ICCV, October 2019. J. Gao, Z. Yang, and R. Nevatia. RED: Reinforced encoder-decoder networks for action anticipation. In BMVC, 2017. R. Girdhar and K. Grauman. Anticipative Video Transformer. In ICCV, 2021. Q. Ke, M. Fritz, and B. Schiele. Time-conditioned action anticipation in one shot. In CVPR, June 2019. H. Kuehne, A. B. Arslan, and T. Serre. The language of actions: Recovering the syntax and semantics of goal-directed human activities. In CVPR, 2014. T. Lan, T.-C. Chen, and S. Savarese. A hierarchical representation for future action prediction. In ECCV, pages 689–704, 2014. C. Li, S. H. Chan, and Y.-T. Chen. Who make drivers stop? Towards driver-centric risk assessment: Risk object identification via causal inference. In 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 10711–10718, 2020. M. Liu, S. Tang, Y. Li, and J. M. Rehg. Forecasting human-object interaction: Joint prediction of motor attention and actions in first person video. In ECCV, 2020. T. Mahmud, M. Hasan, and A. K. Roy-Chowdhury. Joint prediction of activity labels and starting times in untrimmed videos. In ICCV, pages 5784–5793, 2017. A. Miech, I. Laptev, J. Sivic, H. Wang, L. Torresani, and D. Tran. Leveraging the present to anticipate the future in videos. In CVPR Workshops, June 2019. R. Morais, V. Le, T. Tran, and S. Venkatesh. Learning to abstract and predict human actions. In BMVC, 2020. G. Nan, R. Qiao, Y. Xiao, J. Liu, S. Leng, H. Zhang, and W. Lu. Interventional video grounding with dual contrastive learning. In CVPR, 2021. L. Neumann, A. Zisserman, and A. Vedaldi. Future event prediction: If and when. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pages 2935–2943, 2019. Y. Ng and B. Fernando. Forecasting future action sequences with attention: A new approach to weakly supervised action forecasting. IEEE Transactions on Image Processing, 2020. J. Pearl, M. Glymour, and N. P. Jewell. Causal Inference in Statistics: A Primer. John Wiley & Sons, 2016. J. Pearl and D. Mackenzie. The Book of Why: The New Science of Cause and Effect. Basic Books, 2018. R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra. Grad-CAM: Visual explanations from deep networks via gradient-based localization. In ICCV, pages 618–626, 2017. F. Sener, D. Singhania, and A. Yao. Temporal aggregate representations for long-range video understanding. In ECCV, pages 154–171. Springer, 2020. F. Sener and A. Yao. Zero-shot anticipation for instructional activities. In ICCV, pages 862–871, 2019. C. Sun, A. Shrivastava, C. Vondrick, R. Sukthankar, K. Murphy, and C. Schmid. Relational action forecasting. In CVPR, June 2019. D. Surís, R. Liu, and C. Vondrick. Learning the predictability of the future. In CVPR, 2021. C. Vondrick, H. Pirsiavash, and A. Torralba. Anticipating visual representations from unlabeled video. In CVPR, pages 98–106, Los Alamitos, CA, USA, June 2016. IEEE Computer Society. T. Wang, J. Huang, H. Zhang, and Q. Sun. Visual commonsense R-CNN. In CVPR, pages 10760–10770, 2020. Y. Wu, L. Zhu, X. Wang, Y. Yang, and F. Wu. Learning to anticipate egocentric actions by imagination. IEEE Transactions on Image Processing, 30:1143–1152, 2021. K. Xu, J. Ba, R. Kiros, K. Cho, A. C. Courville, R. Salakhutdinov, R. S. Zemel, and Y. Bengio. Show, attend and tell: Neural image caption generation with visual attention. In ICML, 2015. X. Yang, F. Feng, W. Ji, M. Wang, and T.-S. Chua. Deconfounded video moment retrieval with causal intervention. In SIGIR '21, 2021. | |
| dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/80744 | - |
| dc.description.abstract | "Action anticipation requires a model to infer the upcoming action from an observed video clip, a capability essential to many intelligent applications such as autonomous driving and assistive robots. Most current approaches build anticipation models on features extracted by action-recognition models. However, we find that when a model learns action anticipation directly on top of action recognition, it comes to rely excessively on the action currently taking place and ignores other important cues in the scene, such as which objects are present. According to Judea Pearl's theory of causality, inferring a causal relationship purely from the passively observed correlation between input and output is vulnerable to misleading confounders. By actively intervening in the model's original learning process, we require it to weigh the likelihood of each possible action before making a prediction, reducing its over-reliance on the ongoing action at the expense of other information in the video. Experiments show that our proposed comprehenser alleviates this problem and can be applied on top of different action-recognition architectures, yielding better performance in every case." | en |
| dc.description.provenance | Made available in DSpace on 2022-11-24T03:14:58Z (GMT). No. of bitstreams: 1 U0001-1410202111421700.pdf: 2655556 bytes, checksum: 36a85a87de3d7cd244c133186c046bf4 (MD5) Previous issue date: 2021 | en |
| dc.description.tableofcontents | Verification Letter from the Oral Examination Committee i Acknowledgements ii 摘要 iii Abstract iv Contents vi List of Figures viii List of Tables ix Chapter 1 Introduction 1 Chapter 2 Related Work 5 2.1 Action Anticipation 5 2.2 Causality in Vision 6 Chapter 3 Approach 7 3.1 Problem Definition 7 3.2 Causal Intervention 8 3.3 Comprehenser Module 10 Chapter 4 Experiments 12 4.1 Dataset 12 4.2 Implementation Details 12 Chapter 5 Results 14 5.1 Quantitative Analysis 14 5.2 Qualitative Analysis 15 Chapter 6 Conclusion 17 References 18 Appendix A — Breakfast Full Results 23 | |
| dc.language.iso | en | |
| dc.subject | 因果推斷 | zh_TW |
| dc.subject | 動作預測 | zh_TW |
| dc.subject | Causal Intervention | en |
| dc.subject | Action Anticipation | en |
| dc.title | 去除動作預測中的混雜因素 | zh_TW |
| dc.title | Deconfounded Action Anticipation | en |
| dc.date.schoolyear | 109-2 | |
| dc.description.degree | Master's (碩士) | |
| dc.contributor.oralexamcommittee | 陳文進,葉梅珍,陳奕廷,余能豪 | |
| dc.subject.keyword | 動作預測, 因果推斷 | zh_TW |
| dc.subject.keyword | Action Anticipation, Causal Intervention | en |
| dc.relation.page | 24 | |
| dc.identifier.doi | 10.6342/NTU202103717 | |
| dc.rights.note | Authorized for release (campus access only) | |
| dc.date.accepted | 2021-10-15 | |
| dc.contributor.author-college | 電機資訊學院 | zh_TW |
| dc.contributor.author-dept | 資訊網路與多媒體研究所 | zh_TW |
| Appears in Collections: | 資訊網路與多媒體研究所 |
Files in This Item:
| File | Size | Format |
|---|---|---|
| U0001-1410202111421700.pdf (access restricted to NTU campus IPs; use the VPN service for off-campus access) | 2.59 MB | Adobe PDF |
All items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.
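
The abstract describes deconfounding action anticipation via causal intervention. As a minimal, illustrative sketch only — assuming the standard backdoor-adjustment formula from Pearl's causal-inference framework, P(Y | do(X)) = Σ_z P(Y | X, z)·P(z), and not the thesis's actual comprehenser module or data:

```python
import numpy as np

# Toy sketch of backdoor adjustment. Here z plays the role of the
# confounder (e.g., the ongoing action), X the observed video evidence,
# and Y the future action. All numbers are illustrative.

def backdoor_adjust(p_y_given_xz: np.ndarray, p_z: np.ndarray) -> np.ndarray:
    """Deconfounded P(Y | do(X)).

    p_y_given_xz: shape (num_x, num_z, num_y), conditional P(Y | X, z).
    p_z: shape (num_z,), marginal prior over the confounder z.
    Returns: shape (num_x, num_y), P(Y | do(X)).
    """
    # Average over the confounder with its *marginal* prior P(z),
    # instead of the observational P(z | X) that plain conditioning uses.
    return np.einsum("xzy,z->xy", p_y_given_xz, p_z)

# One observation x, binary confounder z, binary future action y.
p_y_given_xz = np.array([[[0.9, 0.1],    # P(Y | x0, z0)
                          [0.2, 0.8]]])  # P(Y | x0, z1)
p_z = np.array([0.7, 0.3])               # marginal prior over z
print(backdoor_adjust(p_y_given_xz, p_z))  # [[0.69 0.31]]
```

Intuitively, weighting each stratum of the confounder by its marginal prior forces the prediction to consider every possible ongoing action, echoing the abstract's intervention on the model's learning.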
