Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96419
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 楊鈞澔 | zh_TW |
dc.contributor.advisor | Chun-Hao Yang | en |
dc.contributor.author | 陳貞諺 | zh_TW |
dc.contributor.author | Zhen-Yan Chen | en |
dc.date.accessioned | 2025-02-13T16:23:13Z | - |
dc.date.available | 2025-02-14 | - |
dc.date.copyright | 2025-02-13 | - |
dc.date.issued | 2025 | - |
dc.date.submitted | 2025-02-07 | - |
dc.identifier.citation | Barp, A., Briol, F.-X., Duncan, A., Girolami, M., and Mackey, L. (2019). Minimum Stein discrepancy estimators. In Wallach, H., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E., and Garnett, R., editors, Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc.
Bartholomew, D., Knott, M., and Moustaki, I. (2011). Latent Variable Models and Factor Analysis: A Unified Approach. Wiley Series in Probability and Statistics. John Wiley & Sons, Ltd, first edition.
Casella, G., and George, E. I. (1992). Explaining the Gibbs sampler. The American Statistician, 46(3):167–174.
Dhaka, A. K., Catalina, A., Welandawe, M., Andersen, M. R., Huggins, J., and Vehtari, A. (2021). Challenges and opportunities in high dimensional variational inference. In Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P., and Vaughan, J. W., editors, Advances in Neural Information Processing Systems, volume 34, pages 7787–7798. Curran Associates, Inc.
Dieng, A. B., Tran, D., Ranganath, R., Paisley, J., and Blei, D. (2017). Variational inference via χ upper bound minimization. In Guyon, I., Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., and Garnett, R., editors, Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc.
Hastings, W. K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57(1):97–109.
Hodgkinson, L., Salomone, R., and Roosta, F. (2021). The reproducing Stein kernel approach for post-hoc corrected sampling.
Hoffman, M. D., Blei, D. M., Wang, C., and Paisley, J. (2013). Stochastic variational inference. Journal of Machine Learning Research, 14(5):1303–1347.
Jordan, M., Ghahramani, Z., Jaakkola, T., and Saul, L. (1999). An introduction to variational methods for graphical models. Machine Learning, 37:183–233.
Kanagawa, H., Jitkrittum, W., Mackey, L., Fukumizu, K., and Gretton, A. (2023). A kernel Stein test for comparing latent variable models. Journal of the Royal Statistical Society Series B: Statistical Methodology, 85(3):986–1011.
Knoblauch, J., Jewson, J., and Damoulas, T. (2022). An optimization-centric view on Bayes' rule: Reviewing and generalizing variational inference. Journal of Machine Learning Research, 23(132):1–109.
Koller, D., and Friedman, N. (2009). Probabilistic Graphical Models: Principles and Techniques. Adaptive Computation and Machine Learning. The MIT Press.
Korba, A., Aubin-Frankowski, P.-C., Majewski, S., and Ablin, P. (2021). Kernel Stein discrepancy descent. In Meila, M., and Zhang, T., editors, Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, pages 5719–5730. PMLR.
Li, Y., and Turner, R. E. (2016). Rényi divergence variational inference. In Lee, D., Sugiyama, M., Luxburg, U., Guyon, I., and Garnett, R., editors, Advances in Neural Information Processing Systems, volume 29. Curran Associates, Inc.
Liu, Q., Lee, J., and Jordan, M. (2016). A kernelized Stein discrepancy for goodness-of-fit tests. In Balcan, M. F., and Weinberger, K. Q., editors, Proceedings of The 33rd International Conference on Machine Learning, volume 48 of Proceedings of Machine Learning Research, pages 276–284, New York, New York, USA. PMLR.
Liu, Q., and Wang, D. (2016). Stein variational gradient descent: A general purpose Bayesian inference algorithm. In Lee, D., Sugiyama, M., Luxburg, U., Guyon, I., and Garnett, R., editors, Advances in Neural Information Processing Systems, volume 29. Curran Associates, Inc.
Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H., and Teller, E. (1953). Equation of state calculations by fast computing machines. The Journal of Chemical Physics, 21(6):1087–1092.
Minka, T. P. (2001). Expectation propagation for approximate Bayesian inference. In Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence, UAI'01, pages 362–369, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.
Modi, C., Gower, R., Margossian, C., Yao, Y., Blei, D., and Saul, L. (2023). Variational inference with Gaussian score matching. In Oh, A., Naumann, T., Globerson, A., Saenko, K., Hardt, M., and Levine, S., editors, Advances in Neural Information Processing Systems, volume 36, pages 29935–29950. Curran Associates, Inc.
Murphy, K. P., Weiss, Y., and Jordan, M. I. (1999). Loopy belief propagation for approximate inference: An empirical study. In Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, UAI'99, pages 467–475, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.
Naesseth, C., Lindsten, F., and Blei, D. (2020). Markovian score climbing: Variational inference with KL(p||q). In Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M., and Lin, H., editors, Advances in Neural Information Processing Systems, volume 33, pages 15499–15510. Curran Associates, Inc.
Nguyen, D. (2023). An in-depth introduction to variational Bayes note. Available at https://ssrn.com/abstract=4541076 or http://dx.doi.org/10.2139/ssrn.4541076.
Platt, J., and Barr, A. (1987). Constrained differential optimization. In Anderson, D., editor, Neural Information Processing Systems, volume 0. American Institute of Physics.
Ranganath, R., Gerrish, S., and Blei, D. (2014). Black box variational inference. In Kaski, S., and Corander, J., editors, Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics, volume 33 of Proceedings of Machine Learning Research, pages 814–822, Reykjavik, Iceland. PMLR.
Salakhutdinov, R. (2015). Learning deep generative models. Annual Review of Statistics and Its Application, 2:361–385.
Wainwright, M. J., and Jordan, M. I. (2008). Graphical models, exponential families, and variational inference. Foundations and Trends in Machine Learning, 1(1–2):1–305.
Xu, W., and Matsuda, T. (2020). A Stein goodness-of-fit test for directional distributions. In Chiappa, S., and Calandra, R., editors, Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, volume 108 of Proceedings of Machine Learning Research, pages 320–330. PMLR.
Yang, Y., Martin, R., and Bondell, H. (2019). Variational approximations using Fisher divergence. | - |
dc.identifier.uri | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96419 | - |
dc.description.abstract | 近年來,近似難以處理的機率分佈已成為重要的研究課題。針對此問題,目前主要有兩類技術:馬可夫鏈蒙地卡羅(MCMC)與變分推論(VI)。然而,MCMC 方法計算成本高,對大規模資料集並不實用;相比之下,VI 方法受到越來越多的關注。傳統變分方法透過最小化目標分佈與一個相對簡單的參數化分佈族(變分族)之間的 Kullback–Leibler(KL)散度,因而受到變分族的限制,並有過度簡化的風險。本研究提出一種基於核化斯坦因差異的變分推論方法(KSD-VI),旨在緩解傳統 VI 的限制,並進一步結合分數匹配原則(KSDSM-VI)以解決過度簡化的問題。 | zh_TW |
dc.description.abstract | Approximating intractable probability distributions has recently become an important research problem. Two main classes of techniques address it: Markov chain Monte Carlo (MCMC) and variational inference (VI). MCMC methods, however, are computationally expensive and impractical for large datasets, so VI has received increasing interest. Traditional variational inference minimizes the Kullback-Leibler (KL) divergence between the target distribution and a relatively simple parametric family (the variational family), and therefore suffers from the restrictions of the variational family and the risk of oversimplification. This research proposes a variational inference method based on the kernelized Stein discrepancy (KSD-VI) to overcome the restrictions of traditional VI, and further integrates the score-matching principle (KSDSM-VI) to address oversimplification. (A minimal illustrative sketch of the KSD estimator is given after the metadata table below.) | en |
dc.description.provenance | Submitted by admin ntu (admin@lib.ntu.edu.tw) on 2025-02-13T16:23:13Z No. of bitstreams: 0 | en |
dc.description.provenance | Made available in DSpace on 2025-02-13T16:23:13Z (GMT). No. of bitstreams: 0 | en |
dc.description.tableofcontents | 致謝 i
摘要 iii
Abstract v
Contents vii
List of Figures ix
List of Tables xi
Chapter 1 Introduction 1
1.1 Intractable Probability Distribution 1
1.2 Kernelized Stein Discrepancy (KSD) 3
1.3 Motivation 4
Chapter 2 Preliminaries 5
2.1 Variational Inference 5
2.2 Dissimilarity Measure 7
2.2.1 Kullback-Leibler Divergence 8
2.2.2 Fisher Divergence 9
2.2.3 Kernelized Stein Discrepancy 10
2.3 Optimization 11
2.3.1 Parameter-Based Schemes 11
2.3.1.1 Deterministic Optimization 11
2.3.1.2 Stochastic Optimization 12
2.3.2 Particle-Based Schemes 13
Chapter 3 Methods 15
3.1 KSD-VI 15
3.2 KSDSM-VI 19
Chapter 4 Simulations 23
4.1 Comparison Between Fisher-VI, KL-VI, and KSD-VI 23
Chapter 5 Conclusion and Discussion 29
References 31
Appendix A — Calculation 37
A.1 Inclusive KSD with Univariate Gaussian Settings 37 | - |
dc.language.iso | en | - |
dc.title | 具有核化斯坦因差異和分數匹配的變分推論 | zh_TW |
dc.title | Variational Inference with Kernelized Stein Discrepancy and Score Matching | en |
dc.type | Thesis | - |
dc.date.schoolyear | 113-1 | - |
dc.description.degree | 碩士 (Master's) | - |
dc.contributor.oralexamcommittee | 陳裕庭;張升懋 | zh_TW |
dc.contributor.oralexamcommittee | Yu-Ting Chen;Sheng-Mao Chang | en |
dc.subject.keyword | 變分推論,近似貝式推論,馬可夫鏈蒙地卡羅,核化斯坦因差異,分數匹配 | zh_TW |
dc.subject.keyword | Variational Inference, Approximate Bayesian Inference, Markov chain Monte Carlo, Kernelized Stein Discrepancy, Score Matching | en |
dc.relation.page | 38 | - |
dc.identifier.doi | 10.6342/NTU202500486 | - |
dc.rights.note | 同意授權(限校園內公開) (Authorized for release; access restricted to campus) | - |
dc.date.accepted | 2025-02-07 | - |
dc.contributor.author-college | 理學院 (College of Science) | - |
dc.contributor.author-dept | 統計與數據科學研究所 (Institute of Statistics and Data Science) | - |
dc.date.embargo-lift | 2025-02-14 | - |
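For readers who want a concrete picture of the kernelized Stein discrepancy the abstract refers to, the following is a minimal, self-contained sketch, not the thesis's implementation. It estimates the squared KSD between a sample set and a target distribution via the standard U-statistic with an RBF kernel, using only the target's score function (the gradient of the log density, available even when the normalizing constant is not). The bandwidth `h`, the function names, and the standard-normal toy example are illustrative assumptions.

```python
import numpy as np

def ksd_squared(samples, score_fn, h=1.0):
    """U-statistic estimate of the squared kernelized Stein discrepancy
    between the empirical distribution of `samples` and a target p,
    given the target's score function s(x) = grad_x log p(x).
    Uses an RBF kernel k(x, y) = exp(-||x - y||^2 / (2 h^2))."""
    n, d = samples.shape
    scores = score_fn(samples)                          # (n, d) score at each sample
    diffs = samples[:, None, :] - samples[None, :, :]   # (n, n, d) pairwise x_i - x_j
    sqdists = np.sum(diffs ** 2, axis=-1)               # (n, n) squared distances
    K = np.exp(-sqdists / (2 * h ** 2))                 # kernel matrix

    # Stein kernel u_p(x_i, x_j), assembled term by term:
    term1 = (scores @ scores.T) * K                     # s(x_i)^T s(x_j) k(x_i, x_j)
    # For the RBF kernel: grad_{x_j} k =  k (x_i - x_j) / h^2,
    #                     grad_{x_i} k = -k (x_i - x_j) / h^2.
    term2 = np.einsum('id,ijd->ij', scores, diffs) * K / h ** 2   # s(x_i)^T grad_{x_j} k
    term3 = -np.einsum('jd,ijd->ij', scores, diffs) * K / h ** 2  # s(x_j)^T grad_{x_i} k
    term4 = (d / h ** 2 - sqdists / h ** 4) * K         # trace(grad_{x_i} grad_{x_j} k)
    U = term1 + term2 + term3 + term4

    np.fill_diagonal(U, 0.0)                            # U-statistic: drop i == j terms
    return U.sum() / (n * (n - 1))

# Toy check: samples from N(0, I) against a standard-normal target,
# whose score is simply -x; the estimate should be close to zero.
rng = np.random.default_rng(0)
x = rng.standard_normal((500, 2))
print(ksd_squared(x, score_fn=lambda s: -s))
```

When the samples come from the target itself, the estimate concentrates near zero, which is what makes the KSD usable as a variational objective when only the unnormalized target, and hence its score, is available.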
Appears in Collections: | 統計與數據科學研究所 (Institute of Statistics and Data Science)
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-113-1.pdf (access restricted to NTU campus IP addresses; off-campus users should connect via the NTU VPN service) | 1.59 MB | Adobe PDF | View/Open |
All items in this repository are protected by copyright, with all rights reserved, unless otherwise indicated by their specified copyright terms.