NTU Theses and Dissertations Repository › College of Electrical Engineering and Computer Science › Graduate Institute of Communication Engineering
Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96805
Title: TPA3D: Triplane Attention for Fast Text-to-3D Generation
Authors: Bin-Shih Wu (吳彬世)
Advisor: Yu-Chiang Frank Wang (王鈺強)
Keywords: computer vision, 3D vision, generative AI, text-to-3D generation, attention mechanism
Publication Year: 2025
Degree: Master's
Abstract: Due to the lack of large-scale text-3D correspondence data, recent text-to-3D generation works mainly rely on 2D diffusion models for synthesizing 3D data. Since diffusion-based methods typically require significant optimization time for both training and inference, GAN-based models remain attractive for fast 3D generation. In this work, we propose Triplane Attention for text-guided 3D generation (TPA3D), an end-to-end trainable GAN-based deep learning model for fast text-to-3D generation. With only 3D shape data and their rendered 2D images observed during training, TPA3D retrieves detailed visual descriptions to synthesize the corresponding 3D mesh data. This is achieved by the proposed attention mechanisms on the extracted sentence- and word-level text features. In our experiments, we show that TPA3D generates high-quality 3D textured shapes aligned with fine-grained descriptions while demonstrating impressive computational efficiency.
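The abstract does not detail the attention mechanism itself. As an illustrative sketch only, the following NumPy snippet shows one common form such a mechanism could take: each spatial token of the three feature planes (XY, XZ, YZ) cross-attends over word-level text embeddings and is refined with a residual update. All names, shapes, and projection matrices here are assumptions for illustration, not the thesis's actual architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def word_level_cross_attention(triplane, word_feats, Wq, Wk, Wv):
    """Each spatial token of each plane attends over word-level text features.

    triplane:   (3, H*W, d)   -- three flattened feature planes (XY, XZ, YZ)
    word_feats: (T, d_text)   -- per-word embeddings from a text encoder
    Wq, Wk, Wv: projection matrices mapping into a shared attention dim d
    """
    q = triplane @ Wq                           # queries:  (3, HW, d)
    k = word_feats @ Wk                         # keys:     (T, d)
    v = word_feats @ Wv                         # values:   (T, d)
    scores = q @ k.T / np.sqrt(k.shape[-1])     # (3, HW, T) similarity
    attn = softmax(scores, axis=-1)             # attention over words
    return triplane + attn @ v                  # residual refinement

# Toy shapes for demonstration only.
rng = np.random.default_rng(0)
d, d_text, T, HW = 8, 8, 5, 16
planes = rng.standard_normal((3, HW, d))
words = rng.standard_normal((T, d_text))
Wq = rng.standard_normal((d, d))
Wk = rng.standard_normal((d_text, d))
Wv = rng.standard_normal((d_text, d))
out = word_level_cross_attention(planes, words, Wq, Wk, Wv)
print(out.shape)  # (3, 16, 8)
```

The residual form keeps the original geometric features intact while injecting text-conditioned detail, which is why cross-attention updates of this shape are a common choice for conditioning spatial feature maps on text.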
URI: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/96805
DOI: 10.6342/NTU202401418
Fulltext Rights: Authorized (access limited to campus network)
Embargo Lift Date: 2025-02-22
Appears in Collections: Graduate Institute of Communication Engineering

Files in This Item:
ntu-113-1.pdf (Adobe PDF, 24.63 MB; access limited to NTU IP range)


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
