Please use this Handle URI to cite this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/102235
| Title: | 多模態人機共創框架:自動輪椅舞蹈生成系統 A Multimodal Human-Robot Co-Creation Framework for Autonomous Wheelchair Dancing |
| Author: | 林宥成 Yu-Cheng Lin |
| Advisor: | 陽毅平 Yee-Pien Yang |
| Keywords: | Electric wheelchair, ROS2, Speech recognition, Music information retrieval, Cross-modal control, Generative AI |
| Publication Year: | 2026 |
| Degree: | Master's |
| Abstract: | This thesis develops an electric wheelchair dance system that integrates speech understanding (automatic speech recognition, ASR), musical rhythm analysis (music information retrieval, MIR), and intelligent control, so that users can produce rhythmic and creative movement performances from natural-language and music inputs. The system uses ROS2 as its core communication framework and integrates speech recognition, semantic interpretation with a large language model (LLM), beat analysis, and real-time control modules into a cross-modal closed-loop architecture of "Speech–Semantics–Music–Motion–Control."
In the input layer, the Whisper model transcribes speech, and an LLM interprets the semantic and emotional descriptions to generate a dance specification. The music layer applies beat tracking and tempogram analysis to extract tempo and onset features, which serve as the temporal reference for motion alignment. In the control layer, the ROS2 Nav2 framework combines Adaptive Monte Carlo Localization (AMCL) with odometry sensor fusion to ensure smooth and safe execution of dance trajectories. The final velocity commands (v, ω) are transmitted over RS-485 from the upper-level controller, an Arduino Mega 2560, to the DSP motor driver, which performs closed-loop control of the two wheels.
The thesis also evaluates the perception and control layers quantitatively. The average beat detection latency is approximately 4–5 ms, and the measure dynamic alignment (MDA) reaches 87–93% across different musical styles, indicating precise alignment with the rhythmic structure. Velocity control with acceleration smoothing reduces the velocity jitter ratio (VJR) by approximately 40–60% compared with traditional open-loop velocity control and effectively suppresses instantaneous acceleration spikes. For localization accuracy, AMCL fusion reduces long-distance pose drift from 0.8–1.0 m to 0.05–0.13 m, an improvement of roughly 90% in trajectory stability. In tests with obstacles, the proposed system performs dance movements while avoiding collisions and keeps the dance rhythm consistent, showing better environmental adaptability than traditional velocity control.
In conclusion, this thesis presents a creative and interactive electric wheelchair dance control platform that serves as a cross-domain prototype for assistive technology and performing arts. It also offers a new direction for future assistive devices that combine generative artificial intelligence (AI) with human–robot co-performance. |
| URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/102235 |
| DOI: | 10.6342/NTU202600280 |
| Full-Text Authorization: | Not authorized |
| Full-Text Public Release Date: | N/A |
| Appears in Collections: | Department of Mechanical Engineering |
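
The abstract describes velocity commands (v, ω) that are smoothed by limiting acceleration before being sent to a two-wheel drive. The full text is not publicly available, so the following is only a minimal sketch of that general idea, not the author's implementation. The names `RateLimiter` and `wheel_speeds`, the `track_width` parameter, and all numeric values are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RateLimiter:
    """Limit how fast a commanded value may change (acceleration smoothing)."""
    max_rate: float     # largest allowed change per second (assumed value)
    value: float = 0.0  # last command that was actually issued

    def step(self, target: float, dt: float) -> float:
        """Move the issued command toward `target` by at most max_rate * dt."""
        max_delta = self.max_rate * dt
        delta = max(-max_delta, min(max_delta, target - self.value))
        self.value += delta
        return self.value


def wheel_speeds(v: float, omega: float, track_width: float) -> Tuple[float, float]:
    """Standard differential-drive kinematics.

    Converts body velocity (v in m/s, omega in rad/s) into the linear speeds
    of the left and right wheels in m/s. `track_width` is the distance
    between the two drive wheels in metres (hypothetical parameter).
    """
    half = track_width / 2.0
    return v - omega * half, v + omega * half


# Usage: ramp toward a 1.0 m/s target at no more than 0.5 m/s^2, 10 Hz loop.
limiter = RateLimiter(max_rate=0.5)
ramp = [limiter.step(1.0, dt=0.1) for _ in range(3)]
left, right = wheel_speeds(v=0.5, omega=1.0, track_width=0.6)
```

Clamping the per-cycle change keeps a step change in the target from producing an acceleration spike, which is the kind of effect the reported VJR reduction refers to. A real controller would also bound ω and the absolute speed.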
Files in this item:
| File | Size | Format | |
|---|---|---|---|
| ntu-114-2.pdf | 23.07 MB | Adobe PDF | Not authorized for public access |
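Unless their copyright terms are specifically stated otherwise, all items in this system are protected by copyright, with all rights reserved.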
