Household Robot Utilizing Location Information for Human Activity and Habit Understanding

林子涵; Tzu-Han Lin

Please use this identifier to cite or link to this item: http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/87342

Title:	居家型機器人運用位置訊息進行人類活動與習慣理解 Household Robot Utilizing Location Information for Human Activity and Habit Understanding
Authors:	林子涵 Tzu-Han Lin
Advisor:	傅立成 Li-Chen Fu
Keyword:	Human activity recognition, plan recognition, location estimation, household robot, human habit, robot response
Publication Year :	2023
Degree:	Master's
Abstract:	Rapid advances in technology have opened new frontiers in robotics. Many studies have investigated household robot applications and the problems they may encounter. Over the years, researchers and practitioners have turned their focus to making robots more intelligent. Greater intelligence not only makes a robot's procedures more efficient but also makes the robot more human-like.
For household robots to be more intelligent, they need the ability to understand the surrounding environment and human behavior, and to give necessary and suitable feedback and responses when interacting with a human. In this field, much remains to be investigated and many problems remain unresolved. First, it is crucial for a household robot to be able to understand human activity, so a human activity recognition (HAR) system with high recognition accuracy is desired. Because indoor activities are strongly correlated with the locations where they take place, location information is considered helpful for improving HAR; hence, the robot also needs to be able to recognize locations. However, obtaining a stable and accurate location estimate is challenging when the robot perceives multiple locations in a single image. Second, plan recognition plays a vital role in understanding a sequence of perceived activities. Human habit is an essential factor that affects the order of activities in a sequence, and taking it into account can enhance plan recognition performance. Last but not least, the interaction between robot and user is a major part of robot applications. Based on the knowledge gained from human activity recognition and plan recognition, the robot needs to interact with humans and potentially influence their behavior and decisions, which makes the system more meaningful. In this study, we consider a location estimation method that integrates two different models to obtain the location information used for HAR. One model, ResNet50-Place365, estimates location by processing a single image. The other, a model we created called the location estimator, uses the distances between the human and surrounding objects to determine which location the human is in.
Moreover, we propose a human activity recognition system called the activity-location graph convolutional neural network (AL-GCN), which is based on an adaptive graph convolutional network and incorporates location information to understand human behavior. The model predicts human activity in a household environment by taking the three-dimensional human skeleton and the estimated location as input, achieving better prediction accuracy and robustness. We also propose a plan recognition system that predicts the next activity, the objective, and the plan. To enhance plan recognition performance, we take human habit into account by creating a knowledge base that consists of a plan library, which stores various sequences of activities; a Loc-NextAct Tensor, which is used to predict the next activity; and a Loc-Objective Tensor, which is used to predict the objective. The knowledge base can be updated as new information is obtained, so that it adapts to different users and improves the prediction results. Furthermore, a response module takes the detected activity and the predictions from HAR and plan recognition as inputs and gives advice, warnings, or reminders to the user. Finally, the system is physically deployed on our own robot to execute real-world scenarios. In our experiments, two types of evaluation are conducted: one on datasets and the other on our physical robot in real-world scenarios. In the dataset evaluation, our proposed method that fuses ResNet50-Place365 and the location estimator achieves an accuracy of 92.83%, which is higher than using either model alone. By incorporating location information, our proposed AL-GCN HAR model achieves 94.33% accuracy on the cross-subject evaluation. Moreover, the predictions of our plan recognition system are improved by updating the knowledge base and considering the location information.
As a result, the predictions can achieve high accuracy within the first few observations of an activity sequence. In the real-world experiment, our location estimation model achieves 98% accuracy in estimating the living room. For the AL-GCN model, adding location information improves the accuracy of different activities by 10% to 20%. Finally, our real-world plan recognition results show that prediction accuracy increases significantly with the updated knowledge base. In future work, our system could be integrated with person re-identification (Re-ID) so that the robot can distinguish multiple people and use each person's stored habits for prediction. Moreover, robot responses could take the timing of habits into account to provide more temporally precise responses.
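As a rough illustration of the habit knowledge base described in the abstract, the following Python sketch shows one way a location-conditioned next-activity table could be kept and updated from observations. It is not the thesis's implementation: the class name `HabitKnowledgeBase`, the count-based update, and the example locations and activities are all assumptions made for illustration, and the abstract does not specify how the Loc-NextAct Tensor is actually indexed or updated.

```python
from collections import Counter, defaultdict


class HabitKnowledgeBase:
    """Hypothetical count-based stand-in for a location-conditioned next-activity table."""

    def __init__(self):
        # Maps (location, current activity) to a Counter of observed next activities.
        self.next_counts = defaultdict(Counter)

    def update(self, location, current_activity, next_activity):
        # Record one observed transition so later predictions reflect the user's habit.
        self.next_counts[(location, current_activity)][next_activity] += 1

    def predict_next(self, location, current_activity):
        # Return (most frequent next activity, relative frequency), or None if unseen.
        counts = self.next_counts.get((location, current_activity))
        if not counts:
            return None
        activity, count = counts.most_common(1)[0]
        return activity, count / sum(counts.values())


# Illustrative observations (assumed labels, not taken from the thesis datasets).
kb = HabitKnowledgeBase()
kb.update("kitchen", "cook", "eat")
kb.update("kitchen", "cook", "eat")
kb.update("kitchen", "cook", "wash_dishes")

prediction = kb.predict_next("kitchen", "cook")
unseen = kb.predict_next("living_room", "cook")
```

In this sketch, each new observation changes the counts directly, which mirrors in a simplified way the idea that the knowledge base adapts to a particular user over time.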
URI:	http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/87342
DOI:	10.6342/NTU202300261
Fulltext Rights:	Authorized (open access worldwide)
Embargo Lift Date:	2026-03-01
Appears in Collections:	Department of Electrical Engineering

Files in This Item:

File	Size	Format
ntu-111-1.pdf	9.56 MB	Adobe PDF
