Please use this identifier to cite or link to this item:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/91745
Title: | 使用大型語言模型開發基於網頁介面互動的任務導向對話系統 Developing a task oriented dialogue system with web UI interaction using large language models |
Authors: | 黃仙惠 Hsien-Hui Huang |
Advisor: | 黃乾綱 Chien-Kang Huang |
Keyword: | Large Language Models, Web Test Automation, GUI-TOD system |
Publication Year: | 2024 |
Degree: | Master's |
Abstract: | The slot-filling task-oriented dialogue (SF-TOD) framework predefines task-specific slots and handles user requests through a task-specific slots API. In practice, however, such an API is often unavailable. Prior research has addressed this with the graphical user interface task-oriented dialogue (GUI-TOD) system, which carries out tasks directly on an application's GUI by learning GUI actions such as clicking, scrolling, and text input, with no need for a task-specific slots API. Existing GUI-TOD research, however, has focused mainly on mobile application GUIs; no GUI-TOD system yet targets web GUIs.
This study develops a Task-Oriented Dialogue system with Web UI Interaction using Large Language Models (TOD-WebUII-LLM), which combines large language models (LLMs) with a web test automation framework to implement a GUI-TOD system that interacts with web interfaces. In the implementation, LLMs serve as intelligent agents that comprehend and operate the GUI, while the web test automation framework Playwright executes the web GUI operations. The main contribution is a dialogue system in which specified tasks on the web can be completed automatically through natural-language conversation. The study also defines evaluation criteria for the system and collects 14 test cases as a basis for discussion. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/91745 |
DOI: | 10.6342/NTU202400041 |
Fulltext Rights: | Not authorized |
Appears in Collections: | Department of Engineering Science and Ocean Engineering |
Files in This Item:
File | Size | Format | |
---|---|---|---|
ntu-112-1.pdf (Restricted Access) | 7.58 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.