Please use this Handle URI to cite this document:
http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/8225
Title: | 應用中曆時間標準化於六朝正史對讀 On Normalizing Chinese Calendar and its Application to Aligned Reading of the Standard Histories of the Six Dynasties |
Author: | PEI-WEN FANG 方珮雯 |
Advisor: | 項潔 (Jieh Hsiang) |
Keywords: | the official history of the Six Dynasties, batch capture and marking, time standardization, aligned reading |
Year of publication: | 2020 |
Degree: | Master's |
Abstract: | Time is one of the core elements of humanities research. Once time is anchored, one can cross-read texts and compare how different histories record a specific event or person in the same period. Because each Chinese regime marked time with its own reign titles, Chinese calendar dates must be converted to a consistent dating scheme, such as the Gregorian or Julian calendar, before they can be placed on a standardized time axis. Drawing on existing Chinese-Western calendar conversion tools, this research develops batch time capture and marking algorithms for the standardization process. As a demonstration, it focuses on the Six Dynasties, one of the most fragmented eras in Chinese history, and selects biographies from the ten Standard Histories of the period, converting their Chinese calendar dates into Western calendar dates in batch. The texts marked with time anchors were then converted into DocuXml files and used in the multi-text aligned reading tool of DocuSky, a digital humanities research platform.
This research is divided into three steps: batch time capture and marking, time standardization, and design of the aligned reading tool. Texts in DocuXml format were downloaded from the Chinese Text Project (Ctext) website with existing DocuSky tools, and the required chapters were selected. First, phrases representing time were extracted with regular expressions. Second, eras were identified using the Buddhist Studies Time Authority Database developed by Dharma Drum, and a cross-dynasty discriminant algorithm was designed to identify time phrases that fall within a change of dynasty, with the aim of improving the accuracy of batch marking. Finally, time phrases marked with start and end dates were converted into time anchors for the aligned reading tool. Based on these dates, the start and end dates of each document were estimated using the median absolute deviation, and this information is used to search for texts suitable for aligned reading.
Aligned reading is divided into intra-text and inter-text aligned reading. Intra-text aligned reading refers to the parallel reading of separate biographies in the same history, while inter-text aligned reading compares biographies across more than one history. A time-range aligned reading function was added to the multi-text aligned reading system. Two figures are presented as examples in this thesis, Chen Baxian (Emperor Wu of Chen) and Murong Chui (Emperor Chengwu of Later Yan), in an attempt to piece together the life history of a historical figure from different official histories. Based on the algorithms and the characteristics of the texts, the advantages and limitations of batch time capture, batch time marking, and aligned reading are also discussed.
This research improved the accuracy of the automated batch marking algorithm, connected the official histories of the Six Dynasties through a standardized time axis, and realized cross-text aligned reading with time anchors, which opens up more possibilities for the study of the Six Dynasties. |
URI: | http://tdr.lib.ntu.edu.tw/jspui/handle/123456789/8225 |
DOI: | 10.6342/NTU202003195 |
Full-text authorization: | Authorized (open access worldwide) |
Appears in collections: | Graduate Institute of Networking and Multimedia |
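The abstract describes extracting time phrases with regular expressions, resolving reign eras to Western years, and estimating a document's start and end dates with the median absolute deviation. The following is a minimal sketch of that idea, not the thesis implementation. Everything named here is an assumption made for illustration: the three-entry `ERA_START` table stands in for the Dharma Drum time authority lookup, `cn_to_int`, `extract_years`, and `estimate_range` are hypothetical helpers, the cutoff `k` is arbitrary, and the sample sentence is made up rather than quoted from any history. Year conversion is approximate because lunar and Western year boundaries do not coincide, and the cross-dynasty discriminant step is not modelled.

```python
import re
import statistics

# Illustrative stand-in for an era authority lookup (assumption):
# reign title -> Western year of the era's first year.
ERA_START = {"天監": 502, "太清": 547, "永定": 557}

DIGITS = {"一": 1, "二": 2, "三": 3, "四": 4, "五": 5,
          "六": 6, "七": 7, "八": 8, "九": 9}

# Matches "<era><Chinese numeral or 元>年", e.g. 太清三年.
TIME_PHRASE = re.compile(
    "(" + "|".join(map(re.escape, ERA_START)) + ")"
    + "(元|[一二三四五六七八九十]{1,3})年"
)


def cn_to_int(numeral):
    """Convert a Chinese numeral from 1 to 99 (or 元, meaning 1) to int."""
    if numeral == "元":
        return 1
    if "十" in numeral:
        tens, _, ones = numeral.partition("十")
        return (DIGITS[tens] if tens else 1) * 10 + (DIGITS[ones] if ones else 0)
    return DIGITS[numeral]


def extract_years(text):
    """Return (phrase, approximate Western year) for each era-year phrase."""
    return [
        (m.group(0), ERA_START[m.group(1)] + cn_to_int(m.group(2)) - 1)
        for m in TIME_PHRASE.finditer(text)
    ]


def estimate_range(years, k=3.0):
    """Drop years farther than k * MAD from the median, then take min and max.

    This is one plausible reading of a MAD-based range estimate.
    """
    med = statistics.median(years)
    mad = statistics.median([abs(y - med) for y in years])
    kept = [y for y in years if abs(y - med) <= k * mad] or list(years)
    return min(kept), max(kept)


# Made-up sample sentence, not a quotation from any history.
sample = "天監元年生。太清三年,遷都督。永定二年卒。"
anchors = extract_years(sample)
print(anchors)
print(estimate_range([year for _, year in anchors]))
```

The MAD filter is there so that a single stray date mention, such as a reference to a much earlier dynasty inside a biography, does not stretch the estimated range of the whole document.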
Files in this item:
File | Size | Format | |
---|---|---|---|
U0001-1308202008191400.pdf | 14.95 MB | Adobe PDF | View/Open |
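All items in this system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.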