Cross-document structural relationship identification using supervised machine learning

Bibliographic Details
Main Authors: Kumar, Yogan Jaya; Salim, Naomie; Raza, Basit
Format: Article
Published: 2012
Online Access: http://eprints.utm.my/id/eprint/46756/
https://dx.doi.org/10.1016/j.asoc.2012.06.017
Description
Summary: Multi-document analysis has been a field of interest for decades and remains an active area of research. One application is multi-document summarization, which aims to produce a concise description of the original documents. In this paper, we focus on some special properties of multi-document collections, specifically news articles. Information across news articles reporting on the same story is often related. Cross-document structure theory (CST) defines several relationships between pairs of sentences from different documents. Among them, we focus on four relations, namely “Identity”, “Overlap”, “Subsumption”, and “Description”. Our aim is to identify these CST relationships automatically. We applied three machine learning techniques: SVM, a neural network, and our proposed case-based reasoning (CBR) model. A comparison of these techniques shows that the proposed CBR model yields better results than the other two.