Feature-based POS tagging and sentence relevance for news multi-document summarization in Bahasa Indonesia
Moch Zawaruddin Abdullah, Chastine Fatichah
Abstract
Sentence extraction in news document summarization selects representative sentences primarily by means of a news feature score (NeFS). NeFS identifies meaningful sentences by analyzing the frequency and similarity of phrases, but it neglects grammatical information and the relevance of each sentence to the title. Grammatical information, carried by part of speech (POS), indicates the presence of informative content. POS tagging assigns a meaningful tag to each term based on qualified data and its surrounding words. Sentence relevance to the title measures how closely a sentence is connected to the title in terms of both word-based and meaning-based similarity, primarily for news documents in Bahasa Indonesia. In this study, we present an alternative sentence weighting method that incorporates news features, POS tagging, and sentence relevance to the title, and we use the resulting weights to extract representative sentences. Experimental results on 11 groups of Indonesian news documents are compared with those of the news feature scores with grammatical information method (NeFGIS). The proposed method achieves better results, improving the F-scores of ROUGE-1, ROUGE-2, ROUGE-L, and ROUGE-SU4 by 1.84%, 3.03%, 3.85%, and 2.08%, respectively.
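For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how a ROUGE-1 F-score can be computed from clipped unigram overlap between a candidate summary and a reference summary. It is not the authors' evaluation code: the function name `rouge_1_f`, the lowercasing, the whitespace tokenization, and the example sentences are all assumptions made for illustration, and published ROUGE toolkits may apply additional preprocessing such as stemming.

```python
from collections import Counter


def rouge_1_f(candidate: str, reference: str) -> float:
    """ROUGE-1 F-score from clipped unigram overlap.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Counter intersection keeps the minimum count per token (clipping).
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical candidate and reference summaries in Bahasa Indonesia.
    candidate = "pemerintah menaikkan harga bbm"
    reference = "pemerintah menaikkan harga bahan bakar"
    print(f"ROUGE-1 F-score: {rouge_1_f(candidate, reference):.4f}")
```

ROUGE-2 follows the same pattern with bigrams in place of unigrams, while ROUGE-L and ROUGE-SU4 rely on longest common subsequences and skip-bigrams and are not covered by this sketch.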
Keywords
Indonesian news; Multi-document summarization; News features; POS tagging; Sentence extraction; Sentence relevance
DOI:
https://doi.org/10.11591/eei.v11i1.3275
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Bulletin of Electrical Engineering and Informatics (BEEI), ISSN: 2089-3191, e-ISSN: 2302-9285. This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).