Multiword target-independent transformer-based model for financial sentiment analysis in colloquial Cantonese
Carlin Chun Fai Chu, Raymond So, Ernest Kan Lam Kwong, Andy Chan
Abstract
Tokenization decomposes a multiword instrument name into several tokens, and the transformer attention mechanism handles each token individually, which hinders treatment of the related tokens as a single entity. The presence of multiple instruments in a single message further exacerbates the problem and results in low predictive performance. This study proposes the use of sequentially tagged, target-independent sentinel tokens to encapsulate multiword instrument aspects for fine-tuning a natural language inference model. The encapsulation not only helps the attention mechanism handle an instrument name as a single entity but also enables the model to handle unseen instruments effectively. Our empirical analysis is based on 5,178 manually annotated instrument–sentiment pairs drawn from finance discussion board messages, each of which expresses sentiment toward one to four instruments in a single post. The proposed approach consistently outperformed the direct bidirectional encoder representations from transformers (BERT)-based approach in terms of recall, precision, and F1-score when handling financial commentaries written in colloquial Cantonese. This study demonstrates the potential benefits of target-independent sentinel token encapsulation for natural language inference. The underlying logic of multiword target-independent encapsulation is expected to hold for other languages, including Chinese, Japanese, and Thai.
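The encapsulation idea can be illustrated with a minimal sketch. It is not the authors' implementation. The tag format `[TGT1]`, `[TGT2]`, the function names `encapsulate` and `build_nli_pairs`, the hypothesis wording, and the English sample message are all assumptions made for illustration. The sketch takes one plausible reading of the abstract, in which each instrument name is swapped for a sequentially numbered placeholder that carries no information about the instrument itself, and one premise–hypothesis pair is then formed per instrument.

```python
from typing import Dict, List, Tuple


def encapsulate(message: str, instruments: List[str]) -> Tuple[str, Dict[str, str]]:
    """Swap each instrument name for a sequential, name-agnostic sentinel tag.

    The "[TGTn]" tag format is a hypothetical choice, not taken from the paper.
    """
    # Keep only instruments that occur, ordered by first appearance in the text
    # (longer names win ties so nested names are tagged deterministically).
    present = sorted(
        {name for name in instruments if name in message},
        key=lambda name: (message.find(name), -len(name)),
    )
    mapping = {name: f"[TGT{i}]" for i, name in enumerate(present, start=1)}

    # Replace longer names first so a name nested in another is not clobbered.
    encoded = message
    for name in sorted(mapping, key=len, reverse=True):
        encoded = encoded.replace(name, f" {mapping[name]} ")

    # Collapse the extra whitespace introduced by the padded replacements.
    return " ".join(encoded.split()), mapping


def build_nli_pairs(encoded: str, mapping: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Form one (premise, hypothesis, instrument) triple per tagged instrument.

    The hypothesis template is an assumed example of an NLI-style query.
    """
    return [
        (encoded, f"The sentiment towards {tag} is positive.", name)
        for name, tag in mapping.items()
    ]


if __name__ == "__main__":
    text = "Tencent Holdings looks strong but China Mobile is weak"
    encoded, mapping = encapsulate(text, ["China Mobile", "Tencent Holdings"])
    print(encoded)  # [TGT1] looks strong but [TGT2] is weak
    for premise, hypothesis, name in build_nli_pairs(encoded, mapping):
        print(name, "|", premise, "|", hypothesis)
```

Because the tags are independent of the instrument names, the same placeholders apply to an instrument that never appeared in the training data. In an actual fine-tuning pipeline, the tags would also need to be registered with the tokenizer so that each one stays a single token; that step is omitted here.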
Keywords
Aspect-based opinion mining; Natural language inference; Pretrained language model; Sentinel token; Transformer
DOI:
https://doi.org/10.11591/eei.v14i3.8963
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Bulletin of Electrical Engineering and Informatics (BEEI), ISSN: 2089-3191, e-ISSN: 2302-9285. This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).