Multiword target-independent transformer-based model for financial sentiment analysis in colloquial Cantonese

Carlin Chun Fai Chu, Raymond So, Ernest Kan Lam Kwong, Andy Chan

Abstract


The tokenization process decomposes a multiword instrument name into several tokens, and the transformer attention mechanism handles each token individually, which hinders the treatment of the related tokens as a single entity. The presence of multiple instruments in a single message further exacerbates the complications and results in low predictive performance. This study proposed the use of sequentially tagged target-independent sentinel tokens to encapsulate multiword instrument aspects for natural language inference model fine-tuning. The encapsulation not only helped the attention mechanism to handle an instrument name as a single entity but also enabled the model to handle unseen instruments effectively. Our empirical analysis was based on 5,178 manually annotated instrument–sentiment pairs originating from finance discussion board messages, each of which expressed sentiments on one to four instruments in a single post. The proposed approach consistently outperformed the direct bidirectional encoder representations from transformers (BERT) based approach in terms of recall, precision, and F1-score when handling financial commentaries written in colloquial Cantonese. This study demonstrated the potential benefits of target-independent sentinel token encapsulation for natural language inference. The underlying logic of multiword target-independent encapsulation is expected to hold for other languages, including Chinese, Japanese, and Thai.
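
The encapsulation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the sentinel format `[TGT1]`, `[TGT2]`, the hypothesis template, the label set, and the function names `encapsulate` and `build_nli_pairs` are all assumptions made for illustration, and the sample message is in English for readability. The sketch replaces each multiword instrument name with a sequentially tagged placeholder that does not depend on the instrument itself, then pairs the rewritten message (premise) with one hypothesis per instrument and candidate sentiment.

```python
import re


def encapsulate(message, instruments):
    """Replace each instrument name with a sequentially tagged sentinel token.

    The sentinel format "[TGTn]" is a hypothetical choice, not taken from the paper.
    """
    mapping = {name: f"[TGT{i}]" for i, name in enumerate(instruments, start=1)}
    if not mapping:
        return message, mapping
    # Match longer names first so a short name never splits a longer one,
    # and substitute in a single pass so inserted sentinels are not re-scanned.
    names = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in names))
    encoded = pattern.sub(lambda m: mapping[m.group(0)], message)
    return encoded, mapping


def build_nli_pairs(message, instruments,
                    labels=("positive", "neutral", "negative")):
    """Build (premise, hypothesis, instrument, label) tuples for NLI-style input.

    The hypothesis wording and label set are assumptions for illustration.
    """
    encoded, mapping = encapsulate(message, instruments)
    pairs = []
    for name in instruments:
        token = mapping[name]
        for label in labels:
            hypothesis = f"The sentiment towards {token} is {label}."
            pairs.append((encoded, hypothesis, name, label))
    return pairs


if __name__ == "__main__":
    msg = "Bought Tencent Holdings, sold HSBC Holdings"
    for pair in build_nli_pairs(msg, ["Tencent Holdings", "HSBC Holdings"]):
        print(pair)
```

Because the placeholders carry only a position tag and no instrument identity, the same premise and hypothesis shapes would apply to instrument names never seen during fine-tuning. With a real tokenizer, the sentinel strings would also need to be registered as single vocabulary items so that they are not split into subword pieces.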

Keywords


Aspect-based opinion mining; Natural language inference; Pretrained language model; Sentinel token; Transformer



DOI: https://doi.org/10.11591/eei.v14i3.8963



Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.


Bulletin of Electrical Engineering and Informatics (BEEI)
ISSN: 2089-3191, e-ISSN: 2302-9285
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).