Handling concept drifts and limited label problems using semi-supervised combine-merge Gaussian mixture model

Ibnu Daqiqil Id, Pardomuan Robinson Sihombing, Supratman Zakir

Abstract


When predicting on data streams, changes in the data distribution can degrade model accuracy over time and eventually make the model obsolete. This phenomenon is known as concept drift. Detecting concept drift and adapting to it are critical to maintaining model performance. However, a model can only be adapted if labeled data is available, and labeling is costly and time-consuming because it must be done by humans. In a data stream, only part of the data can be labeled because the data is massive and arrives at high speed. To address both problems simultaneously, we apply a technique that updates the model using both labeled and unlabeled instances. The experimental results show that the proposed method adapts to concept drift using pseudo-labels and maintains its accuracy even when label availability is drastically reduced from 95% to 5%. The proposed method also achieves the highest overall accuracy and outperforms the other methods on 5 of 10 datasets.
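
The general idea of updating a model with both labeled and pseudo-labeled instances can be illustrated with a small sketch. This is not the combine-merge Gaussian mixture model of the paper: it assumes a much simpler classifier with a single 1-D Gaussian per class, a synthetic two-class stream whose class means shift by a fixed amount per chunk, and a fully labeled first chunk. All function names, the chunk size, and the drift rate are hypothetical choices made for illustration only.

```python
import math
import random
from statistics import mean, pstdev


def gaussian_pdf(x, mu, sigma):
    """Density of a 1-D normal distribution at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def refit(groups, old_model=None):
    """Fit (mean, std, prior) per class from {label: [x, ...]}."""
    total = sum(len(xs) for xs in groups.values())
    model = dict(old_model or {})
    for label, xs in groups.items():
        # With fewer than two points, keep the previous estimate (if any).
        # Priors are then only approximate, which is fine for a sketch.
        if len(xs) >= 2:
            model[label] = (mean(xs), max(pstdev(xs), 1e-3), len(xs) / total)
    return model


def predict(model, x):
    """Return the class with the largest prior-weighted likelihood."""
    return max(model, key=lambda c: model[c][2] * gaussian_pdf(x, *model[c][:2]))


def update(model, chunk, label_ratio, rng):
    """Refit on one chunk, using true labels only for a random fraction
    `label_ratio` of instances and the model's own predictions
    (pseudo-labels) for the rest."""
    groups = {c: [] for c in model}
    for x, y in chunk:
        label = y if rng.random() < label_ratio else predict(model, x)
        groups[label].append(x)
    return refit(groups, old_model=model)


def make_chunk(rng, shift, n=200):
    """Synthetic chunk (assumption): class 0 ~ N(shift, 1), class 1 ~ N(5 + shift, 1)."""
    return [(rng.gauss(5.0 * (i % 2) + shift, 1.0), i % 2) for i in range(n)]


def accuracy(model, chunk):
    return mean(1.0 if predict(model, x) == y else 0.0 for x, y in chunk)


def run_demo(seed=0, n_chunks=20, label_ratio=0.05):
    """Compare a never-updated model with a pseudo-label-updated one."""
    rng = random.Random(seed)
    groups = {0: [], 1: []}
    for x, y in make_chunk(rng, shift=0.0):  # first chunk is fully labeled
        groups[y].append(x)
    static = refit(groups)
    adaptive = static
    for t in range(1, n_chunks - 1):  # gradual drift: means move 0.5 per chunk
        adaptive = update(adaptive, make_chunk(rng, shift=0.5 * t), label_ratio, rng)
    final = make_chunk(rng, shift=0.5 * (n_chunks - 1))
    return accuracy(static, final), accuracy(adaptive, final)


if __name__ == "__main__":
    static_acc, adaptive_acc = run_demo()
    print(f"static model:   {static_acc:.2f}")
    print(f"adaptive model: {adaptive_acc:.2f}")
```

In this toy setting the drift is slow relative to the class separation, so most pseudo-labels remain correct and the updated model can follow the stream with only 5% true labels, whereas the model fitted once on the first chunk falls behind. Abrupt drift or overlapping classes would need a more careful scheme, which is the kind of situation the proposed method targets.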

Keywords


Concept drift; Gaussian mixture model; Label propagation; Label scarcity; Semi-supervised learning


DOI: https://doi.org/10.11591/eei.v10i6.3259

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
