Prediction measuring local coffee production and marketing relationships coffee with big data analysis support

Anita Sindar Ros Maryana Sinaga, Ricky Eka Putra, Abba Suganda Girsang


Following the growing enthusiasm for coffee in the Indonesian market, a machine learning model is developed to study the relationships among coffee producers, consumers, production, and the market. The machine learning workflow is constructed in stages: explore, develop, and validate the models. In this research, the model built predicts the production and market of late departure coffee based on labeled and unlabeled variables. The best predictions from the trained machine learning algorithms are a tree model with an accuracy of 85.7%, a support vector machine (SVM) with an accuracy of 82.9%, and k-nearest neighbors with an accuracy of 82.9%. These models produce three categories: high production (2 provinces), medium production (21 provinces), and low production (11 provinces). The classification accuracy is supported by the AUC values obtained for the high, medium, and low classes. In addition, local coffee marketing was modeled with logistic regression, which reached an accuracy of 88.9% in classifying coffee interest between Arabica and Robusta coffee. The AUC value of the logistic regression is about 0.94 for Arabica coffee and 0.92 for Robusta coffee. The analysis of the classification modeling results shows a high level of accuracy of 93.0%.
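The two evaluation measures named in the abstract, accuracy and per-class AUC, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the six toy samples, and the class scores below are hypothetical, and the AUC is computed here with the generic pairwise (Mann-Whitney) definition in a one-vs-rest fashion, which is an assumption about how a three-class AUC might be obtained.

```python
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc_one_vs_rest(y_true: Sequence[str], scores: Sequence[float],
                    positive: str) -> float:
    """AUC of one class against all others.

    Equals the probability that a randomly chosen sample of the positive
    class receives a higher score than a randomly chosen sample of any
    other class (ties count as half a win).
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: production class of six provinces (not from the paper).
y_true = ["high", "medium", "medium", "low", "low", "medium"]
y_pred = ["high", "medium", "low", "low", "low", "medium"]

# Hypothetical classifier scores for each class, one value per sample.
class_scores = {
    "high":   [0.90, 0.20, 0.10, 0.05, 0.10, 0.30],
    "medium": [0.05, 0.70, 0.25, 0.20, 0.30, 0.60],
    "low":    [0.05, 0.10, 0.65, 0.75, 0.60, 0.10],
}

acc = accuracy(y_true, y_pred)
auc_by_class = {
    label: auc_one_vs_rest(y_true, scores, label)
    for label, scores in class_scores.items()
}

print(f"accuracy = {acc:.3f}")
for label, value in auc_by_class.items():
    print(f"AUC ({label} vs rest) = {value:.3f}")
```

Reporting one AUC per class alongside the overall accuracy shows whether a model separates each production category well, which a single accuracy figure can hide when the classes are as unbalanced as 2, 21, and 11 provinces.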


Big data analysis; Coffee production; Machine learning; Modeling; Prediction


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Bulletin of EEI Stats