Extraction of human understandable insight from machine learning model for diabetes prediction

Tsehay Admassu Assegie, Thulasi Karpagam, Radha Mothukuri, Ravulapalli Lakshmi Tulasi, Minychil Fentahun Engidaye


Explaining the reason for model’s output as diabetes positive or negative is crucial for diabetes diagnosis. Because, reasoning the predictive outcome of model helps to understand why the model predicted an instance into diabetes positive or negative class. In recent years, highest predictive accuracy and promising result is achieved with simple linear model to complex deep neural network. However, the use of complex model such as ensemble and deep learning have trade-off between accuracy and interpretability. In response to the problem of interpretability, different approaches have been proposed to explain the predictive outcome of complex model. However, the relationship between the proposed approaches and the preferred approach for diabetes prediction is not clear. To address this problem, the authors aimed to implement and compare existing model interpretation approaches, local interpretable model agnostic explanation (LIME), shapely additive explanation (SHAP) and permutation feature importance by employing extreme boosting (XGBoost). Experiment is conducted on diabetes dataset with the aim of investigating the most influencing feature on model output. Overall, experimental result evidently appears to reveal that blood glucose has the highest impact on model prediction outcome.


Diabetes prediction; Extreme boosting (XGBoost); Interpretable model; LIME; SHAP

Full Text:


DOI: https://doi.org/10.11591/eei.v11i2.3391


  • There are currently no refbacks.

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Bulletin of EEI Stats