Predicting lung cancer risk using explainable artificial intelligence

ABSTRACT


INTRODUCTION
Lung cancer is one of the most common and deadly forms of cancer worldwide. It is estimated that lung cancer accounts for 2.09 million new cases and 1.76 million deaths each year. Early detection and accurate diagnosis of lung cancer are essential for improving patient outcomes and reducing mortality rates [1]. Explainable AI (XAI) is a branch of artificial intelligence (AI) that aims to develop systems that are simple enough for people to understand [2]. XAI is especially significant in healthcare, where gaining the confidence of physicians and patients requires being able to explain how a machine learning model makes its predictions.
Predicting lung cancer risk using XAI requires a machine learning model that can estimate a person's probability of getting lung cancer from various risk factors. These risk factors include personal traits such as age, sex, smoking history, and family history, as well as exposure to environmental toxins [3]. The XAI model used to predict the chance of developing lung cancer would need to be trained on a sizable dataset of lung cancer patients and healthy individuals [4]. The dataset would need to be carefully curated to ensure that it is representative of the general population and contains a diverse range of people with varying risk factors.
Once trained, the XAI model can be used to estimate a person's chance of developing lung cancer based on their risk factors. Additionally, the XAI model would be able to explain how it arrived at its prediction [5].

Bulletin of Electr Eng & Inf
ISSN: 2302-9285
Predicting lung cancer risk using explainable artificial intelligence (Shahin Shoukat Makubhai)
The most significant risk variables that went into the prediction would be listed or visualized in this explanation. Gaining the confidence of physicians and patients requires being able to explain how the XAI model arrived at its forecast [6]. The XAI model can help doctors make informed decisions and give patients tailored suggestions based on their risk factors. It can also help patients make wise choices about their health and take measures to reduce their risk of developing lung cancer.
The use of XAI to predict the likelihood of developing lung cancer is a novel strategy that has the potential to significantly advance cancer diagnosis and therapy [7]. Although using AI in healthcare is not novel, explaining how AI models make decisions can help boost confidence in the outcomes and give healthcare professionals useful insights. Lung cancer is one of the main causes of cancer-related deaths globally and a significant public health concern [8]. Identifying people who are at high risk of getting lung cancer can help with early diagnosis and possibly improve patient outcomes [9]. Although there are a number of established risk factors for lung cancer, including smoking and exposure to toxins in the environment, it is still difficult to identify those who are at high risk.
Using XAI, it may be feasible to develop models that can precisely predict a person's risk of getting lung cancer based on various risk factors, such as age, smoking history, and family history of cancer [10]. Additionally, by understanding the AI model's decision-making process, medical professionals can learn which risk factors are most closely linked to lung cancer, which can help guide prevention and treatment plans [11]. Overall, the application of XAI in predicting lung cancer risk has the potential to significantly improve cancer prevention and treatment, ultimately leading to better patient outcomes [12].
In conclusion, using XAI to forecast lung cancer risk in healthcare is a significant application of artificial intelligence. Using XAI models, we can accurately predict an individual's likelihood of getting lung cancer and explain how the prediction was made. This knowledge can empower doctors and patients to make informed healthcare decisions that may ultimately save lives.
This section reviews related studies. The use of XAI to predict the likelihood of developing lung cancer is an important subject that has received a great deal of attention recently. In this literature review, we explore a range of research papers, journals, and publications on the use of XAI for forecasting the probability of lung cancer development.
The paper entitled "early lung cancer diagnostic biomarker discovery by machine learning methods" by Xie et al. [13] proposes the use of machine learning methods to identify potential biomarkers for early detection of lung cancer. The authors start by outlining the present difficulties and restrictions in lung cancer diagnosis and stress the importance of early detection for better patient outcomes. They then explain the study's methodology, which involved using a variety of machine learning algorithms to analyze gene expression data from lung cancer patients and healthy controls in order to find possible biomarkers. The findings demonstrated that possible biomarkers for early detection of lung cancer could be found by using machine learning techniques. According to the authors, the study offers a hopeful direction for additional investigation and for the creation of early-detection tools for lung cancer.
The paper entitled "biomarkers in lung cancer screening: achievements, promises, and challenges" by Seijo et al. [14] provides an overview of the current state of biomarkers in lung cancer screening. The article starts by emphasizing the significance of early lung cancer detection, which is often challenging because symptoms frequently do not appear until the disease has reached an advanced stage. The authors then go over the drawbacks of conventional detection techniques such as low-dose computerized tomography (CT) scans and chest X-rays. They then introduce the idea of biomarkers, which are quantifiable illness indicators that can be found in biological samples such as blood, urine, or tissue, and outline the various categories of biomarkers, including genetic, epigenetic, and protein biomarkers. The paper then reviews the current state of biomarker research for lung cancer screening, highlighting some of the most promising biomarkers found so far. The authors discuss how using biomarkers in screening could increase the likelihood of early discovery and decrease false positives. The article ends with a discussion of the difficulties confronting the field of lung cancer screening biomarker research. These difficulties include large-scale validation studies, the creation of standardized assays, and the incorporation of biomarkers into existing screening procedures.
The paper entitled "lung cancer prediction by deep learning to identify benign lung nodules" by Heuvelmans et al. [15] aimed to develop a deep learning (DL) model to predict lung cancer and identify benign lung nodules. The authors reviewed the literature on the use of deep learning models in medical imaging, particularly in the diagnosis of lung cancer. According to the study, deep learning models have the potential to increase lung cancer detection and decrease false-positive rates. The research also covered the drawbacks of deep learning models, such as the need for sizable datasets and their lack of interpretability. Overall, the literature review stressed the need for more research in this field and supported the use of deep learning models in lung cancer prediction.
The paper entitled "artificial intelligence in cancer imaging: clinical challenges and applications" by Bi et al. [16] analyzes the potential of artificial intelligence in the field of cancer imaging. The study starts with an overview of the problems currently facing cancer imaging, such as the shortcomings of conventional imaging techniques like CT and MRI and the demand for more precise and effective cancer detection techniques. The survey then examines how AI could revolutionize cancer imaging by enabling more precise and effective methods of diagnosis and treatment. The authors discuss various AI methods, such as computer vision and deep learning, and how they can be used with various cancer imaging approaches, such as mammography and radiomics. The study also covers the difficulties and restrictions associated with applying AI to cancer imaging, including the need for a significant amount of high-quality data and the possibility of bias in AI algorithms. The authors emphasize the importance of collaboration between clinicians and AI researchers to overcome these challenges and develop effective AI-based cancer imaging systems. Overall, the literature survey underscores the potential of AI in revolutionizing cancer imaging and the need for continued research and collaboration to realize this potential.
The paper entitled "bias in data-driven artificial intelligence systems-an introductory survey" by Ntoutsi et al. [17] provides an overview of the issue of bias in AI systems that are trained on large datasets. The paper defines bias in AI and gives examples of how it can appear in various situations, such as when making lending and hiring decisions. It covers several sources of bias in AI, including the nature of the data being used, the algorithms used to process the data, and the societal and cultural contexts in which AI systems are created and implemented. The authors also discuss various strategies that have been put forth to deal with bias in AI, including the use of fairness measures and the creation of ethical standards for AI development. Overall, the article highlights the importance of understanding and addressing bias in AI systems, as these systems increasingly play a role in decision-making processes in a wide range of industries and sectors.
The paper entitled "lung cancer detection and classification by using machine learning and multinomial Bayesian" by Dwivedi et al. [18] explores the application of machine learning and Bayesian methods in the detection and classification of lung cancer. The survey starts by stressing the significance of early lung cancer detection, which can considerably improve patient outcomes. The authors then give an outline of Bayesian and machine learning techniques and discuss how they might be used to identify and categorize lung cancer. After that, the survey examines a number of studies that have used machine learning and Bayesian techniques to identify and categorize lung cancer, including studies that have applied deep learning methods and studies that have applied Bayesian networks. The authors also discuss the benefits and shortcomings of these techniques. The paper concludes with a discussion of potential future directions for research on machine learning and Bayesian methods for lung cancer detection and classification. Although these approaches have shown potential, the authors point out that more study is required before precise models that can be applied in clinical practice can be created.
The paper entitled "a neural network and optimization based lung cancer detection system in CT images" by Venkatesh et al. [19] proposes a system for lung cancer detection using neural networks and optimization techniques. The system analyses lung CT images to extract pertinent features using a convolutional neural network (CNN). The most significant of the extracted features are then prioritised using genetic algorithms, and these are fed into a support vector machine (SVM) classifier for the final classification of the CT image as either cancerous or non-cancerous. The article covers the pre-processing of CT images, the design and training of the CNN, the optimisation of the features, and the implementation of the SVM classifier. The article also includes experimental findings on a publicly accessible dataset that demonstrate the high accuracy of the proposed system in identifying lung cancer in CT images. Overall, the study offers an intriguing method for detecting lung cancer using a fusion of deep learning and optimization techniques, and it offers encouraging findings that may help to advance the creation of improved lung cancer screening techniques.
The paper entitled "explainable machine learning framework for lung cancer hospital length of stay prediction" by Alsinglawi et al. [8] develops a machine learning framework to predict the length of stay for patients with lung cancer in hospitals. The goal of the framework is to be explainable, which means that it can offer concise and understandable justifications for its forecasts, which is crucial for clinical decision-making. The authors reviewed current machine learning methods for predicting hospital lengths of stay as well as explainable machine learning approaches. They found that although machine learning models can predict length of stay with high accuracy, they frequently lack interpretability, which reduces their usefulness in clinical settings. To handle this problem, the authors suggested a hybrid strategy that combines a rule-based model and a machine learning model. The machine learning model is used to predict the length of stay, while the rule-based model is used to produce explanations for the prediction. The rule-based model uses the machine learning model's features to produce explanations that give a precise and understandable account of how the prediction was made. The authors compared the framework to several other machine learning algorithms using a dataset of lung cancer patients from a hospital. The outcomes demonstrated that this framework outperformed the other models in terms of accuracy and also offered simple, understandable justifications for its forecasts.
The paper entitled "explainable machine learning for lung cancer screening models" by Kobylińska et al. [20] is a comprehensive review of recent studies that use machine learning (ML) techniques to develop lung cancer screening models. The authors emphasize the significance of early lung cancer detection and the possibility for ML models to enhance screening efficiency and accuracy. The survey centers on research that makes use of explainable ML methods, which make it possible for clinicians to comprehend how the models make their predictions. The authors explain the various ML algorithms used in the studies, such as deep learning, random forests, and logistic regression, as well as their advantages and disadvantages. They also look at the different feature engineering and feature selection methods applied in these works. The survey contains a thorough discussion of the difficulties involved in creating and testing explainable machine learning models for lung cancer screening, including the lack of readily available data, class imbalance, and interpretability problems. In their concluding paragraph, the authors emphasize the need for additional study to overcome these difficulties and confirm the efficacy of explainable ML models for lung cancer screening.
The paper entitled "an explainable AI-driven biomarker discovery framework for non-small cell lung cancer classification" by Dwivedi et al. [21] presents a literature survey on biomarker discovery and classification of non-small cell lung cancer (NSCLC) using XAI-driven frameworks. The survey highlights the importance of biomarkers in early diagnosis and effective treatment of NSCLC. It also discusses various machine learning algorithms and feature selection techniques used in biomarker discovery for NSCLC classification. The survey further emphasizes the need for XAI-driven frameworks to ensure transparency, interpretability, and reproducibility of the biomarker discovery process. The article concludes with the authors' proposed framework for NSCLC classification using XAI-driven biomarker discovery. Other applications of machine learning and AI techniques in the medical field are reported in [22]-[30].
Table 1 shows the summary of predicting lung cancer risk using XAI. A gap analysis for predicting lung cancer risk using XAI would identify the current state of research and practice in the field and then the areas where growth or development is needed. The following measures could be taken to carry out such an analysis:
− Identify current methods for predicting lung cancer risk: the first step would be to review the existing research on machine learning and conventional statistical models for predicting the risk of developing lung cancer [31], [32]. The key components or risk factors that are usually incorporated into these models (such as age, smoking history, and family history), as well as any drawbacks or difficulties with these methods, would need to be identified.
− Evaluate the interpretability of existing models: being able to explain how a model generates its predictions in a way that is comprehensible and transparent is a crucial component of XAI [33], [34]. The interpretability of current lung cancer risk prediction algorithms should therefore be assessed. This may require evaluating the models' capacity to offer precise and understandable justifications for how they arrived at a given risk score, as well as the consistency of these justifications across cases.
− Identify areas for improvement: the next stage would be to pinpoint areas that require development or improvement based on the analysis of the current models [35], [36]. This could entail identifying additional risk variables that could be incorporated into the models, improving the interpretability of current models, or creating new machine learning techniques that are specifically designed to be more understandable.
− Develop and test new models: once potential areas for improvement have been found, the next step would be to create and test new models for predicting the risk of developing lung cancer [37], [38]. This might entail experimenting with new techniques for interpreting and visualising the findings, incorporating new data sources, or using new machine learning algorithms.
− Evaluate the performance of new models: it is essential to evaluate the performance of the newly developed machine learning models for predicting lung cancer risk by comparing them to existing models [39], [40]. This comparison should consider various critical factors such as precision, interpretability, algorithmic effectiveness, and usability [41]. By analyzing these aspects, we can determine the strengths and weaknesses of each model, identify areas for improvement, and ensure that the new models are reliable and effective tools for predicting an individual's risk of developing lung cancer.
Overall, a gap analysis for lung cancer risk prediction using XAI would entail a thorough analysis of the existing literature, the identification of areas for improvement, and the creation and testing of new models. By thoroughly assessing current methodologies and establishing novel, more comprehensible models, it could be feasible to enhance the accuracy and utility of lung cancer risk prediction, benefiting both patients and healthcare practitioners.

− Model evaluation: evaluate the model's performance using suitable metrics such as accuracy, precision, recall, F1-score, and the confusion matrix. Additionally, the model's explainability can be evaluated by computing feature importance or creating visual explanations.
− Model interpretation: interpret the model to understand the factors that influence lung cancer risk prediction. Explanations can be generated using techniques such as feature importance, local explanations, and global explanations.
− Deployment: deploy the developed model in a setting that allows for practical use. It is crucial to make sure that the deployed model is reliable, secure, and capable of handling new data inputs. Establishing ethical and legal criteria for the model's implementation and use is also crucial.
Through the application of this methodology, it is possible to develop a transparent and easily comprehensible AI model for estimating the probability of lung cancer onset. Such a model can provide valuable support to healthcare practitioners in their decision-making processes, thereby alleviating the challenges posed by this disease. Figure 1 shows the system architecture for predicting lung cancer risk using XAI.
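As an illustration of the model evaluation step listed above, the metrics it names can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch rather than the authors' implementation; the function names and the toy labels are hypothetical, and labels are assumed to be binary with 1 denoting a high-risk case.

```python
from typing import Dict, List


def confusion_counts(y_true: List[int], y_pred: List[int]) -> Dict[str, int]:
    """Count the four confusion-matrix cells for binary labels (1 = high risk)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        else:  # truth == 1 and pred == 0
            counts["fn"] += 1
    return counts


def classification_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Derive accuracy, precision, recall, and F1-score from the confusion matrix."""
    c = confusion_counts(y_true, y_pred)
    total = sum(c.values())
    predicted_pos = c["tp"] + c["fp"]
    actual_pos = c["tp"] + c["fn"]
    accuracy = (c["tp"] + c["tn"]) / total if total else 0.0
    precision = c["tp"] / predicted_pos if predicted_pos else 0.0
    recall = c["tp"] / actual_pos if actual_pos else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels for eight patients (illustrative only, not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
metrics = classification_metrics(y_true, y_pred)
print(counts)
print(metrics)
```

In a screening setting, recall is often the metric to watch most closely, because a false negative corresponds to a missed high-risk patient.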
Figure 1. System architecture for predicting lung cancer risk using XAI
The steps for predicting lung cancer risk using XAI are as follows:
− Data collection: collect relevant data from various sources such as electronic health records, medical imaging, and clinical trials.
− Data preprocessing: clean, normalize, perform feature engineering, and handle missing data in the collected data.
− Feature selection: select the most important features that contribute to the prediction of lung cancer risk.
− Machine learning algorithm: build a machine learning model that can predict the risk of lung cancer using the selected features.
− Explanation generation and visualization: use various XAI techniques to generate and visualize the explanations for the model predictions.
− Explanation evaluation: evaluate the generated explanations for their usefulness, accuracy, and comprehensibility.
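The explanation generation step can be illustrated with permutation feature importance, a common model-agnostic way to obtain a global explanation. The text does not state which XAI technique is used, so this choice is an assumption, and the rule-based `toy_model`, the feature names, and the data below are hypothetical stand-ins for a trained model and a real dataset. The idea is to shuffle one feature at a time and measure how much the model's accuracy drops.

```python
import random
from typing import Callable, List, Sequence


def accuracy(model: Callable[[Sequence[float]], int],
             X: List[List[float]], y: List[int]) -> float:
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(1 for row, label in zip(X, y) if model(row) == label) / len(y)


def permutation_importance(model: Callable[[Sequence[float]], int],
                           X: List[List[float]], y: List[int],
                           n_repeats: int = 20, seed: int = 0) -> List[float]:
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            shuffled = [list(row[:j]) + [value] + list(row[j + 1:])
                        for row, value in zip(X, column)]
            drops.append(baseline - accuracy(model, shuffled, y))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_model(row: Sequence[float]) -> int:
    """Hypothetical stand-in for a trained classifier: flags smokers as high risk."""
    return 1 if row[1] == 1 else 0


# Hypothetical data with columns [age_scaled, smoker]; not taken from any study.
feature_names = ["age_scaled", "smoker"]
X = [[0.2, 0], [0.3, 0], [0.4, 0], [0.3, 0],
     [0.5, 1], [0.6, 1], [0.7, 1], [0.8, 1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

importances = permutation_importance(toy_model, X, y)
for name, score in zip(feature_names, importances):
    print(f"{name}: {score:.3f}")
```

Because the stand-in model ignores age, shuffling the age column leaves accuracy unchanged, while shuffling the smoking column degrades it. A feature with near-zero importance is one the model does not rely on, which is also useful input for the explanation evaluation step.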

RESULTS AND DISCUSSION
XAI can predict the risk of developing lung cancer by using machine learning algorithms to analyse data on lung cancer risk factors and forecast the likelihood of a person getting lung cancer [50], [51]. As a subset of AI, XAI seeks to provide interpretable and transparent models so that people can understand how a model makes predictions.
To forecast lung cancer risk using XAI, you would first need to collect information on the variables that may contribute to the development of lung cancer [52], [53]. These variables might include demographic data such as age, sex, and race, as well as lifestyle variables such as smoking status, secondhand smoke exposure, and occupational hazards [54], [55]. Once the data has been gathered, you can evaluate it and build a predictive model using machine learning algorithms such as logistic regression.
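Logistic regression is a natural fit for this setting because each prediction decomposes into per-feature contributions to the log-odds, which gives a simple local explanation. The following is a minimal sketch under stated assumptions: the feature names, the eight toy records, and the training settings (`lr`, `epochs`) are hypothetical and chosen only for illustration, and the model is fitted with plain batch gradient descent rather than any particular library.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_odds(row: Sequence[float], weights: Sequence[float], bias: float) -> float:
    return bias + sum(w * v for w, v in zip(weights, row))


def fit_logistic(X: List[List[float]], y: List[int],
                 lr: float = 0.5, epochs: int = 2000) -> Tuple[List[float], float]:
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            error = sigmoid(log_odds(row, weights, bias)) - label
            grad_b += error
            for j, value in enumerate(row):
                grad_w[j] += error * value
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_risk(row: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Predicted probability of the positive (high-risk) class."""
    return sigmoid(log_odds(row, weights, bias))


def explain(row: Sequence[float], weights: Sequence[float],
            names: Sequence[str]) -> List[Tuple[str, float]]:
    """Per-feature contribution to the log-odds, largest magnitude first."""
    contributions = [(name, w * v) for name, w, v in zip(names, weights, row)]
    return sorted(contributions, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical records: [age scaled to 0-1, smoker flag, family-history flag].
names = ["age_scaled", "smoker", "family_history"]
X = [[0.2, 0, 0], [0.3, 0, 1], [0.4, 0, 0], [0.3, 0, 0],
     [0.5, 1, 0], [0.6, 1, 1], [0.7, 1, 0], [0.8, 1, 1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

weights, bias = fit_logistic(X, y)
risk_high = predict_risk([0.7, 1, 1], weights, bias)
risk_low = predict_risk([0.25, 0, 0], weights, bias)
explanation = explain([0.7, 1, 1], weights, names)
print(f"high-risk profile: {risk_high:.2f}, low-risk profile: {risk_low:.2f}")
print(explanation)
```

A positive contribution pushes the prediction towards high risk and a negative one pushes it away, so the sorted list can be shown to a clinician alongside the risk score. The coefficients of a toy model like this carry no clinical meaning.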

CONCLUSION
In conclusion, XAI-based risk prediction for lung cancer has the potential to significantly improve early detection and prevention of the disease. By analyzing patient data such as age, gender, smoking history, and other health factors, machine learning algorithms can identify patterns and risk factors that may be difficult for humans to detect. XAI is particularly important in healthcare, as it can provide a transparent interpretation of its predictions, which can help patients and clinicians understand the reasoning behind the algorithm's suggestions. This could lead to increased trust in the technology and better health outcomes.
Nevertheless, it is crucial to emphasize that AI must not serve as a substitute for human judgment and expertise. Instead, it should function as an instrument to enhance and complement clinical decision-making. Furthermore, it is crucial that AI is developed and deployed in an ethical and responsible manner, taking into account issues such as data privacy, bias, and transparency. Hence, interpretable AI for forecasting lung cancer risk holds the promise of being a valuable asset in the battle against lung cancer. It should be applied in tandem with other diagnostic and preventive methods, with a significant emphasis on ethical and responsible implementation.

Table 1. Summary of predicting lung cancer risk using XAI
− Data collection: gather information about lung cancer risk factors, including age, gender, smoking history, exposure to the environment, family history, medical history, and other variables, from reputable sources. The information must be varied, impartial, and population-representative.
− Data preprocessing: cleanse, normalise, and prepare the gathered data by converting it into a predictive model-friendly format. Handling absent or inconsistent data is also crucial.
− Feature selection: determine which elements or factors from the preprocessed data are pertinent and should be incorporated into the predictive model. Different methods, including statistical methods, domain expertise, and machine learning algorithms, can be used for feature selection.
− Model development: developing an XAI model that can accurately predict an individual's probability of developing lung cancer based on selected features is crucial. The model should be transparent, easy to comprehend, and able to provide explanations for its conclusions. Popular classification algorithms such as support vector machines, decision trees, and logistic regression can be used to create this model. The XAI model will take into account relevant features such as age, sex, smoking history, family history, and exposure to toxins from the environment. By carefully selecting and curating these features, we can ensure that the model is robust and effective in predicting an individual's risk of developing lung cancer. The model will be designed to provide clear and concise explanations for its conclusions, allowing healthcare professionals and patients to understand the reasoning behind its predictions. By utilizing a transparent and interpretable algorithm, we can enhance the trust and adoption of the model in the medical field, ultimately leading to better healthcare outcomes for patients.
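The data preprocessing step listed above can be sketched as follows. This is only an illustration: the text does not specify how missing values are handled or how features are normalised, so median imputation and min-max scaling are assumptions, and the column names (`age`, `pack_years`) and the four records are hypothetical.

```python
from statistics import median
from typing import Dict, List, Optional

Record = Dict[str, Optional[float]]


def impute_median(records: List[Record], columns: List[str]) -> List[Record]:
    """Return copies of the records with missing (None) values set to the column median."""
    cleaned = [dict(record) for record in records]
    for col in columns:
        observed = [r[col] for r in records if r.get(col) is not None]
        fill = median(observed) if observed else 0.0
        for record in cleaned:
            if record.get(col) is None:
                record[col] = fill
    return cleaned


def min_max_normalise(records: List[Record], columns: List[str]) -> List[Record]:
    """Return copies of the records with each column rescaled to the range [0, 1]."""
    scaled = [dict(record) for record in records]
    for col in columns:
        values = [r[col] for r in records]
        low, high = min(values), max(values)
        span = high - low
        for record in scaled:
            # A constant column carries no information, so map it to 0.0.
            record[col] = (record[col] - low) / span if span else 0.0
    return scaled


# Hypothetical patient records with missing entries (illustrative only).
raw = [
    {"age": 45, "pack_years": 10},
    {"age": None, "pack_years": 30},
    {"age": 65, "pack_years": None},
    {"age": 55, "pack_years": 0},
]
columns = ["age", "pack_years"]

imputed = impute_median(raw, columns)
scaled = min_max_normalise(imputed, columns)
for record in scaled:
    print(record)
```

Imputation has to run before scaling, since the minimum and maximum cannot be computed over missing values. In practice, the imputation and scaling statistics would be estimated on the training split only and then reused for new patients, to avoid leaking information from the test data.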

Table 2. Comparison of datasets for predicting lung cancer risk using XAI