Bulletin of Electrical Engineering and Informatics

Received May 25, 2022; Revised Aug 18, 2022; Accepted Sep 3, 2022

In the last few years, medical techniques that use artificial intelligence tools have developed rapidly, especially in the field of diagnosis. One essential task is brain tumor (BT) detection and diagnosis. This kind of disease requires an expert physician to decide on the treatment or surgical operation based on magnetic resonance imaging (MRI) images; researchers therefore focus on the analysis and understanding of such medical images to help the specialist make a decision. In this work, a new environment has been investigated based on the deep learning method and the distributed federated learning (FL) algorithm. The proposed model has been evaluated with cross-validation techniques on two different standard datasets, BT-small2c and BT-large-3c. The achieved classification accuracy was 0.82 and 0.96, respectively. The proposed classification model provides an active and effective system for assessing BT classification with high reliability and accurate clinical findings.


INTRODUCTION
Clustering is a popular exploratory data analysis tool for gaining an understanding of data structure. It is the task of identifying subgroups in data so that data points within the same subgroup (cluster) are very similar while data points in different clusters are very dissimilar. In other words, we strive to discover homogeneous subgroups within the data so that data points in each cluster are as similar as possible according to a similarity measure such as Euclidean-based distance or correlation-based distance [1], [2]. The critical concerns in clustering are: which similarity metric should be used, how many clusters may be found in the data, which clustering method is the "best", how algorithmic parameters should be chosen, and whether the individual clusters and partitions are correct [3].
K-means is one of the most widely used clustering algorithms because of its speed and simplicity [4]. It has been used in different fields [5], [6]. It is an iterative technique that attempts to split a dataset into k separate non-overlapping subgroups (clusters) [7], such that each data point belongs to only one cluster. It attempts to make intra-cluster data points as similar as possible while keeping the clusters as distinct (far apart) as possible. It assigns data points to clusters in such a way that the sum of the squared distances between the points and their cluster's centroid (the arithmetic mean of all the data points in that cluster) is as small as possible [8].
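The iterative procedure described above can be sketched as follows. This is a minimal illustration of the standard assignment and update steps, not the implementation used in this paper; the function names (`squared_distance`, `mean_point`, `kmeans`), the `seed` and `max_iter` parameters, and the choice to keep the old centroid when a cluster becomes empty are assumptions made for the example.

```python
import random

def squared_distance(p, q):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    """Arithmetic mean (centroid) of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def kmeans(points, k, seed=0, max_iter=100):
    """Basic k-means sketch; returns (labels, centroids, sse)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initial centroids (assumption)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = mean_point(members)
    # Objective: sum of squared distances to the assigned centroid.
    sse = sum(squared_distance(p, centroids[lab])
              for p, lab in zip(points, labels))
    return labels, centroids, sse
```

For example, `kmeans([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)], k=2)` separates the three points near the origin from the three points near (10, 10).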
The less variance there is within clusters, the more homogeneous (similar) the data points are. If the clusters have a spherical-like shape, the K-means method is good at capturing the data structure, because it always tries to build a spherical shape around the centroid. Consequently, as soon as the clusters have complex geometric shapes, K-means fails to cluster the data [9]. In addition, the number of clusters must be predefined, the method cannot deal with noisy data or outliers, and it is not suited to detecting clusters with non-convex forms [1], [8]. Finally, the outcome depends on the initial centroids.
In terms of consistency and quality, a clustering ensemble tries to integrate numerous clustering models to provide a better outcome than the individual clustering algorithms [10], [11]. It refers to a situation in which a number of different runs have produced different clusterings of a particular dataset, and the goal is to find a single (consensus) clustering [12]. Most existing ensemble methods try to obtain the clustering result that is most consistent with the base clusterings, because "accuracy" in clustering does not have a clear meaning when the task is unsupervised [13]. The term "three-way decision" refers to a group of efficient methods and heuristics employed in human problem solving and information processing. As an application of three-way decision to clustering, three-way clustering represents a cluster by a core region and a peripheral (fringe) region [10], [14], [15]. The core region provides a pure clustering of the objects and can therefore be used to improve the clustering. For this reason, it is proposed to merge it with the K-means algorithm in order to improve the algorithm and reduce its sensitivity to random initial centroids. This hypothesis is evaluated in this paper through a set of experiments.

METHOD
The work in this paper is based on two families of methods: traditional clustering with the k-means algorithm and ensemble clustering. They are combined in the proposed work in order to achieve better performance.

K-means algorithm
Clustering is the unsupervised classification of patterns into groups (clusters) [16]. The most well-known and most often used clustering technique is the k-means algorithm, and several k-means extensions have been proposed in the literature. The k-means technique and its extensions are always affected by initialization and require the number of clusters a priori [17], even though k-means is regarded as an unsupervised approach to clustering in pattern recognition and machine learning. In other words, the k-means algorithm is not quite an unsupervised clustering technique [1], [17]. Despite its widespread use, the algorithm has certain drawbacks, including randomly initialized centroids that lead to unpredictable convergence [1], [18]. Therefore, when the algorithm is run multiple times, different clustering results can be obtained each time, depending on the initial centroids. Different solutions have been proposed to solve these problems [18], [19].
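The dependence on the initial centroids can be illustrated with a small constructed example. The sketch below is not taken from the paper's experiments: the four rectangle-corner points, the function name `kmeans_from`, and the two sets of starting centroids are all hypothetical, chosen so that the outcome is deterministic. The same data and the same k converge to two different partitions with different sums of squared errors.

```python
from math import dist

def kmeans_from(points, init_centroids, max_iter=100):
    """k-means started from explicit centroids; returns (labels, sse)."""
    centroids = [tuple(c) for c in init_centroids]
    k = len(centroids)
    labels = None
    for _ in range(max_iter):
        # Assign every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Recompute each centroid as the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(
                    sum(c) / len(members) for c in zip(*members)
                )
    sse = sum(dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, sse

# Corners of a 4 x 2 rectangle (hypothetical data).
corners = [(0, 0), (4, 0), (0, 2), (4, 2)]

# Start A: one centroid on the left edge, one on the right edge.
labels_a, sse_a = kmeans_from(corners, [(0, 1), (4, 1)])
# Start B: one centroid on the bottom edge, one on the top edge.
labels_b, sse_b = kmeans_from(corners, [(2, 0), (2, 2)])
```

Start A groups the points into left and right pairs, while start B groups them into bottom and top pairs and stays there with a larger error, because no single reassignment improves it.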

Cluster ensemble
Cluster ensemble techniques seek to develop stronger and more resilient clustering solutions by combining information from several data partitions [20]. In other words, they seek to integrate various clustering models in order to create a superior outcome [18]. The ensemble technique was initially developed and extensively researched in supervised learning. Because of its effectiveness in classification problems, researchers have sought during the last decade or so to adapt the same paradigm to unsupervised learning, specifically to clustering, for two reasons [11]: i) there is usually no prior information about the underlying structure or about any specific features that we wish to uncover, so by forcing a certain structure onto the data, different clustering algorithms may generate different clustering results for the same data; ii) no single clustering method works consistently well across problems, and there are no clear rules to follow when choosing a clustering algorithm for a specific problem.

Three-way method
Hard clustering uses a two-way decision to produce a cluster: an object either belongs to the cluster or it does not. Uncertain data, however, need a richer representation. The three-way method is based on three decisions and gives more than a single region for each cluster [21]. Three-way decision states that "according to the positive, boundary, and negative regions of a set, one can make a three-way decision: accept, abstain and reject" [22]. Accordingly, it can be regarded as a family of efficient methods and heuristics widely used for solving and processing decision-making problems [22]. Some basic facts regarding three-way clustering follow. Suppose that C = {C1, …, Ck} is a family of clusters of the universe V = {υ1, …, υn}. Three-way clustering uses a pair of sets, the core region and the fringe region, to represent a three-way cluster Ci [21].
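The pair-of-sets representation can be sketched as a small data structure. This is only an illustration under assumptions: the class name `ThreeWayCluster`, the field names `core` and `fringe`, and the mapping of core, fringe, and everything else to "accept", "abstain", and "reject" are chosen here to mirror the quoted three-way decision, and are not notation taken from [21] or [22].

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ThreeWayCluster:
    """A cluster represented by a pair of disjoint sets of objects."""
    core: frozenset    # objects that definitely belong to the cluster
    fringe: frozenset  # objects that may belong to the cluster

    def decide(self, obj):
        """Three-way decision for one object with respect to this cluster."""
        if obj in self.core:
            return "accept"
        if obj in self.fringe:
            return "abstain"
        return "reject"  # the object lies outside both regions

# Hypothetical cluster over objects identified by integers.
cluster = ThreeWayCluster(core=frozenset({1, 2}), fringe=frozenset({3}))
decisions = [cluster.decide(obj) for obj in (1, 3, 4)]
```

A hard (two-way) cluster is the special case in which the fringe is empty.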

Measures of evaluation
Clustering assessment, also known as cluster validity, is a key procedure for assessing how well a learning technique finds meaningful groupings. A good cluster quality measure helps to compare different clustering methods and to judge whether one approach is preferable to another [21]. To evaluate the performance of the algorithm, we used:
a. Davies-Bouldin index [24], [25] (DB hereafter), for which a lower value is better.
b. Average Silhouette index [22] (AS hereafter), for which a higher value is better.
c. Accuracy (ACC hereafter), for which a higher value is better.
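The three measures can be sketched in plain Python as follows. The Davies-Bouldin and silhouette functions follow the usual textbook definitions with Euclidean distance. The paper does not spell out how ACC is computed, so the version here is an assumption: the fraction of objects labelled correctly under the best one-to-one mapping of cluster labels to class labels, found by brute force and requiring at most as many clusters as classes. All function names are hypothetical.

```python
from itertools import permutations
from math import dist

def _groups(points, labels):
    """Map each cluster label to the list of its member points."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    return groups

def _centroid(members):
    return tuple(sum(c) / len(members) for c in zip(*members))

def davies_bouldin(points, labels):
    """Davies-Bouldin index (lower is better); needs at least 2 clusters."""
    groups = _groups(points, labels)
    cents = {lab: _centroid(m) for lab, m in groups.items()}
    # Scatter: mean distance of the members to their own centroid.
    scatter = {lab: sum(dist(p, cents[lab]) for p in m) / len(m)
               for lab, m in groups.items()}
    total = 0.0
    for i in groups:
        total += max((scatter[i] + scatter[j]) / dist(cents[i], cents[j])
                     for j in groups if j != i)
    return total / len(groups)

def average_silhouette(points, labels):
    """Mean silhouette width (higher is better); needs at least 2 clusters."""
    groups = _groups(points, labels)
    scores = []
    for p, lab in zip(points, labels):
        own = groups[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(dist(p, q) for q in m) / len(m)
                for other, m in groups.items() if other != lab)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

def clustering_accuracy(true_labels, pred_labels):
    """ACC under the best one-to-one label mapping (assumed definition).

    Brute force over permutations, so only suitable for a few clusters;
    assumes the number of clusters does not exceed the number of classes.
    """
    true_ids = sorted(set(true_labels))
    pred_ids = sorted(set(pred_labels))
    best = 0
    for perm in permutations(true_ids, len(pred_ids)):
        mapping = dict(zip(pred_ids, perm))
        hits = sum(mapping[p] == t for t, p in zip(true_labels, pred_labels))
        best = max(best, hits)
    return best / len(true_labels)
```

The label mapping in `clustering_accuracy` is needed because cluster identifiers are arbitrary: a clustering that swaps the names of two otherwise correct clusters should still score 1.0.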

PROPOSED ALGORITHM
The proposed algorithm, shown in Algorithm 1, is based on merging the three-way technique with the K-means algorithm. This is done in several steps. First, traditional k-means clustering is run multiple (m) times with different initial centroids. Each run is given new initial centroids and therefore produces a different result. As a result, there are m different clusterings, and each object in the data is a member of m clusters. These clusters are then passed to the three-way ensemble technique, which constructs the "core" by intersecting the objects' clusters from the different runs. The core region contains the purely clustered objects and the fringe region contains the other objects, as shown in Figure 1.
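The ensemble step described above can be sketched as follows; this is an illustrative reading of the description, not a reproduction of Algorithm 1. It assumes that the m runs are already available as label lists with labels 0..k-1, that labels are aligned to the first run by the best-overlap permutation (brute force, so only practical for small k), and that the fringe of a cluster is the union of its members over all runs minus the core. The names `align_labels` and `three_way_ensemble` are hypothetical.

```python
from itertools import permutations

def align_labels(reference, labels, k):
    """Rename the clusters in `labels` to agree best with `reference`."""
    best_perm, best_hits = None, -1
    for perm in permutations(range(k)):
        hits = sum(perm[lab] == ref for lab, ref in zip(labels, reference))
        if hits > best_hits:
            best_perm, best_hits = perm, hits
    return [best_perm[lab] for lab in labels]

def three_way_ensemble(runs, k):
    """Build (core, fringe) index sets for each of the k clusters.

    runs: list of m label lists, one per k-means run.
    """
    reference = runs[0]
    aligned = [reference] + [align_labels(reference, r, k) for r in runs[1:]]
    n = len(reference)
    clusters = []
    for c in range(k):
        # Members of cluster c in every aligned run.
        member_sets = [{i for i in range(n) if run[i] == c}
                       for run in aligned]
        core = set.intersection(*member_sets)    # in cluster c in all runs
        fringe = set.union(*member_sets) - core  # in cluster c in some runs
        clusters.append((core, fringe))
    return clusters

# Three hypothetical runs over six objects; run 2 only renames the
# clusters, run 3 moves object 2 to the other cluster.
runs = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
]
result = three_way_ensemble(runs, k=2)
```

In this example object 2 is the only one on which the runs disagree, so it ends up in the fringe of both clusters while all other objects fall into a core.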

RESULTS AND DISCUSSION
The practical tests in this paper were carried out on popular datasets taken from the UCI Machine Learning Repository. The details of these datasets (samples, attributes, and classes) are listed in Table 1; they are used for the clustering task. The proposed algorithm was tested by running both the traditional k-means algorithm and the ensemble k-means algorithm and computing the metrics (DB, AS, ACC) for each. For each dataset, three experiments were performed to enable a comparison between the traditional k-means and the ensemble k-means: the best k-means performance, the average k-means performance, and the performance of the ensemble k-means. From Tables 2-4, an improvement can be noticed in the performance of the core region compared with the best and the average performance of the traditional K-means algorithm, with lower values for the metrics (AS, DB) and higher values of ACC. This is due to the exclusion of the elements in the fringe region. The results of the runs were first aligned by matching the cluster names, that is, by unifying the cluster labels; then, by intersecting the clusters, the most closely related objects in each cluster were identified (core region), and the marginal elements that usually lie on the cluster boundary were isolated (fringe region). By excluding the marginal objects, it became clear that the results could be improved.

CONCLUSION
We applied the three-way clustering ensemble method, after modifying its algorithm, to improve the results obtained from running the K-means algorithm several times. The results of the ensemble K-means show improved performance. This is a good step toward further related work: the method could be extended by resetting the centroids and then assigning new elements that arrive in the dataset by measuring the distance between them and the generated centroids, without the need to repeat the whole process.
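The assignment of newly arriving elements mentioned above could look like the following minimal sketch. It assumes Euclidean distance and that the centroids are available as coordinate tuples; the function name `assign_to_nearest` and the example centroids are hypothetical.

```python
from math import dist

def assign_to_nearest(new_point, centroids):
    """Return the index of the centroid closest to `new_point`."""
    return min(range(len(centroids)),
               key=lambda j: dist(new_point, centroids[j]))

# Hypothetical centroids produced by an earlier clustering.
centroids = [(0.0, 0.0), (10.0, 10.0)]
new_labels = [assign_to_nearest(p, centroids) for p in [(1, 2), (9, 8)]]
```

Only one distance per centroid is computed for each new element, so the existing clustering does not have to be recomputed.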

Table 2. Bank and forest datasets performances

Table 3. Seeds and sonar datasets performances

Table 4. Wine dataset performances