Levenberg-Marquardt backpropagation neural network with Tchebychev moments for face detection

Received Mar 12, 2020; Revised Apr 29, 2021; Accepted Jul 23, 2021

Face detection is an intelligent approach, used in a variety of applications, that identifies human faces in digital images. This work presents a new face detection method that combines a neural network with the Tchebychev transform. For feature extraction, a discrete Tchebychev transform was applied for different sampling patterns, with several samples taken from color images. A Levenberg-Marquardt backpropagation neural network was then applied to the transformed image to find faces in the Face Detection Data Set and Benchmark (FDDB) database. Model performance was measured by accuracy, and the best result of the proposed method was 98.9%. Simulation results showed that the proposed method handles face detection efficiently.


INTRODUCTION
Face detection is a computer vision task that locates human faces in an image and reports the regions they occupy. It captures facial features and disregards everything else, such as bodies, houses, vehicles, and plants. Face detection can be regarded as a special case of object-class detection, where the task is to find the locations and sizes of all objects of a given class in a scanned or digitized image; other examples of such classes include upper torsos, pedestrians, and vehicles. Locating and detecting the human face is the first step in face-related image applications, and it is important for face filtering, tracking, and recognition [1]-[3]. Face detection is difficult for three basic reasons. First, although most faces consist of similar facial features arranged in roughly the same spatial configuration, there can be considerable variation in shape and texture between faces. These sources of variability are largely due to the inherent differences in "facial appearance" between people [4]-[6]. Second, detection is complicated by features that are not intrinsic to the face, such as glasses or a mustache, which may be present or entirely absent. When present, such features can obscure other basic features of the face (for example, glare on glasses can hide the eyes) and vary in appearance themselves (for example, glasses come in a wide range of designs) [7]. This further widens the range of facial patterns that a comprehensive face detection framework must handle. Third, unpredictable imaging conditions in an unconstrained environment can interfere with face detection. Because faces are essentially three-dimensional structures, a change in the light source can cast or remove significant shadows on a particular face, which introduces even more variation in 2D facial patterns.

Many types of moments can be used for feature extraction. Hu (1962) [8] introduced moment invariants and proposed a method for deriving them using algebraic techniques, which he used to construct a set of invariants. However, geometric moments are not derived from a family of orthogonal functions and are sensitive to noise, especially for higher-order moments [9]. Hu's moment invariants therefore have limited applications. Many works in the literature have introduced other moment-based invariants, for example, the Zernike moment [10], pseudo-Zernike moment [11], and Legendre moment [12]. However, in these approaches the detection accuracy deteriorates because of the discrete approximation of the continuous integrals [13]. To resolve these problems, Mukundan proposed discrete orthogonal Tchebychev moments [14]. Once features have been extracted, a classification method is needed. Neural networks are currently among the most popular artificial intelligence algorithms for classification, and over time they have been reported to outperform other algorithms in accuracy and speed. Variants include neural networks [15], convolutional neural networks (CNN) [16]-[18], and deep learning [19], [20]. The techniques above rely on extracting specific features followed by classification, but there are also many strategies that detect faces by identifying local facial components and grouping them using statistical and geometric models of the human face [21]-[23]. These strategies first apply low-level analysis to segment visual features using image properties, e.g., edges, intensity, color, motion, or generalized measures.
The paper is organized as follows: section 2 summarizes the theoretical background, section 3 presents the proposed technique, section 4 presents the results and discussion of applying the proposed technique to face detection, and section 5 concludes.

BACKGROUND
This section presents the techniques used in the proposed method.

Discrete Tchebychev moments
There are many discrete orthogonal moments, for example, the Tchebychev moments. Using discrete orthogonal Tchebychev polynomials as the basis functions for image moments eliminates the discrete approximation associated with continuous moments. In this work, Tchebychev moments are used to transform the geodesic image rather than the image itself. The normalized discrete Tchebychev polynomials are computed with the associated recurrence relation given in [13]. The 2D normalized discrete Tchebychev moments $T_{mn}$ of an $N \times N$ generalized geodesic transform $d_x = [d_x(i, j)]$ of the image $x(i, j)$ are given by:

$$T_{mn} = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} p_m(i)\, p_n(j)\, d_x(i, j)$$

with $m, n = 0, \ldots, N-1$, where $p_m(i)$ is the $m$-th normalized discrete Tchebychev polynomial in the variable $i$.
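The moment computation can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the recurrence coefficients below are the commonly used three-term recurrence for orthonormal discrete Tchebychev polynomials, which is assumed here because the equations of the paper are not reproduced in the text. The function names `tchebychev_basis` and `tchebychev_moments` are hypothetical.

```python
import numpy as np

def tchebychev_basis(N, order):
    """Rows are orthonormal discrete Tchebychev polynomials p_0..p_{order-1} on i = 0..N-1.

    Assumption: the standard three-term recurrence in the polynomial degree n.
    """
    if not 1 <= order <= N:
        raise ValueError("order must satisfy 1 <= order <= N")
    x = np.arange(N, dtype=float)
    P = np.zeros((order, N))
    P[0] = 1.0 / np.sqrt(N)
    if order > 1:
        P[1] = (2.0 * x + 1.0 - N) * np.sqrt(3.0 / (N * (N**2 - 1.0)))
    for n in range(2, order):
        c = np.sqrt((4.0 * n**2 - 1.0) / (N**2 - n**2))
        a1 = (2.0 / n) * c
        a2 = ((1.0 - N) / n) * c
        a3 = (((1.0 - n) / n)
              * np.sqrt((2.0 * n + 1.0) / (2.0 * n - 3.0))
              * np.sqrt((N**2 - (n - 1.0)**2) / (N**2 - n**2)))
        P[n] = (a1 * x + a2) * P[n - 1] + a3 * P[n - 2]
    return P

def tchebychev_moments(d, order):
    """T[m, n] = sum_i sum_j p_m(i) p_n(j) d[i, j] for m, n < order."""
    d = np.asarray(d, dtype=float)
    N = d.shape[0]
    if d.shape != (N, N):
        raise ValueError("d must be a square N x N array")
    P = tchebychev_basis(N, order)
    return P @ d @ P.T
```

For a 32×32 window, `tchebychev_moments(window, 12).ravel()` yields 144 values, which matches the 12-by-12 low-order block and the 144 input features reported for the proposed method, assuming the low-pass area is the top-left block of the moment matrix.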

Levenberg-Marquardt backpropagation neural network
The backpropagation architecture follows the design of the backpropagation algorithm [24]. Backpropagation was created by generalizing the Widrow-Hoff learning rule to multi-layer networks and nonlinear differentiable transfer functions. Input vectors and the corresponding target vectors are used to train a network until it can approximate a function, associate input vectors with specific output vectors, or classify input vectors as defined in this study. A network with biases, a sigmoid layer, and a linear output layer is capable of approximating any function with a finite number of discontinuities. Computation proceeds in two passes, forward and backward. In the forward pass, the feed-forward network is created, its weights are initialized, and the network is simulated and trained; in the backward pass, the weights and biases of the network are updated. See [24] for further details.

The Levenberg-Marquardt algorithm was designed to approach second-order training speed without computing the Hessian matrix. When the performance function has the form of a sum of squares, as is typical in training feed-forward networks, the Hessian matrix can be approximated as

$$H = J^T J$$

and the gradient can be computed as

$$g = J^T e$$

where $J$ is the Jacobian matrix containing the first derivatives of the network errors with respect to the weights and biases, and $e$ is a vector of network errors. The Jacobian matrix can be computed through a standard backpropagation procedure (see [25]), which is much less complex than computing the Hessian matrix. The Levenberg-Marquardt algorithm uses this approximation to the Hessian matrix in the Newton-like update

$$x_{k+1} = x_k - [J^T J + \mu I]^{-1} J^T e$$

When the scalar $\mu$ is zero, this is just Newton's method using the approximate Hessian matrix. When $\mu$ is large, it becomes gradient descent with a small step size. Newton's method is faster and more accurate near an error minimum, so the aim is to shift toward Newton's method as quickly as possible. Consequently, $\mu$ is decreased after each successful step (a reduction in the performance function) and is increased only when a tentative step would increase the performance function. In this way, the performance function is reduced at every iteration of the algorithm [24].
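The update rule and the adaptation of μ described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the training routine used in the paper: the function name `lm_train`, the initial μ, and the decrease and increase factors (0.1 and 10) are hypothetical choices, and the residual and Jacobian are supplied by the caller rather than obtained from a neural network by backpropagation.

```python
import numpy as np

def lm_train(residuals, jacobian, w0, mu=1e-2, mu_dec=0.1, mu_inc=10.0,
             max_epochs=100, mu_max=1e10):
    """Minimize the sum of squared residuals with Levenberg-Marquardt steps.

    residuals(w) returns the error vector e; jacobian(w) returns J = de/dw.
    """
    w = np.asarray(w0, dtype=float)
    e = residuals(w)
    sse = float(e @ e)
    for _ in range(max_epochs):
        J = jacobian(w)
        H = J.T @ J              # approximation of the Hessian
        g = J.T @ e              # gradient of the performance function
        while mu <= mu_max:
            step = np.linalg.solve(H + mu * np.eye(w.size), g)
            w_new = w - step
            e_new = residuals(w_new)
            sse_new = float(e_new @ e_new)
            if sse_new < sse:    # successful step: accept it and decrease mu
                w, e, sse = w_new, e_new, sse_new
                mu *= mu_dec
                break
            mu *= mu_inc         # tentative step failed: increase mu and retry
        else:
            break                # mu exceeded mu_max, stop training
    return w, sse

# Hypothetical usage: fit a line y = a * t + b.
t = np.linspace(0.0, 1.0, 10)
y = 3.0 * t + 1.0
w_fit, sse_fit = lm_train(
    lambda w: w[0] * t + w[1] - y,
    lambda w: np.stack([t, np.ones_like(t)], axis=1),
    w0=[0.0, 0.0],
)
```

Because a step is accepted only when it lowers the sum of squared errors, the performance function never increases between epochs, which is the property the text attributes to the algorithm.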

Face image dataset
The Face Detection Data Set and Benchmark (FDDB) is a dataset of face regions designed for studying the problem of unconstrained face detection. It contains annotations for 5171 faces in a set of 2845 images taken from the Faces in the Wild dataset [26]. In this work, 5000 human faces are used across three stages: training, validation, and testing.

Performance measurement
The proposed technique is evaluated using precision, accuracy, recall, and F1_Score. These measures depend on four counts: true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN).
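For reference, the standard definitions of these measures in terms of the four counts can be written as a short function. The function name `detection_metrics` is hypothetical, and returning 0.0 when a denominator is zero is an assumed convention.

```python
def detection_metrics(tp, tn, fp, fn):
    """Standard classification measures computed from the confusion counts."""
    def ratio(num, den):
        # Assumed convention: report 0.0 when the denominator is zero.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        "f1_score": ratio(2 * precision * recall, precision + recall),
    }
```

For example, `detection_metrics(40, 45, 5, 10)` describes 100 windows of which 85 are classified correctly.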

THE PROPOSED METHOD
This section describes the design and implementation of the face detection system. The overall diagram of the proposed system is shown in Figure 1.

Tchebychev-neural network
Tchebychev moments and the Levenberg-Marquardt backpropagation (LMBP) neural network are suitable tools for image decomposition and classification, respectively. In this work, Tchebychev moments are combined with the LMBP neural network with the aim of obtaining better detection than other transforms or a traditional CNN. The Tchebychev moment transform extracts distinctive features from the original image. The LMBP neural network then selects the best of these features and provides strong face detection performance. The Tchebychev-neural network (TNN) classification is performed on windows smaller than the original image, see

Threshold
Thresholding is the last step. It removes the ambiguity of the neural network output and classifies the input image as face or non-face. The threshold was selected by trial and error; (12) shows the selected threshold.
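The thresholding step can be illustrated as follows. This sketch assumes that a network output at or above the threshold means "face"; the direction of the comparison is not stated in the text. The default of 0.4 and the symmetric saturating linear transfer function (satlins) are the values reported in the results section, and the function names are hypothetical.

```python
import numpy as np

def satlins(n):
    """Symmetric saturating linear transfer function: clips values to [-1, 1]."""
    return np.clip(np.asarray(n, dtype=float), -1.0, 1.0)

def classify(net_output, threshold=0.4):
    """Return True (face) where the output reaches the threshold (assumed rule)."""
    return np.asarray(net_output, dtype=float) >= threshold
```

With this rule, outputs of 0.39 and 0.4 fall on opposite sides of the decision boundary, which is why the threshold has to be tuned on validation data.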

Face detection steps
This section describes all the steps of the proposed method:
a. In the preprocessing phase, the face area is identified with a bounding box in the original face image. First, the RGB image is converted to gray-scale. Then a crop box of multiple sizes is applied: a sliding window starting at 16-by-16 pixels scans the entire image, after which the sliding window (crop box) is enlarged and the scan is repeated. The box size depends on the original size of the captured image. All sub-images should be square with a unified size N×N. In this work, we found that the crop box size has no significant effect on the detection apart from increasing the execution time, so an image size of 32×32 was selected.
b. The result of the first step, xi (32, 32), passes through a discrete orthogonal Tchebychev transform to obtain the Tchebychev moments Ti (32, 32). The low-pass area of size 12-by-12 is selected from the Tchebychev transform matrix.
c. The dataset resulting from the previous step is separated into three groups for training, validation, and testing. Common ratios are:
- 70% train, 15% validation, 15% test
- 80% train, 10% validation, 10% test
- 60% train, 20% validation, 20% test
In our experiments, the best performance was obtained with the 70%, 15%, and 15% split, that is, 3500, 750, and 750 human faces respectively.
d. The last step applies the LMBP neural network to the 144 input features obtained from the Tchebychev moments, with the best number of hidden layers (41); the image is classified according to the threshold.
e. In this proposal, the learning rate is 0.6, the activation function is the symmetric saturating linear transfer function, and the number of epochs is 17, with 5 runs.
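Steps a and c above can be sketched with NumPy only. Several details are assumptions made for illustration and are not taken from the paper: the BT.601 gray-scale weights, nearest-neighbour resizing, doubling the box size at each scale, and a stride of half the box size. All function names are hypothetical.

```python
import numpy as np

def to_gray(rgb):
    """Gray-scale conversion with ITU-R BT.601 luma weights (assumed choice)."""
    return np.asarray(rgb, dtype=float)[..., :3] @ np.array([0.299, 0.587, 0.114])

def resize_nearest(win, size=32):
    """Nearest-neighbour resize of a square window to size x size."""
    idx = (np.arange(size) * win.shape[0] / size).astype(int)
    return win[np.ix_(idx, idx)]

def sliding_windows(gray, start=16, scale=2, stride_frac=0.5, size=32):
    """Yield (row, col, box, patch): square crops at growing box sizes, resized to size x size."""
    h, w = gray.shape
    box = start
    while box <= min(h, w):
        stride = max(1, int(box * stride_frac))
        for r in range(0, h - box + 1, stride):
            for c in range(0, w - box + 1, stride):
                yield r, c, box, resize_nearest(gray[r:r + box, c:c + box], size)
        box *= scale

def split_counts(n, ratios=(0.70, 0.15, 0.15)):
    """Number of samples for training, validation, and testing."""
    n_train = int(round(n * ratios[0]))
    n_val = int(round(n * ratios[1]))
    return n_train, n_val, n - n_train - n_val
```

Each 32×32 patch produced this way would then be passed to the Tchebychev transform of step b, and `split_counts(5000)` reproduces the 3500, 750, and 750 faces reported for the 70%, 15%, and 15% split.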

RESULTS AND DISCUSSION
This section describes the tests used to choose the optimal parameters of the proposed method and shows how they affect detection accuracy. Many tests were carried out to improve the accuracy of the proposed architecture. The neural network has three important parameters that strongly affect the detection accuracy: the learning rate, the number of hidden layers, and the activation function. Various learning rates were tested to find the best accuracy, as shown in Figure 3; a learning rate of 0.6 was found to be the best. The effect of the number of hidden layers on the accuracy was also tested by measuring the accuracy for many numbers of hidden layers; the result is shown in Figure 4. The best number of hidden layers is forty-one. The third parameter is the activation function. Many activation functions can be used in this method, and each has a specific effect on the face detection accuracy. The activation functions can be divided into two groups according to their output range, so each group was tested separately. The first group has an output range from -1 to 1, while the second group has an output range from 0 to 1. Both groups were tested to find which activation function gives the best accuracy, as shown in Figure 5.
The image in this method is divided into many blocks. Various block sizes were checked, and the accuracy for each block size was measured. The best accuracy is achieved when the block size is 12×12, as shown in Figure 7. The performance of the suggested method was evaluated by measuring the root mean square error (RMSE) when using the Tchebychev moments with block size 12×12, activation function satlins, and threshold 0.4. Two types of datasets were used for this test (a random and a specific dataset). The minimum RMSE is 0.1104 for the specific dataset and 0.1326 for the random dataset when the number of hidden layers is forty-one, as shown in Figure 8. The specific dataset therefore gives a better result than the random dataset. The face detection accuracy of the proposed method was up to 98%. Figure 9 shows samples of detected face images, while Figure 10 shows samples of face images that the system fails to detect. Figure 11 evaluates the proposed method in terms of specificity, accuracy, precision, recall, and F1_Score. In summary, the training process used 144 input features obtained from the Tchebychev moments, forty-one hidden layers, a classification threshold of 0.4, and a learning rate of 0.6, with seventeen epochs at the end of the training.
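The RMSE figures quoted above follow the usual definition, the square root of the mean squared difference between network outputs and targets. A minimal sketch, with the hypothetical function name `rmse`:

```python
import numpy as np

def rmse(outputs, targets):
    """Root mean square error between network outputs and target labels."""
    o = np.asarray(outputs, dtype=float)
    t = np.asarray(targets, dtype=float)
    return float(np.sqrt(np.mean((o - t) ** 2)))
```

How the outputs and targets were encoded for the reported values (for example, which label values denote face and non-face) is not stated in the text, so the numbers above cannot be reproduced from this sketch alone.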

CONCLUSION
A face detection method based on the Tchebychev transform and the LMBP neural network has been introduced and evaluated. The use of the LMBP neural network improved the face detection performance, as shown by comparison with other methods. The failure to detect some faces may be due to facial expression, rotation of the face by more than 45 degrees, glasses, or occlusion. The overall performance was promising and may be improved by adding other features to the proposed algorithm.