Comparative analysis of augmented dataset performances of age-invariant face recognition models



INTRODUCTION
This paper carries out a comparative analysis of augmented datasets (the FG-NET and MORPH datasets). The performances (accuracy, loss function, mean squared error (MSE), and mean absolute error (MAE)) of both datasets on trait-ageing-invariant face recognition (AIFR) systems are compared. The significance of the study is that both datasets are widely used for AIFR. Data augmentation, by adding noise to both datasets at the preprocessing phase, greatly improves the accuracy and other performance metrics of AIFR systems.
Literature review
Many comparisons exist in the literature between the performances of augmented datasets on age-invariant face recognition systems. Augmented datasets are usually used independently of each other to verify the age invariance of designed face recognition systems. Two of the most common face image datasets used in age-invariant face recognition, FG-NET and MORPH [1], are usually at the centre of comparisons made to check the performance of age-invariant face recognition systems. The goal is to obtain good performance across all datasets used for face recognition. The results obtained from augmented standard datasets [2] for face images are usually based on the robustness of the designed face recognition models to variations in pose, illumination, shape, and texture. Datasets can be augmented in several ways, depending on the goal of the researcher and the challenge that needs to be solved. Comparisons between the performance of augmented datasets on age-invariant face recognition systems extend to niche applications such as finding missing children who are discovered much later (after more than ten years) [3]. The importance of such comparisons, especially for niche applications, is emphasized in [4]. The factors that degrade the performance of face recognition systems are so numerous that it is sensible to use as many augmented datasets from as many providers as possible. Evidence of the robustness of an age-invariant face recognition system is usually presented after it has passed the rigorous condition of being subjected to a variety of augmented datasets [5]. This evidence is generally in the form of performance metrics such as accuracy [6]. These performance metrics gauge how well face recognition systems can recognize face images of various subjects regardless of the source of the image, the noise added to it, and other forms of augmentation.
The region of the face used [7] to develop the model plays a significant role in designing robust age-invariant face recognition systems. The same face region, when extracted from different datasets, can give non-identical performances on the designed model. This observation extends to other face recognition models designed to counter the negative effect of trait ageing. At the centre of comparisons of augmented datasets is accuracy [8], [9]: the precision with which the designed models can identify subjects' facial images after being designed to discriminate between real and generated images, estimate age, and identify subjects. New applications of age-invariant face recognition systems, such as soft biometrics [10], take the comparison between augmented datasets seriously: the verification/identification process is thoroughly confirmed on as many augmented datasets as possible to verify the accuracy of the face recognition system. The algorithms used to develop age-invariant face recognition systems, such as support vector machines (SVM) [11], principal component analysis (PCA), and the like, perform differently on various forms of augmented datasets. The authors in [12] tested the recognition performance of a modelled age-invariant face recognition system after passing face images through a designed and optimized adaptive neuro-fuzzy inference system (ANFIS) classifier. The reviews in [13] and [14] give in-depth studies of the performance of various augmented datasets on designed age-invariant face recognition systems. These studies focused on the challenges of face recognition as it relates to verifying designed face recognition systems using different augmented datasets, and identified the challenges faced by adaptive and age-invariant face recognition systems through extensive and thorough comparisons.
FG-NET dataset specifications and complexities
There are 1002 images of 82 different persons, with ages spanning from birth to 69 years, in the FG-NET database. The most common age group in the dataset is under 41 years. Some of the pictures of subjects in the FG-NET database were taken digitally in recent years, while others are scanned copies of original photographs from the subjects' personal collections. The quality of the images in the FG-NET database therefore depends significantly on the skill of the photographer, the condition of the photograph, the sophistication of the imaging tools used, and the durability of the photographic paper found in personal collections. Thus there are variations in sharpness, illumination, resolution, background, facial expression, camera angle, and facial hair. These variations make the FG-NET database well suited for AIFR research; samples of the same subject (person) ranging from ages 2 to 43 are shown in Figure 1.

MORPH dataset specifications and complexities
The MORPH dataset was collected in uncontrolled environments (the pictures were taken in real-world conditions) and thus has a very wide range of facial expressions. The photographs in the MORPH database were taken over a period of four years, and the database is regularly updated. MORPH Album 2 contains 55,134 face images of 13,000 subjects along with metadata showing that the majority of the images were acquired over a period of four years. Example images, age progression, and statistics of MORPH Album 2 are shown in Figure 2.

Comparative analysis of the FG-NET ageing dataset and the MORPH II dataset
One of the remarkable dissimilarities between the FG-NET and MORPH datasets is that children dominate the photos in FG-NET, whereas most pictures in MORPH are of adults [15]. Also, the age gap between images of the same subject in the FG-NET dataset is significantly wide-ranging compared to that in the MORPH dataset, which is relatively small [16], as shown in Table 1.
Besides, FG-NET contains subjects from one Caucasian race, whereas the MORPH dataset contains the Caucasoid, Negroid, and Mongoloid races [17]. Furthermore, FG-NET contains 1002 images (samples) of 82 subjects, while MORPH contains 55,134 images of 13,658 subjects [18]-[22]. Details of both datasets are shown in Table 2 and Table 3, and Table 4 depicts the MORPH numbers of facial images per decade of life. The similarity, however, is that both datasets contain face images of the same subjects at various age gaps. This fact makes both of them ageing datasets, and they can be compared experimentally on this basis [5]-[8].

2. RESEARCH METHOD
2.1. Pre-processing the FG-NET database for deep learning
A mammoth amount of data is needed to train a deep neural network. The FG-NET dataset has only 10-15 face images of each subject at different ages, amounting to 1002 images in total. The size of the FG-NET dataset is too small for deep neural network application. We preprocessed the images in the database by adding noise to them. The addition of noise to the FG-NET dataset helped increase the total number of pictures available for deep learning. The augmentation of the dataset was done at the preprocessing stage to allow for improved feature extraction. The following steps were followed to augment the FG-NET dataset with noise. a. Convert all images to three channels with matrix entries for red, green and blue (RGB) for uniformity.

Pre-processing the MORPH database for deep learning
A mammoth amount of data is needed to train a deep neural network. The MORPH Album 2 dataset has only 1-5 face images of each subject at different ages, across 13,000 subjects. The size of the MORPH Album 2 dataset is too small for deep neural network application. We preprocessed the images in the database by adding noise to them. The addition of noise to the MORPH Album 2 dataset helped increase the total number of pictures available for deep learning. The augmentation of the dataset was done at the preprocessing stage to allow for improved feature extraction. The following steps were followed to augment the MORPH Album 2 dataset with noise: a. Convert all images to three channels with matrix entries for red, green and blue (RGB) for uniformity.
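The paper's preprocessing pipeline is implemented in MATLAB; as an illustrative sketch only (in Python with NumPy, where the Gaussian noise model, the sigma value, and the number of copies are our assumptions rather than the authors' exact settings), the noise-augmentation step could look like:

```python
import numpy as np

def add_gaussian_noise(image, sigma=10.0, seed=None):
    """Return a noisy copy of an RGB image (H x W x 3, uint8)."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, size=image.shape)
    noisy = image.astype(np.float64) + noise
    # Clip back into the valid 8-bit range before converting.
    return np.clip(noisy, 0, 255).astype(np.uint8)

def augment_with_noise(image, copies=5, sigma=10.0):
    """Produce several noisy variants of one image, multiplying the
    number of samples available for deep learning."""
    return [add_gaussian_noise(image, sigma, seed=i) for i in range(copies)]

# A tiny synthetic "image" just to demonstrate shapes and types.
img = np.full((8, 8, 3), 128, dtype=np.uint8)
variants = augment_with_noise(img, copies=5)
```

Each noisy variant keeps the original resolution and RGB layout (step a above), so the augmented set can be fed to the same network as the originals.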

Feature extraction and classification using convolutional neural network
Over a million images from the ImageNet database were used to train the Inception-ResNet-v2 convolutional neural network (CNN). The images used to train Inception-ResNet-v2 form part of the database for the ImageNet Large-Scale Visual Recognition Challenge. Inception-ResNet-v2 has 164 layers and can classify images into 1000 object classes. The CNN accepts images of size 299x299 for classification. Inception-ResNet-v2 was used in this study to learn features for age-invariant face recognition through transfer learning, the process of adapting a pre-trained neural network to another task for which it was not originally trained. Transfer learning was used to learn age-invariant features from the FG-NET and MORPH datasets for AIFR. Figure 3 and Table 5 show a summary of the network architecture of Inception-ResNet-v2. To use the Inception-ResNet-v2 network, MATLAB R2018b was installed, the installer of the Deep Learning Toolbox model for the Inception-ResNet-v2 network was downloaded from [27], and the installer was run to install the network in MATLAB R2018b.

Training the deep learning model
The preprocessed FG-NET dataset was used to retrain the Inception-ResNet-v2 neural network for AIFR. This was done via transfer learning. The transfer learning process is enumerated below: a. The preprocessed FG-NET images are loaded into MATLAB using the image datastore object. b. The images are then split into a validation set (20% of images) and a training set (80%). c. The images in the training set are resized to 299x299 for compatibility with Inception-ResNet-v2. d. Inception-ResNet-v2 is run in MATLAB. e. Training preferences are specified. f. The transfer learning process begins using the augmented FG-NET dataset. g. The validity of the transfer learning process is checked using the validation set. h. The network's accuracy is estimated.
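The steps above are MATLAB-based (imageDatastore and the Deep Learning Toolbox); as a dependency-light Python sketch of steps b and c only (the 80/20 split and the 299x299 resize), with the file names purely hypothetical, one might write:

```python
import random

import numpy as np

def split_dataset(paths, train_frac=0.8, seed=42):
    """Step b: shuffle image paths and split them into a training set
    (80%) and a validation set (20%)."""
    rng = random.Random(seed)
    shuffled = list(paths)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]

def resize_nearest(img, size=299):
    """Step c: nearest-neighbour resize to size x size. A real pipeline
    would use an image library; this keeps the sketch dependency-free."""
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]

# Hypothetical file names standing in for the 1002 FG-NET images.
paths = [f"fgnet_{i:04d}.png" for i in range(1002)]
train_set, val_set = split_dataset(paths)
```

The fixed seed makes the split reproducible, which matters when validation accuracy (steps g-h) is compared across training runs.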

Using the trained deep learning model for face recognition in FG-NET and Morph datasets
The retrained neural network was used for testing images from the MORPH Album 2 dataset using the following process: a. An image is read from the MORPH Album 2 dataset. b. All images are converted into RGB matrices. c. The Viola-Jones algorithm is used to detect and crop faces. d. All images are resized to 299x299. e. The retrained Inception-ResNet-v2 neural network is loaded. f. The image is loaded into the retrained neural network for classification. g. The predicted class is compared to the ground truth.
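The testing loop above can be sketched in Python. This is an illustrative skeleton, not the authors' MATLAB code: the face detection and resize functions are placeholders, and the classifier is any callable from image to label.

```python
def crop_face(image):
    # Placeholder for Viola-Jones face detection and cropping (step c);
    # in practice this would be, e.g., OpenCV's CascadeClassifier or
    # MATLAB's vision.CascadeObjectDetector.
    return image

def resize_299(image):
    # Placeholder for the 299x299 resize required by
    # Inception-ResNet-v2 (step d).
    return image

def evaluate(model, samples):
    """Steps a-g as a loop: for each (image, true_label) pair, detect
    and crop the face, resize it, classify it, and compare the
    prediction to the ground truth. Returns top-1 accuracy."""
    correct = 0
    for image, true_label in samples:
        prediction = model(resize_299(crop_face(image)))
        correct += prediction == true_label
    return correct / len(samples)

# Toy classifier that always answers "subject_0", on three toy samples.
accuracy = evaluate(lambda img: "subject_0",
                    [("imgA", "subject_0"),
                     ("imgB", "subject_1"),
                     ("imgC", "subject_0")])
```

With a real detector, classifier, and labelled MORPH images substituted in, `evaluate` returns the testing accuracy reported in the results section.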

Mean squared error
The mean squared error (MSE) is a measure of predictor quality that is always non-negative; a score closer to zero is better. Here, N is the number of iterations, f_i is the training loss value and y_i is the testing loss value. The MSE is calculated as presented in (2) [36], [37].
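Equation (2) did not survive typesetting in this copy; the standard mean squared error consistent with the symbols defined above (N iterations, training-loss values f_i, testing-loss values y_i) is:

```latex
\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( f_i - y_i \right)^2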

Mean absolute error
The mean absolute error (MAE) is a measure of the disparity between two values, in this case between y_i, the training loss values, and ŷ_i, the testing loss values; n is the number of iterations. The MAE is calculated as presented in (3) [38].
The MAE is the mean of the absolute errors |y_i − ŷ_i|.
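Equation (3) is likewise missing from this copy; the standard mean absolute error matching the description above (n iterations, values y_i and ŷ_i) is:

```latex
\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|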

Loss function
Categorical cross-entropy is a loss function used to calculate the variation between two probability distributions. This dissimilarity is computed for each iteration in the training and testing datasets. The likelihood variation is calculated as shown in (4) [39].
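Equation (4) is missing from this copy; the standard categorical cross-entropy over N iterations and M class labels, consistent with the symbols described below, is:

```latex
L_S = -\frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{M} y_{ij} \, \log \hat{y}_{ij}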
Where x is the input value, y is the true value, ŷ is the value forecast by the model, N is the number of iterations and M is the number of class labels. Wen et al. [40] recommended a loss function called centre loss in addition to the categorical cross-entropy loss. The idea is to increase the discriminative power of the learned features by reducing the intra-class variations.
The centre loss function is as shown in (5).
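Equation (5) is missing from this copy; the centre loss of Wen et al. [40] referenced here is, with x_i the i-th deep feature and c_{y_i} the centre of its class:

```latex
L_C = \frac{1}{2} \sum_{i=1}^{m} \left\lVert x_i - c_{y_i} \right\rVert_2^2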
Where c_{y_i} is the y_i-th class centre of the features and m is the number of iterations. Wen et al. [40] observed that (5) did not achieve the expected result, and made two modifications to resolve this problem. The first modification is to update the centres based on a mini-batch rather than the entire dataset. The second modification is the introduction of two new variables, α and the δ-function. α is used to regulate the learning rate of the centres, and the δ-function is a Boolean that yields 1 if the condition is true and 0 if it is false. Equation (6) defines the update of the class centre, and the new centre of each class is as shown in (7), where α ∈ [0, 1]. Wen et al. [40] introduced λ to balance the two loss functions in the total loss function. The complete function is shown in (8).
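Equations (6)-(8) are also missing from this copy; in Wen et al. [40] the class-centre update, the per-step centre revision, and the total loss take the forms:

```latex
\Delta c_j = \frac{\sum_{i=1}^{m} \delta(y_i = j)\,(c_j - x_i)}{1 + \sum_{i=1}^{m} \delta(y_i = j)} \tag{6}

c_j^{\,t+1} = c_j^{\,t} - \alpha \, \Delta c_j^{\,t} \tag{7}

L = L_S + \lambda L_C \tag{8}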
If λ is set to 0, the total loss function reduces to the categorical cross-entropy function.

3. RESULTS AND DISCUSSION
This section deals with the results and comparative analysis of the performances of the augmented datasets (FG-NET and MORPH II) for trait-ageing-invariant face recognition systems. Figure 4 shows the comparative training and testing accuracy results for the FG-NET and MORPH datasets, with the FG-NET dataset outperforming the MORPH dataset by a mean testing accuracy margin of 0.15%. Figure 5 shows the comparative training and testing loss (error) results, with the FG-NET dataset outperforming the MORPH dataset by a mean testing loss margin of 71%. Table 6 shows a summary of the performance results of the augmented FG-NET and MORPH datasets. All this implies that the FG-NET dataset will perform better than the MORPH dataset during deployment of these models in an age-invariant face recognition (AIFR) system.

Engr. Dr. (Mrs) Okokpujie Imhade Princess is a researcher and lecturer in the Department of Mechanical Engineering, Covenant University, Ota, Ogun State, Nigeria. She is currently the Chief Editor of the Covenant Journal of Engineering Technology (CJET), and a reviewer and editor for many international and local journals and conferences. Her areas of research interest are design and production; advanced manufacturing, including machining, tool wear, vibration, and nano-lubricants; energy systems; mathematical modelling; optimization; and mechatronics, and she is a multi-disciplinary researcher. Dr. I. P. Okokpujie is an active researcher who has authored or coauthored over 106 peer-reviewed publications in reputable journals and international conferences. From 2017 to 2019, she was the technical secretary of the International Conference on Engineering for a Sustainable World (ICESW), indexed in the Scopus and ISI databases through IOP Publishing. She is a Registered Engineer with the Council for the Regulation of Engineering in Nigeria (COREN), a member of the Nigerian Society of Engineers (NSE), a member of the Nigerian Institution of Mechanical Engineers, and a member of the Association of Professional Women Engineers of Nigeria (APWEN). She is currently the National Technical Secretary and Vice-Chairman of the APWEN Ota Chapter in Ogun State, as well as the Technical Editor of the Journal of the Nigerian Institution of Mechanical Engineers. She is one of the top-rated researchers in her institution, and she is happily married and blessed with children. Dr. I. P. Okokpujie is very passionate about the education of the girl child and offers quality mentorship to young engineers.