The detection and classification of acute myeloid leukaemia blood cell images based on different YOLO approaches

ABSTRACT


INTRODUCTION
Acute myeloid leukaemia (AML) is one of the four main types of leukaemia, whose blast cells generally originate in the bone marrow and human blood [1], and it can be classified into distinct categories according to the types of mature and immature white blood cells (blast cells). Unlike other types of cancer, the stages of leukaemia are determined from blood cell counts and the accumulation of leukaemia cells within the organs [2]. Since blast cells propagate and differentiate at a fast pace, early-stage detection of leukaemia can save many lives. An inaccurate diagnosis has a direct adverse effect on the patient when the drug regimen is subsequently prescribed, increases the cost of treatment, and can result in a complicated treatment procedure. Identifying the abnormal white blood cells that cause the disease may provide important information to physicians in determining appropriate treatments [3]. Microscopic examination is an early method for categorizing and justifying the cause of the disease. Examination results can vary depending on the sample preparation process as well as the experience and training of the examiners, which leads to inter- and intra-observer variation in the results [4], [5]. Although this process is performed by medically trained examiners who know how to interpret medical data, misdiagnosis remains possible because of its subjectivity and the complexity of the blast cell characteristics of AML. It is also challenging to examine standardized images under time constraints. Therefore, the study of blast cell detection is of significant importance for the early screening stage of medical diagnosis and the subsequent effective planning of drug regimens.
The rapid growth of deep learning approaches in various real-world applications has led to emerging medical research utilizing large amounts of medical data, including electronic health records, texts, videos, sensor data, and medical imaging [6]. Deep learning methods, including convolutional neural networks (CNNs), have achieved remarkable success in medical image analysis [7]. Previous research has employed various image processing algorithms [8]-[12] and CNN-based classification models [13]-[16] for the segmentation and classification of leukocyte cells. However, many challenging problems remain owing to differences in equipment, data quality, data availability, limited resolution in noisy images, and so on. In recent years, CNN-based object detection approaches have gained attention in the field of leukocyte cell analysis. Wang et al. [17] introduced CNN-based object detection approaches, including the single shot multibox detector (SSD) and you only look once version 3 (YOLOv3), for recognizing 11 categories of peripheral leukocytes, and their work demonstrated promising classification results in this context. Additionally, Shakarami et al. [18] proposed a fast and efficient YOLOv3 (FED) detector aimed at fast and efficient blood cell detection. These studies revealed the importance of leukocyte cell stage identification in predicting patient health and guiding treatment decisions. Inspired by this, we aim to investigate a relatively large dataset with more stage categories of leukocyte cells using a deep learning approach. Matek et al. [19] reported the recognition of blast cells in AML using a ResNeXt CNN classification model, achieving excellent results for classes with over 400 images but facing challenges with classes having fewer than 100 images. Balancing the sample sizes of the classes is therefore critical for effective CNN model training.
Although several segmentation and classification models have been used in leukocyte cell analysis, there is still a research gap that can be filled by object detection-based models. The two well-known categories of object detection approaches are the region proposal-based two-stage detectors and the regression/classification-based one-stage detectors. The two-stage detector offers better detection accuracy and adaptability, utilizing region of interest pooling (RoIPool) and a region proposal network (RPN) for object bounding box classification and regression; however, it requires longer training and detection times. The one-stage detector, on the other hand, provides high inference speed by directly predicting bounding boxes without the need for an RPN step. YOLO is a popular one-stage detection model, and each new version, up to and including YOLOv7, has shown improved detection accuracy [20]-[22]. How the different YOLO versions compare in performance remains an open question.
In this study, we propose to compare eight different versions of YOLO for the detection and classification of 15 classes of AML cell images. Data augmentation techniques are also employed to increase and balance the training images in the dataset. We also introduce a performance evaluation technique using the receiver operating characteristic (ROC) curve, which helps examine the performance of our approaches through quantitative area under the curve (AUC) values. Section 2 discusses the dataset arrangement and data augmentation techniques and provides detailed explanations of the eight YOLO approaches. Section 3 presents a comprehensive discussion of the classification results and ROC analysis. Finally, in section 4, we conclude our study, highlighting the potential of different YOLO approaches for the detection and classification of AML blood cell images.

METHOD

Dataset and labeling
The dataset for this study was obtained from The Cancer Imaging Archive (TCIA). The peripheral blood smears were collected from 100 patients diagnosed with AML at Munich University Hospital between 2014 and 2017. The Munich AML morphology dataset contains 18,365 single-cell images, and a trained examiner experienced in leukocyte morphology classified them into 15 classes for training and evaluation. Each single-cell image has a size of 400×400 pixels (corresponding to approximately 29 μm × 29 μm) and includes background components such as erythrocytes, platelets, and cell fragments. The full single-cell image dataset and corresponding annotations are publicly available at TCIA [23].
The training and testing split is among the most critical factors in machine learning models, yet there is no universal rule for how to divide a dataset into training and testing sets [24]. Researchers have examined the effect of the training/testing split on machine learning performance using various sampling theorems; the results showed that the test rate can be selected between 10% and 20% when the dataset or the number of samples is small, while in general the test rate lies between 20% and 50%. Image labeling is the process of manually marking the regions of objects in an image and creating text-based descriptions of those regions for object classification; its main objective is to allow users to highlight or specify the pixel area of the objects in an image. The labeling in this study was performed by well-trained personnel using the CiRA CORE platform [25] before the data augmentation techniques were applied.
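To make the per-class split concrete, the following is a minimal Python sketch of a random class-wise split; the folder layout, the .png file extension, and the 20% test ratio are illustrative assumptions rather than details taken from this study.

```python
import random
from pathlib import Path

def split_per_class(dataset_dir, test_ratio=0.2, seed=42):
    """Randomly split each class folder into training and testing file lists."""
    rng = random.Random(seed)
    train, test = [], []
    for class_dir in sorted(p for p in Path(dataset_dir).iterdir() if p.is_dir()):
        images = sorted(class_dir.glob("*.png"))
        rng.shuffle(images)
        n_test = max(1, int(len(images) * test_ratio))  # keep at least one test image
        test.extend(images[:n_test])
        train.extend(images[n_test:])
    return train, test
```

Splitting inside each class folder keeps every class represented in both subsets, which matters here because several classes contain very few images.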

Data augmentation
Data augmentation is one of the most widely used techniques for increasing the sample size of a dataset for the model training process. Many researchers use numerous image transformation techniques, such as position augmentation and color augmentation, to obtain varied versions of the original images. Since deep learning models are trained with both the original and the various augmented images, they gain greater generalization capability. As a survey on image data augmentation for deep learning reports, various data augmentation techniques have been developed to lessen the overfitting of the learning model by providing better generalization [26]. Therefore, we used four data augmentation techniques: i) rotation (rotating the image in 45° intervals between -180° and 180° to produce a variety of images), ii) contrast (adjusting the range between the darkest and brightest image portions by multiplying all pixel values by 0.4, 0.6, 0.8, and 1.0), iii) noise (injecting image noise drawn from a Gaussian distribution with three standard deviations, σ = 0, 10, 20), and iv) blur (blurring the image with a Gaussian filter with a standard deviation of 9). After performing the above four image processing tasks, an individual image can be augmented into up to 108 images with a size of 608×608 pixels. The resulting 608×608-pixel training dataset was then prepared for training the eight versions of the YOLO algorithm.
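As an illustration only (the actual augmentation was performed in the CiRA CORE platform), the sketch below applies the four transforms with the stated parameters using OpenCV and NumPy. Combining the 9 rotations, 4 contrast factors, and 3 noise levels would account for the up-to-108 variants mentioned above; the resize to 608×608 pixels and the omission of bounding box label handling are our assumptions.

```python
import cv2
import numpy as np

def augment(image, rng=np.random.default_rng(0)):
    """Yield 9 x 4 x 3 = 108 combined variants of one single-cell image (uint8)."""
    h, w = image.shape[:2]
    for angle in range(-180, 181, 45):            # i) rotation in 45-degree steps
        m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
        rotated = cv2.warpAffine(image, m, (w, h))
        for alpha in (0.4, 0.6, 0.8, 1.0):        # ii) contrast scaling factors
            contrasted = cv2.convertScaleAbs(rotated, alpha=alpha, beta=0)
            for sigma in (0, 10, 20):             # iii) additive Gaussian noise
                noisy = np.clip(contrasted + rng.normal(0, sigma, image.shape),
                                0, 255).astype(np.uint8)
                yield cv2.resize(noisy, (608, 608))  # assumed resize to 608x608

def blur(image):
    """iv) Gaussian blur with a standard deviation of 9 (kernel size from sigma)."""
    return cv2.resize(cv2.GaussianBlur(image, (0, 0), sigmaX=9), (608, 608))
```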

Detection of AML blood cell images based on different YOLO approaches
The YOLO object detection models, including their tiny versions, have been utilized for the detection of AML blood cell images. These models consist of YOLOv2, YOLOv2-tiny, YOLOv3, YOLOv3-tiny, YOLOv4, YOLOv4-tiny, YOLOv7, and YOLOv7-tiny. Each model has distinct hyperparameters (Table 2) and characteristics; the tiny versions have the advantage of being able to run on a CPU instead of a GPU. The original YOLO model, proposed by Redmon et al. [27], introduced an end-to-end network that combines feature extraction, candidate frame classification, and regression. While YOLOv1 significantly improved the detection rate compared with two-stage methods, it had reduced detection accuracy; however, subsequent modifications of the YOLO model have significantly enhanced object detection performance. YOLOv2, introduced by Redmon and Farhadi in 2017 [28], utilized the Darknet-19 feature extraction network and incorporated a batch normalization layer for faster network convergence. It also employed a k-means clustering algorithm to automatically determine the prior anchor boxes, resulting in improved detection performance. YOLOv2 outperformed other detection systems of the time and required less processing power thanks to its 19 convolutional layers and 5 max-pooling layers. The model produced one output feature map of size 19×19 for object prediction with five anchor boxes. YOLOv2-tiny is a smaller version of YOLOv2 with 9 convolutional layers and 6 max-pooling layers for feature extraction. It predicts one output feature map using the same five anchor boxes as YOLOv2; the trained model provides an output feature map of size 13×13 for object prediction.
YOLOv3, known for its speed and accuracy, introduced the Darknet-53 backbone for feature extraction [29]. It incorporated ResNet shortcut connections to address gradient disappearance and additional convolutional layers for predicting bounding boxes at three different scales. YOLOv3 adopted the sum of squared errors loss and a logistic regression function for bounding box predictions, and it used independent logistic classifiers with binary cross-entropy loss for multi-label class predictions. The model employed k-means clustering to determine the bounding box priors and produced three-branch outputs with nine anchor boxes, using feature maps of size 19×19 for large objects, 38×38 for medium objects, and 76×76 for small objects, respectively. YOLOv3-tiny, a simplified version, offers faster processing and reduced memory requirements. It shares similarities with YOLOv2-tiny, featuring 9 convolutional layers and 6 max-pooling layers for feature extraction. The targets are distributed over two scales using k-means clustering, and YOLOv3-tiny produces two branch outputs with six anchor boxes for prediction, corresponding to feature maps of size 13×13 and 26×26.
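The k-means anchor selection mentioned for YOLOv2 and YOLOv3 can be sketched as follows; this is an illustrative reimplementation of the dimension-clustering idea from the YOLOv2 paper, with 1 - IoU between box sizes as the distance, and not the code used in this study.

```python
import numpy as np

def iou_wh(boxes, centroids):
    """IoU between (w, h) pairs, assuming all boxes share a common top-left corner."""
    inter = (np.minimum(boxes[:, None, 0], centroids[None, :, 0]) *
             np.minimum(boxes[:, None, 1], centroids[None, :, 1]))
    union = ((boxes[:, 0] * boxes[:, 1])[:, None] +
             (centroids[:, 0] * centroids[:, 1])[None, :] - inter)
    return inter / union

def kmeans_anchors(boxes, k=9, iters=100, seed=0):
    """Cluster labelled box sizes into k prior anchors with a 1 - IoU distance."""
    boxes = np.asarray(boxes, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = boxes[rng.choice(len(boxes), size=k, replace=False)]
    for _ in range(iters):
        nearest = np.argmax(iou_wh(boxes, centroids), axis=1)  # max IoU = min distance
        for i in range(k):
            members = boxes[nearest == i]
            if len(members):               # keep the old centroid if a cluster empties
                centroids[i] = members.mean(axis=0)
    return centroids
```

With k = 9 the centroids play the role of YOLOv3's nine anchors split across its three scales; k = 5 would correspond to YOLOv2, and k = 6 to YOLOv3-tiny's two scales.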
YOLOv4, proposed by Bochkovskiy et al. [30], improved processing speed and detection accuracy, although its efficiency still left room for improvement. Its structure comprises the CSPDarknet53 backbone network for feature extraction, a neck network incorporating spatial pyramid pooling (SPP) and a path aggregation network (PANet) for feature fusion, and a head network for output prediction. YOLOv4 introduced the Mish activation function and utilized the CSP block module for improved learning ability. It employed the complete intersection over union (CIoU) loss function and the distance IoU (DIoU) non-maximum suppression (NMS) algorithm. The model's final output prediction used the same three output feature maps as YOLOv3. Because YOLOv4-tiny is based on YOLOv4, its feature extractor is the CSPDarknet53-tiny backbone. In contrast to YOLOv4, the YOLOv4-tiny CSP block uses the leaky rectified linear unit (ReLU) activation function rather than the Mish activation function. YOLOv4-tiny also produces two-branch outputs using two distinct scales of feature map for the output prediction, similar to YOLOv3-tiny.
Although YOLOv5 and YOLOv6 were launched in 2020 and 2022, respectively, YOLOv7 claims to be the fastest and most accurate real-time object detector to date [31]. The YOLOv7 preprocessing approach is associated with YOLOv5, and the extended efficient layer aggregation network (E-ELAN) is proposed as the network's backbone to improve self-learning ability while retaining the original gradient path. Unlike YOLOv5, YOLOv7 integrates the head and neck networks into a single head network, but the functionality remains the same. The head network consists of five sections: spatial pyramid pooling cross-stage partial connection (SPPCSPC), a series of CBS blocks (convolution, normalization, and activation function), MP blocks (composed of MaxPool and CBS), Catconv (concatenation, convolution), and Repconv (a re-parameterized convolution structure). In this structure, the model preprocesses the input image and resizes it to 640×640 pixels before feeding it into the backbone and head networks. The model then outputs three feature maps of different sizes for the detection result: 20×20 for large objects, 40×40 for medium objects, and 80×80 for small objects, respectively. YOLOv7-tiny, a compressed version designed for edge GPUs, employs a leaky ReLU activation function and generates three branch outputs with nine anchor boxes. In operation, the YOLO models divide images into grid cells and predict bounding boxes and class probabilities for the objects within each cell. The YOLO method extracts significant characteristics from images using a CNN as its backbone network. These features are then fed into a series of convolutional and fully connected layers, which predict the bounding boxes and class labels. YOLO predicts objects of different sizes and aspect ratios using a single unified model, making it efficient and capable of real-time object detection. YOLO also uses anchor boxes to handle variations in object scale and location. The method predicts multiple bounding boxes at the same time and assigns a confidence score to each box, representing its probability of containing an object. NMS is applied to remove redundant bounding boxes and produce the final set of object detections (a minimal sketch is given below).
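The NMS step named above admits a compact greedy formulation; the sketch below is the textbook IoU-based version rather than the exact routine of any particular YOLO release (the 0.2 default matches the NMS value used later in this study).

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.2):
    """Greedy non-maximum suppression; boxes are rows of (x1, y1, x2, y2)."""
    order = np.argsort(scores)[::-1]          # process boxes from highest score down
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        # intersection of the current box with all remaining boxes
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.maximum(0, xx2 - xx1) * np.maximum(0, yy2 - yy1)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        areas = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + areas - inter)
        order = rest[iou <= iou_thresh]       # drop boxes overlapping the kept one
    return keep
```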

Statistical analysis
The performance of the eight models is evaluated using a testing dataset that includes images of 15 different classes of AML blood cells. When the images from each class are tested, the prediction results are obtained together with a confidence level. The number of prediction results for each class is counted and summarized in a 15×15 confusion matrix table. Four statistical parameters, namely precision, sensitivity, specificity, and accuracy, are considered as performance metrics, as in (1) to (4) [32]:

Precision = TP / (TP + FP) (1)

Sensitivity = TP / (TP + FN) (2)

Specificity = TN / (TN + FP) (3)

Accuracy = (TP + TN) / (TP + TN + FP + FN) (4)

Besides, we can visualize the model performances and evaluate the classifiers with the well-known ROC curve technique [33]. The ROC curve is a two-dimensional graph in which the true positive rate (TPR) and false positive rate (FPR) values are plotted on the Y-axis and X-axis, respectively. The TPR and FPR values can be obtained from the confusion matrix table to form the ROC curve. The TPR value is the same as the sensitivity, and the FPR value can be calculated as in (5):

FPR = FP / (FP + TN) (5)

Since the ROC is constructed by changing the threshold level on the prediction score, each change of threshold generates one point on the ROC curve [34]. In this study, we constructed the ROC curve at threshold levels in 5% increments because our testing dataset has over 3,000 images. The above descriptions of true positive (TP), false positive (FP), true negative (TN), and false negative (FN) apply to binary class decisions; however, we use multi-class classification with a 15×15 confusion matrix for blood cell image analysis. In multi-class classification, the four types of condition can be computed using a one-versus-rest approach, and the performance calculations can then be carried out with two operations, namely micro-averaging and macro-averaging. Because of the imbalance in the class dataset, we used micro-averaging in this study [35]. To plot the ROC curve, the required TPR and FPR are computed as in (6) and (7):

TPR = TTP / (TTP + TFN) = Σ_i TP_i / Σ_i (TP_i + FN_i) (6)

FPR = TFP / (TFP + TTN) = Σ_i FP_i / Σ_i (FP_i + TN_i) (7)

where i stands for each testing class, the sums run over i = 1, ..., n, and n = 15 classes for this study. If i = 1, the 1st class is considered positive and the remaining 14 classes negative. In the formulas, TP_i, FP_i, TN_i, and FN_i are associated with each testing class i, and TTP, TFP, TTN, and TFN are the totals of each condition over all classes.
In addition to evaluating the models based on the ROC curve, we calculated the area under the curve (AUC) as a measure of their usefulness. The AUC represents the integrated area under the ROC curve and provides valuable insight into a model's performance; a higher AUC value indicates a more effective model in testing. Alongside the TPR, which is equivalent to the sensitivity, we also computed the three other performance metrics over all classes as in (8) to (10):

Precision = TTP / (TTP + TFP) (8)

Specificity = TTN / (TTN + TFP) (9)

Accuracy = (TTP + TTN) / (TTP + TTN + TFP + TFN) (10)
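A minimal sketch of how equations (6) to (10) could be evaluated from the 15×15 confusion matrix is given below; the orientation (rows as actual classes, columns as predicted classes) is our assumption, and the one-versus-rest totals follow the definitions above.

```python
import numpy as np

def micro_metrics(cm):
    """Micro-averaged metrics from an n x n confusion matrix
    (rows: actual class, columns: predicted class)."""
    tp = np.diag(cm).astype(float)     # TP_i: correct predictions per class
    fp = cm.sum(axis=0) - tp           # FP_i: predicted as class i, actually another
    fn = cm.sum(axis=1) - tp           # FN_i: actually class i, predicted as another
    tn = cm.sum() - tp - fp - fn       # TN_i: everything else
    ttp, tfp, ttn, tfn = tp.sum(), fp.sum(), tn.sum(), fn.sum()
    return {
        "tpr":         ttp / (ttp + tfn),                      # equation (6)
        "fpr":         tfp / (tfp + ttn),                      # equation (7)
        "precision":   ttp / (ttp + tfp),                      # equation (8)
        "specificity": ttn / (ttn + tfp),                      # equation (9)
        "accuracy":    (ttp + ttn) / (ttp + ttn + tfp + tfn),  # equation (10)
    }
```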

RESULTS AND DISCUSSION
Since YOLO algorithms require a large amount of data, a machine with very high computing power is needed. All eight algorithms were configured on the Ubuntu 16.04 LTS (64-bit) operating system with the following hardware environment: processor: Intel® Core i5-8400 CPU @ 2.80 GHz × 6; memory: 31.3 GiB; graphics: GeForce GTX 1070 Ti. In this study, each training process took a maximum of four days of computing time.

Class-wise performance comparison of the eight YOLO models
This section describes the performance comparison of the eight YOLO models on the testing dataset. We evaluated the prediction output scores for each class using the corresponding trained models. A threshold level of 0.5 (50%) and an NMS value of 0.2 were used as defaults to obtain the prediction scores. With the 0.5 classification threshold, the models predict a class only if its probability is greater than 0.5. Besides, we utilized the NMS technique to reduce duplication, because a model can produce duplicate detections of the same object [36]. The class predictions were then obtained on the testing dataset, which contains 15 classes of blood cell images with a total of 3,663 single-cell images. The confusion matrix table was created from the actual class labels and the predicted class scores. To complete the confusion matrix, we counted the predicted class for each test image and entered the count in the corresponding prediction-class cell of the matrix (a minimal sketch follows). The eight confusion matrix tables for the eight YOLO algorithms are accessible in the data availability section.
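The exact counting rule is not spelled out in the text; under the assumption that the top-scoring detection above the threshold is counted for each test image, the bookkeeping could look as follows.

```python
import numpy as np

cm = np.zeros((15, 15), dtype=int)  # 15 AML cell classes

def update_confusion(cm, true_idx, detections, thresh=0.5):
    """Count one test image into the confusion matrix.
    detections: (class_index, confidence) pairs remaining after NMS."""
    kept = [(c, s) for c, s in detections if s >= thresh]
    if kept:  # assumption: images with no detection above the threshold are skipped
        pred_idx = max(kept, key=lambda p: p[1])[0]  # top-scoring surviving detection
        cm[true_idx, pred_idx] += 1
    return cm
```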
During the testing of the trained models, we observed that YOLOv2, YOLOv3-tiny, and YOLOv7 correctly predicted more classes than the other algorithms. Specifically, these three models predicted 14 classes accurately, while YOLOv2-tiny, YOLOv4, YOLOv4-tiny, and YOLOv7-tiny predicted 13 classes, and YOLOv3 predicted 12 classes, despite some classes having a small dataset. Figure 1 illustrates typical results obtained from the detection and classification of single-cell images. As mentioned in the dataset and labeling section, the eight models had difficulty performing correctly on the classes with small testing datasets. Nevertheless, the trained models accurately predicted the smudge cell class because its character is distinct from the other classes. YOLOv4-tiny predicted fewer classes than YOLOv2, YOLOv3-tiny, and YOLOv7, but it correctly predicted more images than any other model. Out of the total of 3,663 single-cell images, the eight models correctly predicted, in descending order: YOLOv4-tiny 3,455 images, YOLOv3 3,444 images, YOLOv3-tiny 3,430 images, YOLOv7 3,419 images, YOLOv2 3,398 images, YOLOv4 3,383 images, YOLOv7-tiny 3,382 images, and YOLOv2-tiny 3,297 images, respectively.
Additionally, leaving aside the five small classes, we compare the precision and sensitivity values to evaluate the quality of class-wise prediction using the one-versus-rest approach, as shown in Figures 2 and 3. Here, we compared against the results of the previous work [19], which used a ResNeXt CNN approach for training and 5-fold cross-validation for testing; consequently, they performed training and testing five different times. Although they reported interval values for precision and sensitivity, we used only their average results for the comparison.

Figure 2 presents the precision results for the previous work and the eight types of YOLO as a bar graph. In general, we found that all models except YOLOv2-tiny achieve at least 90% precision in the three large-dataset classes (neutrophil (segmented) (NGS), lymphocyte (typical) (LYT), and myeloblast (MYO)). Among the medium-dataset classes, neutrophil (band) (NGB) has the lowest precision because its biological pattern is similar to NGS. Eosinophil (EOS) cells, on the other hand, show high precision (at least 92%) for all models owing to their unique characteristics, as illustrated in Figure 1.

Figure 2. The comparison of precision values for the eight types of YOLO and the previous work
Figure 3 displays the sensitivity results for all models. The best sensitivity scores, of at least 86%, were again obtained in the same three classes described in the precision results for all models except YOLOv2-tiny. The distinctly characterized class, EOS, has a relatively high sensitivity score of at least 92%. To summarize, large training datasets preserved the capability of YOLO in the classification of blood cell images, whereas multi-class performance dropped for small datasets of blood cells with similar characteristics. The overall results of the four performance metrics for the eight YOLO models are shown in Figure 4. YOLOv4-tiny achieves 94% in overall precision and sensitivity and 99% in overall specificity and accuracy, a better performance than the other comparative YOLO models. We can therefore conclude that the YOLOv4-tiny model is the most suitable for detecting AML blood cells in image classification. Furthermore, YOLOv4-tiny is also compatible with edge GPU and CPU devices because it consumes less memory. In addition, Table 1 presents the average precision and sensitivity scores of the eight YOLO models, comparing them with the previous work [19]. The YOLO models perform well in the common cell classes (NGS, LYT, MON, EOS, MYO), but some classes remain challenging. Despite using only 50% of the training dataset, the YOLO models' performance is comparable to the previous work with the same test dataset size, showing better precision in 6 classes and better sensitivity in 8 classes.

ROC curve and performance analysis at varied threshold levels for the eight YOLO models
For the performance analysis using the ROC curve, the eight YOLO models are employed as classifiers that provide only a class decision, i.e., true class or false class, for each instance. After applying these classifiers to the test dataset, each threshold value yields a confusion matrix that corresponds to one point on the ROC curve.
The classification results yield a numeric instance probability together with the predicted class, as shown in Figure 1. The classifier produces a predicted class if its output is higher than the predefined threshold value. From these results, we collected the overall values of the four conditions, i.e., the TTP, TFP, TTN, and TFN scores over the 15 classes, from the confusion matrix. We then calculated the two key values (TPR and FPR) to plot a single point on the ROC curve. In this way, varying the threshold values produces many different points on the ROC curve. Conceptually, we alter the threshold values from 0 (0%) to 1 (100%) to complete the ROC curve. We performed the classification on the testing dataset with threshold values in 0.05 (5%) increments, thereby obtaining 21 corresponding points on the ROC curve. The summarized classification results used to plot the ROC curves of the eight YOLO approaches are accessible in the data availability section.
Considering the TPR and FPR results, we used the Jupyter Notebook open-source web application with a Python environment to plot the ROC curves shown in Figure 5 (a minimal sketch is given below). In ROC curve analysis, the AUC is an effective way to evaluate the performance of a trained model. The AUC value is always bounded between 0 and 1, where a perfectly inaccurate test gives a value of 0 and a perfectly accurate test gives a value of 1. In general, an AUC value can be interpreted as follows: under 0.5, no realistic model; 0.5, no discrimination; 0.7 to 0.8, an acceptable model; 0.8 to 0.9, an excellent model; and above 0.9, an outstanding model [37].
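Below is a minimal sketch, not the authors' notebook code, of turning the 21 threshold points into a ROC curve and a trapezoidal AUC estimate in the Python environment mentioned above; the demo points are placeholders, not measured values.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_roc(points, label):
    """points: (FPR, TPR) pairs, one per threshold level (21 points for 5% steps)."""
    pts = [(0.0, 0.0)] + sorted(points) + [(1.0, 1.0)]  # anchor the curve endpoints
    fpr, tpr = zip(*pts)
    auc = np.trapz(tpr, fpr)  # trapezoidal approximation of the area under the curve
    plt.plot(fpr, tpr, marker="o", label=f"{label} (AUC = {auc:.3f})")
    return auc

# illustrative placeholder points; the real pairs come from the threshold sweep above
demo = [(i / 40, min(1.0, 0.25 + i / 22)) for i in range(21)]
plot_roc(demo, "demo model")
plt.plot([0, 1], [0, 1], "k--", label="no discrimination (AUC = 0.5)")
plt.xlabel("False positive rate (FPR)")
plt.ylabel("True positive rate (TPR)")
plt.legend()
plt.show()
```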
Based on the findings presented in Figure 5, we can conclude that the YOLOv4-tiny model is the best detection model, as it achieved the highest AUC value. The AUC values for the eight YOLO models are as follows: 0.963 for YOLOv2, 0.948 for YOLOv2-tiny, 0.969 for YOLOv3, 0.967 for YOLOv3-tiny, 0.964 for YOLOv4, 0.971 for YOLOv4-tiny, 0.966 for YOLOv7, and 0.961 for YOLOv7-tiny. Since all the AUC values are higher than 0.9, each version of the YOLO model can be considered an outstanding model for AML cell classification; however, the YOLOv4-tiny model stands out with the highest AUC value among them. We present a summary of our findings on the overall performance comparison between the eight YOLO approaches in Table 3. While we varied the threshold values for testing, we kept the NMS value unchanged for all data analyses. Overall, we observed that the eight different YOLO models demonstrate high effectiveness based on their performance scores. From the analysis of Table 3, we observed variations in the four performance values as the threshold values were adjusted; higher threshold values led to greater precision and specificity scores in the models' predictions. In addition, Table 4 presents a comparison of our results with other state-of-the-art approaches. Dasariraju et al. [10] achieved an overall accuracy of 92.99% using the random forest algorithm. Rastogi et al. [12] utilized LeuFeatx features with an extra trees classifier, achieving an overall accuracy of 96.15%. The works in [15] and [38] used a two-stage hybrid model and Ghost-ResNeXt, achieving overall accuracies of 97.00% and 98.61%, respectively. In this paper, we propose the use of different YOLO models for blood cell detection and classification; the analysis of Table 4 indicates that our proposed YOLOv4-tiny model outperforms the other models in terms of accuracy, with 99.26%.

CONCLUSION
We applied eight versions across four main generations of the well-known YOLO object detection strategy to detect and classify 15 classes of white blood cell images. To enhance the training dataset, we employed data augmentation techniques, including rotation, contrast adjustment, noise addition, and blur, so that one image could be augmented into up to 108 images. The training dataset, enriched with these augmented images of 608×608 pixels, was used to assess the performance of the eight YOLO models (YOLOv2, YOLOv2-tiny, YOLOv3, YOLOv3-tiny, YOLOv4, YOLOv4-tiny, YOLOv7, and YOLOv7-tiny) for the classification of AML blood cell images. Among these models, YOLOv4-tiny demonstrated superior performance, achieving over 94% in overall precision and sensitivity and over 99% in overall specificity and accuracy. Furthermore, we proposed a performance evaluation procedure using the ROC curve, allowing a quantitative examination of the approaches through AUC values. The AUC values for the eight YOLO models were notably high, ranging from 0.948 to 0.971; with all obtained AUC scores exceeding 0.9, these eight YOLO models are deemed outstanding for AML cell classification. Considering the remarkable performance achieved, our approach utilizing a one-stage object detection model holds great potential for enabling more reliable and faster clinical diagnoses in the future, thus contributing to advancements in the field of healthcare and medical applications.

Figure 1. The classification and localization of AML blood cell images with a threshold level of 0.5 and NMS of 0.2

Figure 3. The comparison of sensitivity values for the eight types of YOLO and the previous work

Figure 4. The overall scores for the four performance metrics, namely precision, sensitivity, specificity, and accuracy, with a threshold level of 0.5 and NMS of 0.2; the model performances are calculated using a micro-averaging approach

Figure 5. The ROC curve for eight types of YOLO approaches after evaluating the testing dataset

We randomly divided the received dataset into a training dataset and a testing dataset for each class, with approximately 80% of the images assigned to training and the remainder forming the testing dataset. To balance the number of images across classes, some classes with a large number of training images had to be reduced: of the three classes with over 3,000 images, neutrophil (segmented) was reduced to 1,510 images, and the other two classes, lymphocyte (typical) and myeloblast, were reduced to 1,000 images per class. The training dataset was thereby reduced from 80% to 27% of the entire dataset. We also noticed that five classes, namely lymphocyte (atypical), promyelocyte (bilobed), metamyelocyte, monoblast, and smudge cell, have fewer than 30 images, and their testing datasets contain only 2 or 3 images, as shown in Table 1. These datasets are too small to evaluate these classes and obtain their actual performance.

Table 2. The hyperparameters of various YOLO models in model training

Table 4. Comparison with existing state-of-the-art models

Authors                              Model                               Overall accuracy (%)
Dasariraju et al. [10]               Random forest                       92.99
Rastogi et al. [12]                  LeuFeatx + extra trees classifier   96.15
[15]                                 Two-stage hybrid model              97.00
Bairaboina and Battula, 2023 [38]    Ghost-ResNeXt                       98.61
This study                           YOLOv4-tiny                         99.26