Pre-processing technique of Aquilaria species from Malaysia for four different qualities

ABSTRACT


INTRODUCTION
Agarwood oil is the essential oil extracted from Agarwood trees of the Aquilaria genus, which belongs to the Thymelaeaceae family. Resin is a type of volatile chemical substance that permeates the heartwood of Agarwood trees [1]-[4]. High resin formation results in good-quality Agarwood oil [2], [5]-[7]. Agarwood oil is one of the most useful oils, with major uses in perfumery, the medical industry, fragrances, and religious ceremonies [8]-[11]. This wide range of applications contributes to its popularity in the worldwide essential oil market, including Malaysia.
Traditionally, the quality of essential oils has been evaluated and graded manually by sensory evaluation based on physical properties [12]-[14]. According to human perception and experience, an essential oil of the highest quality has a lot of resin, a dark oil color, a powerful odor, and a long-lasting scent [4], [5], [10]. However, different people may form different impressions and reach different conclusions with this technique, so sensory evaluation is somewhat inaccurate. There is no guarantee that grading essential oils by human sensory evaluation ensures their purity or quality. Because the process is continuous and deals with a large number of samples at once, grading by trained human graders has a significant disadvantage in objectivity and consistency, and it is labor-intensive and time-consuming [12], [15]-[17]. Numerous approaches have been proposed and used to verify the quality of essential oils using intelligent methods [13]-[15], [18]-[20].
Using today's data analysis technology, Agarwood oil quality can be classified solely from its chemical substances, which allows essential oils to be assigned to their respective classes (low, medium low, medium high, and high quality). The accuracy of the result can be assessed statistically. The highlight of this paper is the boxplot (also called box-and-whisker plot) analysis of Agarwood oils, together with the labelling of outliers in the raw data. The abundances of the Agarwood oil chemical compounds are the input to the boxplot analysis. The objective of this study is to present the data distribution of the Agarwood oil chemical substances using boxplot analysis.

THEORETICAL WORK
An easy way to interpret this research is to use images or visuals that describe the results more precisely. Boxplot analysis is one of the visuals used in joint displays [21]-[24]. Visualization using a joint display can provide structure for comparing many input data between groups [25]. The boxplot has become an efficient, industry-standard tool for summarizing the smallest observation value, lowest quartile, median, highest quartile, greatest observation value, and outliers in one diagram [26]-[28].
Figure 1 shows the elements of a boxplot. The red '+' symbol represents an outlier, also known as an extreme value, which is located above or below the whiskers. The 25th percentile of the data is the lowest quartile (Q1), while the 75th percentile of the data is the highest quartile (Q3). The red line in the middle of the box indicates the sample median. The minimum and maximum are the limits of the range of values in the sample data [28]-[30].
Figure 1. The MATLAB overview of elements in boxplot [29]
Primarily, the shorter the whiskers, the more uniform the distribution of the data. The result will be accepted if 50% of the median is in the group [28], [31]. The whiskers are calculated from the interquartile range (IQR), as shown in (1)-(3) [31]. The IQR is obtained by subtracting the lowest quartile (Q1) from the highest quartile (Q3), and the minimum and maximum of the range of the dataset are given by (2) and (3):
$$\mathrm{IQR} = Q_3 - Q_1 \tag{1}$$
$$\mathrm{Minimum} = Q_1 - 1.5\,(\mathrm{IQR}) \tag{2}$$
$$\mathrm{Maximum} = Q_3 + 1.5\,(\mathrm{IQR}) \tag{3}$$
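The whisker limits in (1)-(3) can be sketched in a few lines of Python. This is only an illustrative sketch, not the MATLAB routine used in the study: the function name `boxplot_summary` is hypothetical, and the quartiles are computed with the standard-library `statistics.quantiles` using the "inclusive" method, which is an assumption and may differ slightly from the quartile convention in MATLAB's `boxplot`.

```python
from statistics import quantiles


def boxplot_summary(values):
    """Return quartiles, IQR, whisker limits, and outliers for one sample.

    Assumption: quartiles use the 'inclusive' interpolation method; other
    tools may use a slightly different convention.
    """
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1                  # equation (1)
    minimum = q1 - 1.5 * iqr       # equation (2): lower whisker limit
    maximum = q3 + 1.5 * iqr       # equation (3): upper whisker limit
    # Values beyond the whisker limits are labelled as outliers.
    outliers = [v for v in values if v < minimum or v > maximum]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "minimum": minimum,
        "maximum": maximum,
        "outliers": outliers,
    }


if __name__ == "__main__":
    # Illustrative numbers only, not measured abundances.
    print(boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 30]))
```

With the illustrative input above, the value 30 lies beyond the upper limit from (3) and is therefore reported as an outlier, which corresponds to a '+' marker in Figure 1.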

METHOD
This section describes the research steps in chronological order and the technique used for data acquisition.

Sample acquisition
The Agarwood oil samples were prepared by the Forest Research Institute Malaysia (FRIM) and the Bioaromatic Research Centre of Excellence (BARCE), Universiti Malaysia Pahang Al-Sultan Abdullah (UMPSA). The study focuses only on the Aquilaria malaccensis species, but with varying oil volume, origin, and age. FRIM and BARCE used gas chromatography-mass spectrometry (GC-MS) apparatus to obtain the chemical substances of the samples [9]. There are 660 samples covering four qualities of Agarwood oil: low, medium low, medium high, and high. The eleven important compounds tabulated in Table 1 were used as the input, and the grade of the Agarwood oil was used as the output of the classification system.

Pre-processing technique
The first step in creating the boxplot is sorting the data samples into four groups: low, medium low, medium high, and high quality. Two axes are involved: the x-axis lists the names of the eleven important compounds of Agarwood oil (independent variable), and the y-axis gives the abundance of each important compound in percent (dependent variable). The boxplot's performance was then evaluated. The flowchart of the experimental analysis in Figure 2 was followed to carry out the distribution analysis on the Agarwood oil samples. This paper focuses on the boxplot analysis of the four oil qualities.
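The sorting step can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the MATLAB workflow used in the study: the record layout (a grade label paired with a mapping from compound name to abundance in percent), the function name `group_abundances`, and the compound names `compound_a` and `compound_b` are all hypothetical placeholders.

```python
from collections import defaultdict

# The four quality groups named in the text.
GRADES = ("low", "medium low", "medium high", "high")


def group_abundances(samples):
    """Sort samples into the four grades, one list of abundances per compound.

    `samples` is an iterable of (grade, {compound_name: abundance_percent})
    pairs; this layout is an assumption made for illustration.
    """
    grouped = {grade: defaultdict(list) for grade in GRADES}
    for grade, abundances in samples:
        if grade not in grouped:
            raise ValueError(f"unknown grade: {grade!r}")
        for compound, value in abundances.items():
            grouped[grade][compound].append(value)
    return grouped


if __name__ == "__main__":
    # Placeholder values only, not measured data.
    demo = [
        ("high", {"compound_a": 0.9, "compound_b": 0.2}),
        ("low", {"compound_a": 0.1, "compound_b": 0.4}),
        ("high", {"compound_a": 0.7, "compound_b": 0.3}),
    ]
    grouped = group_abundances(demo)
    print(dict(grouped["high"]))
```

Each inner list then corresponds to one box: the compound name goes on the x-axis and the list of abundances supplies the y-axis values for that grade.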

RESULTS AND DISCUSSION
This section summarizes the results of the boxplot analysis that support the research objective. Principal component analysis (PCA) was applied to the 660 data samples and 11 chemical substances to determine the important compounds. These 11 chemical substances became the inputs to the boxplot. All samples were arranged in columns using MATLAB software.

Figure 3. The boxplot of Agarwood oil chemical substances for high grade
Descriptive statistics give more informative summaries than hundreds or thousands of "raw numbers" from previously collected data. The output is summarized by the mean, minimum, maximum, and standard deviation of the samples, as shown in Table 2. As can be seen in Table 2, the average percentage values for the top 7 chemical substances are around 0.01 to 0.99.
The comparison between the four qualities (high, medium high, medium low, and low) is shown in Figure 4. The highlight of this line graph is that 10-epi-γ-eudesmol and γ-eudesmol were found to be important chemical substances, with medians ranging between 0.4% and 0.8%. They were identified as important chemical substances because each of the four grades has its own median value for them, in contrast to the other compounds. The highest median, which is the 50th percentile of the data, belongs to β-agarofuran for the high grade, at 0.99%.
From the further analysis in Figure 5, it can be seen that the highest quartile (Q3) of 10-epi-γ-eudesmol and γ-eudesmol ranges between 0.4 and 0.81. For 10-epi-γ-eudesmol, the highest Q3 belongs to the high grade with 0.8102%, followed by the medium low, medium high, and low grades with 0.7578%, 0.5320%, and 0.4961%, respectively. For γ-eudesmol, the highest Q3 belongs to the medium low grade, followed by the high, low, and medium high grades, with values of 0.7764%, 0.7755%, 0.6549%, and 0.5833%, respectively. Lastly, the highest Q3 overall, which is the 75th percentile of the data, belongs to β-agarofuran for the high grade, at 0.99%. These two significant compounds (10-epi-γ-eudesmol and γ-eudesmol) were confirmed to be the most important compared with the other nine compounds.
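The descriptive summary and the per-grade comparison of medians and highest quartiles described in this section can be sketched in Python as below. This is an illustrative sketch only: the function names `describe` and `rank_grades` are hypothetical, the quartile uses the "inclusive" method of `statistics.quantiles` (an assumption that may differ from MATLAB's convention), and the numbers in the demo are placeholders rather than values from Table 2 or Figures 4 and 5.

```python
from statistics import mean, median, quantiles, stdev


def describe(values):
    """Mean, minimum, maximum, and sample standard deviation of one compound."""
    return {
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        "std": stdev(values),
    }


def rank_grades(abundance_by_grade, statistic="median"):
    """Order grades from the highest to the lowest median or Q3 of one compound."""

    def score(values):
        if statistic == "median":
            return median(values)
        if statistic == "q3":
            # Third cut point of the quartiles is the 75th percentile.
            return quantiles(values, n=4, method="inclusive")[2]
        raise ValueError(f"unsupported statistic: {statistic!r}")

    scores = {grade: score(vals) for grade, vals in abundance_by_grade.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Placeholder abundances for a single compound, not measured data.
    demo = {"low": [1, 2, 3], "high": [4, 5, 6]}
    print(describe(demo["high"]))
    print(rank_grades(demo, "median"))
    print(rank_grades(demo, "q3"))
```

Ranking one compound at a time in this way mirrors how the text lists the grades in descending order of Q3 for each of the two highlighted compounds.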

CONCLUSION
The research work in this paper has achieved its objective by analysing the boxplots of Agarwood oil from the Aquilaria species across four grades (low, medium low, medium high, and high). The boxplot method was chosen because it shows the shape of the distribution, the variability of the data, and its significant values. With its median and the upper and lower quartiles of the range, the boxplot method differentiates between the samples of the four grades of Agarwood essential oil. Moreover, the visualization graph is a suitable technique for describing the quality classification of the samples. The input is the abundance of eleven significant chemical compounds, which are γ-eudesmol, ar-curcumene, β-dihydro agarofuran, γ-cadinene, α-agarofuran, allo aromadendrene epoxide, valerianol, α-guaiene, 10-epi-γ-eudesmol, β-agarofuran, and dihydrocollumellarin, and the output is the grade of the Agarwood oil. Overall, the boxplot and graph identify 10-epi-γ-eudesmol and γ-eudesmol as important chemical substances for future analysis, with a suggested extension to five and six grades. The chemical compound β-agarofuran is also recommended for consideration in future work since it gives high quartile values for medium high quality.

Figure 2. Flowchart of experimental analysis

Figure 4. The median value of the Agarwood oil chemical substances for Aquilaria species

Figure 5. The highest quartile value of the Agarwood oil chemical substances

Table 1. Data samples based on quality classification