Detection roasting level of Lintong coffee beans by using euclidean distance

Received Dec 30, 2020 Revised Mar 30, 2021 Accepted Oct 27, 2021
Coffee roasting is the process by which raw coffee beans (green beans) are roasted until they reach a certain roast level. In general, roasted coffee beans are divided into three roast levels: light, medium, and dark. One way to determine the roast level is to observe the color change of the beans. However, it is very difficult to know the exact color of each roast level by eye, and this can be overcome by building automatic coffee roasting equipment. In this research, an automatic coffee roaster was built with a system that controls the roasting temperature and the stirring of the coffee beans. The tool can also monitor the change in color of the coffee beans during the roasting process. The implemented system detects color changes and classifies roasted coffee beans at the dark roast level using the euclidean distance algorithm. The euclidean distance provides a threshold that is used to classify the roast level. The system accuracy for predicting coffee bean color is 90% at the dark roast level and 80% overall.


INTRODUCTION
Coffee is one of the main commodities in the Indonesian plantation sector. Coffee plays an important role in the Indonesian economy as a source of income for coffee farmers, a source of foreign exchange, a producer of industrial raw materials, and a provider of employment through processing, marketing, and trade (export and import) activities [1]. Contributors to Indonesia's coffee exports from North Sumatra include Gayo, Mandheling, and Lintong Arabica coffee [2]. Producing high-quality coffee beans requires proper post-harvest handling: fermentation, washing, sorting, drying, and roasting. Coffee roasting is the process of roasting coffee beans without oil to produce beans with quality flavor and aroma. Roasting accounts for 30% of the aroma and taste of coffee beans [3].
In general, roasted coffee beans are divided into three roast levels: light, medium, and dark [3]. These three levels require different temperatures, times, and stirring, and the resulting aroma and taste also differ. In addition, the roasting temperature and the stirring of the coffee beans must be kept constant during roasting. The problem is that it is difficult to know the exact color of each roast level. One way to identify and classify coffee beans is manual visual inspection, which is tedious, time-consuming, and subjective [4]. To address this problem, we have designed an automatic coffee roaster that helps the user (roaster) roast coffee beans. The roaster controls the roasting temperature and the stirring of the coffee beans, and it can also monitor the color change of the beans as they roast. This paper covers the part of the system that monitors the color change of Lintong coffee beans during roasting. We focus on roasting coffee beans to the dark roast level: color change monitoring detects whether the color of the roasted beans has reached the dark roast level. The hardware used for monitoring is a webcam and a Raspberry Pi 3. On the software side, we use image processing and the euclidean distance algorithm, even though other algorithms, such as artificial neural networks (ANN), can perform better [5]-[9]. Image processing and the euclidean distance algorithm are implemented in the Python programming language.
Image processing is the process of transforming an input image into a new output image of better quality. We use image processing in our experiment to identify the roasting level of coffee beans based on their color. The quality of the input images depends on the experimental conditions because the scene is heterogeneous: not every point of the image receives the same quantity or quality of lighting [10]. We discuss this in the method section. Image processing is carried out here to improve image quality and to correct noise, which is signal error that reduces the quality of the image. Image processing is also used to determine the roasting level of the coffee by applying the euclidean distance algorithm. Our contribution in this research is the implementation of a lightweight image processing algorithm to detect the roasting level of coffee beans in a roaster machine.

RESEARCH METHOD

Euclidean distance algorithm
Euclidean distance is a method for calculating the similarity, or distance, between two points (vectors). We use this method because it is lightweight and its running time is better than that of k-nearest neighbour (KNN) and ANN [11]. Euclidean distance performs well and is already used in many applications, such as software development and robotics [12], [13]. Although its accuracy is slightly lower than that of ANN, it has advantages in ergonomic design and hardware requirements. The dimension of each point can be chosen according to need. The formula for euclidean distance is given in (1):

$$Ed = \sqrt{\sum_{k=1}^{n} \left(X_{i,k} - X_{j,k}\right)^{2}} \qquad (1)$$

where Ed is the euclidean distance, Xi is the reference point value, Xj is the sample point value, n is the point dimension, and k indexes the components of each point.

This algorithm uses points with 3 dimensions, which represent the RGB color geometry. In this research, red, green, and blue values could be extracted successfully from images of Lintong coffee beans at the three roast levels [14]. The 3-dimensional points are based on the R, G, and B values of the coffee bean images captured by the webcam during the roasting process. In the implementation, reference point and sample point values are determined. Reference points are points with fixed (constant) values that serve as a reference for the sample point. Four reference points are determined, namely the average RGB values of coffee beans at the light, medium, and dark roast levels and of green beans. The sample point is the point whose RGB value is obtained from an image of the roasting coffee beans captured in real time. The values of the sample point vary because the webcam captures an image of the roasting coffee beans once every 30 seconds. The 30-second interval is chosen so that the color change of the roasting coffee beans can be monitored more accurately and precisely.
The RGB value of the coffee bean image is then extracted and used as the sample point. The euclidean value (distance) from the sample point to each of the 4 reference points is calculated using the euclidean distance algorithm, and the smallest of the four values is identified. For example, if the user wants a dark roast, the euclidean value from the sample point to the dark reference point must be smaller than the other three values. When this condition is met, the webcam stops capturing images and the process is complete. If not, another image of the coffee beans is captured. The flowchart of the euclidean distance algorithm can be seen in Figure 1.
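
The comparison against the four reference points can be illustrated with a short Python sketch. This is a minimal sketch under stated assumptions, not the implementation used in this work: the reference RGB values below are hypothetical placeholders (the measured values are those of Table 1), and the names `REFERENCE_POINTS`, `euclidean_distance`, and `classify` are chosen here for illustration only.

```python
import math

# Hypothetical reference points (mean R, G, B); the measured values are in Table 1.
REFERENCE_POINTS = {
    "green bean": (150.0, 140.0, 110.0),
    "light": (135.0, 120.0, 100.0),
    "medium": (120.0, 110.0, 100.0),
    "dark": (110.0, 107.0, 101.0),
}


def euclidean_distance(reference, sample):
    """Euclidean distance between two points of equal dimension, as in (1)."""
    return math.sqrt(sum((r - s) ** 2 for r, s in zip(reference, sample)))


def classify(sample, references=REFERENCE_POINTS):
    """Return the nearest roast level and the distance to every reference point."""
    distances = {
        level: euclidean_distance(point, sample)
        for level, point in references.items()
    }
    nearest = min(distances, key=distances.get)
    return nearest, distances


# A hypothetical sample point taken from one captured image.
level, distances = classify((111.0, 108.0, 102.0))
print(level, round(distances[level], 2))
```

With these placeholder values the sample lies closest to the dark reference point, so the sketch reports the dark level; with the real reference points from Table 1 the outcome depends on the measured colors.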

Determining the reference point values for each roast level of coffee beans
To determine the reference point values, 100 images of coffee beans are collected at each roast level. It is customary to correct unwanted signals such as noise caused by lighting variability, and we use a median filter for noise removal in this experiment [15]. We also use a gaussian blur to reduce noise and perform masking [16]. After filtering, the RGB values are extracted from the 100 coffee bean images and their average is calculated. We use RGB values rather than HSV because we want to measure the distance in each channel [17]. The average RGB value obtained at each roast level is later used as the reference point. The 100 coffee bean images are captured in dynamic conditions, that is, while the coffee beans are being stirred. In this condition, images of the coffee beans are taken continuously during stirring at each roast level. The reference point value (RGB) for each roast level can be seen in Table 1.
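
The averaging step can be sketched as follows. This is an illustrative sketch only: it assumes each image has already been filtered and masked and is available as a list of (R, G, B) pixels, it averages the per-image mean colors (the text does not state whether averaging is per image or over all pixels), and it uses the population standard deviation, which is also an assumption. The function names are hypothetical.

```python
from statistics import mean, pstdev


def image_mean_rgb(pixels):
    """Mean (R, G, B) of one image given as a list of (R, G, B) pixels."""
    return tuple(mean(channel) for channel in zip(*pixels))


def reference_point(images):
    """Per-channel mean and standard deviation over the per-image mean colors."""
    per_image = [image_mean_rgb(pixels) for pixels in images]
    channels = list(zip(*per_image))
    average = tuple(mean(channel) for channel in channels)
    deviation = tuple(pstdev(channel) for channel in channels)
    return average, deviation


# Two tiny synthetic "images" stand in for the 100 captured images per roast level.
images = [
    [(100.0, 100.0, 100.0), (110.0, 110.0, 110.0)],
    [(120.0, 110.0, 100.0), (120.0, 110.0, 100.0)],
]
average, deviation = reference_point(images)
print(average, deviation)
```

In practice the pixel lists would come from the filtered webcam frames, and the resulting average would play the role of one row of Table 1.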

Determination of reference point values for each roast level of coffee beans
The range of euclidean values at each coffee bean roast level needs to be established. Color ranges are used to distinguish light, medium, and dark coffee beans, so good lighting is needed [18]. To obtain lighting of the same quantity and quality, we add a lamp at the top of the roaster (see the roaster design). Once the lighting is appropriate, the next aim is to find out which euclidean value gives good performance for each roast level. For example, suppose the user wants a light roast, has made the menu selection, and is waiting for the roasting process to finish. The webcam may capture an image that is classified as light roast even though the beans do not yet fully have a color that matches light roast [19], [20]. Some time later, the PC monitor shows that the euclidean distance algorithm has obtained a light roast classification within range, using the euclidean value from the sample point to the light reference point (the smallest value compared with the other three roast level values). This is why the range of euclidean values at each roast level has to be determined. With the range of euclidean values defined, the roasted coffee beans agree better with the reference data. The range is determined by adding the standard deviation to, and subtracting it from, the initial average RGB value. The euclidean value is then calculated from the initial average RGB value to each of these two points (average plus standard deviation and average minus standard deviation). Because this research focuses on roasting coffee beans to the dark roast level, the range of euclidean values for dark roast is determined. The average RGB values, standard deviations, and maximum and minimum RGB values of the dark roast reference point can be seen in Table 2.
From the two calculations, it can be seen that the euclidean value from the initial average RGB value to the maximum RGB value and to the minimum RGB value is the same, namely 9.75. This happens because subtracting the initial average RGB value from either the maximum or the minimum RGB value leaves only the standard deviation (with opposite signs), so the two euclidean values are equal. The value 9.75 is therefore set as the range of euclidean values for coffee beans at the dark roast level.
Later, if the euclidean value of coffee beans classified as dark roast by the euclidean distance algorithm is greater than 9.75, the roasting process continues until the algorithm obtains a euclidean value for dark roast below 9.75.
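
The value 9.75 can be reproduced from the dark roast standard deviations reported in this paper (5.24; 5.72; 5.91). The sketch below is illustrative: `DARK_MEAN` is a hypothetical placeholder (the measured average is in Table 2), and it is included only to show that the result does not depend on the average, because the average cancels in the subtraction.

```python
import math

DARK_STD = (5.24, 5.72, 5.91)        # dark roast standard deviations reported in the paper
DARK_MEAN = (100.0, 100.0, 100.0)    # hypothetical placeholder; measured values are in Table 2

# Maximum and minimum reference points: average plus/minus one standard deviation.
upper = tuple(m + s for m, s in zip(DARK_MEAN, DARK_STD))
lower = tuple(m - s for m, s in zip(DARK_MEAN, DARK_STD))

# Both distances reduce to sqrt(5.24^2 + 5.72^2 + 5.91^2).
d_upper = math.dist(DARK_MEAN, upper)
d_lower = math.dist(DARK_MEAN, lower)
print(round(d_upper, 2), round(d_lower, 2))  # 9.75 9.75
```

Since the distance equals the length of the standard deviation vector, the threshold for another roast level could be obtained the same way from that level's standard deviations.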

Roaster design and implementation of webcam to Raspberry Pi 3
The roaster design can be seen in Figure 2 and is described as follows:
− The device used to stir the coffee in the container is a DC motor with a blade. The DC motor is connected to the blade and drives it (we use a dynamic roaster with a blade rather than a static roaster) [21]. The components used to control the DC motor are a motor driver and a microcontroller, the Arduino Mega 2560.
− The equipment used to regulate the roasting temperature is a K type thermocouple, a heater, a relay, and a MAX6675 module. The thermocouple is attached to the side of the container to measure the roasting temperature. The relay switches the heater on and off in order to hold the heater at a fixed temperature. The MAX6675 module measures the voltage output of the thermocouple and sends it to the Arduino Mega 2560 microcontroller.
− The tools used to monitor the roasting are a webcam and a Raspberry Pi 3.
− The Raspberry Pi 3 used here is damaged: a low-voltage fault causes the output of some GPIO pins to be 0 V instead of 3.3 V. For this reason, an additional microcontroller, the Arduino Mega 2560, is used in this study.

The hardware of the color change detection system is implemented by connecting the webcam to one of the USB ports of the Raspberry Pi 3. The webcam used is a Logitech C525 with a resolution of 8 MP. The connection of the webcam to the Raspberry Pi 3 can be seen in Figure 3.

RESULTS AND DISCUSSION
This section presents the experimental results and their analysis.

Euclidean distance algorithm testing results in detecting color change of coffee beans at medium and dark roast level
The euclidean distance algorithm succeeded in detecting changes in the color of coffee beans at the medium and dark roast levels. Classifying coffee beans at these levels produces a large amount of data, so we display only part of the process [22]-[24]. The image of the roasting coffee beans is captured by the webcam (Figure 4), and its roast level is classified using the euclidean distance algorithm. During the roasting process, if the result has not reached the specified roast level, the process continues. A result of "continue" means that the RGB value obtained is still in the green bean state, or that the euclidean value for the dark roast level has not yet reached the specified range. In this case, the range of euclidean values at the dark roast level is 9.75. The webcam then captures a new image of the coffee beans once every 30 seconds [25]. A result of "wrong" means that the classification obtained by the euclidean distance algorithm does not match the selected roast level, so image capture must continue until the algorithm returns the specified roast level. The process flow can be seen in Figure 5. For example, when the selected roast level is dark, the roasting process passes through the green bean, light roast, and medium roast conditions. If the algorithm classifies the beans as medium or light, the result is "wrong" (see Table 3).
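
The "continue" and "wrong" outcomes described above can be sketched as a small decision function. This is a hedged illustration, not the program used in the experiment: the label "done" for the stop condition, the function name `roast_status`, and the simulated capture values are assumptions, and a distance exactly equal to 9.75 is treated here as within range.

```python
DARK_THRESHOLD = 9.75  # range of euclidean values for dark roast, from the text


def roast_status(predicted, distance_to_target, target="dark",
                 threshold=DARK_THRESHOLD):
    """Map one classification result to 'continue', 'wrong', or 'done'."""
    if predicted == "green bean":
        return "continue"   # beans are still in the green bean state
    if predicted != target:
        return "wrong"      # e.g. light or medium when dark was selected
    if distance_to_target > threshold:
        return "continue"   # target class, but outside the accepted range
    return "done"           # hypothetical label for the stop condition


# Simulated (predicted level, distance to the dark reference) pairs, one per capture.
captures = [("green bean", 60.0), ("light", 40.2), ("medium", 21.7),
            ("dark", 12.3), ("dark", 8.1)]

statuses = []
for predicted, distance in captures:
    status = roast_status(predicted, distance)
    statuses.append(status)
    if status == "done":
        break  # stop capturing; otherwise the next image is taken after 30 s

print(statuses)
```

In both the "continue" and the "wrong" case the capture loop keeps running; only the reported label differs.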

Accuracy of euclidean distance algorithm in detecting color change of coffee beans at medium and dark roast level
To determine the accuracy, or compatibility level, of the roasted coffee beans, the RGB value of the roasted beans is compared with the RGB value of the reference beans using the euclidean distance algorithm. We do this by taking 10 new images of medium and dark roasted coffee beans, captured while the beans are stirred, and checking whether the algorithm assigns the ten images to the corresponding roast level; the dark roast images are shown in Figure 6. When an image of coffee beans is fed into the algorithm, the RGB value of the image and the euclidean value of the sample point (the RGB value of the input image) with respect to each reference point (average RGB value) are displayed on the PC monitor. The smallest euclidean value is then identified and the classification result is displayed on the PC monitor. For example, if the euclidean value of the sample point with respect to the light reference point is the smallest, "Result = Light Coffee" is displayed. If the value with respect to the medium reference point is the smallest, "Result = Medium Coffee" is displayed, and likewise for dark and green bean. If an image of coffee beans at a given roast level, such as dark, is input into the program and the classification displayed is "Result = Dark Coffee", the prediction of the euclidean distance algorithm is true. If the classification does not match the original image of the coffee beans, the prediction is false. The results of testing the ten images of coffee beans at each roast level using the euclidean distance algorithm can be seen in Table 4.

Figure 6. Ten images of coffee beans that have been roasted at dark roast

Based on the data in Table 3, of the 10 images of coffee beans at the dark roast level captured by the webcam, 1 is predicted false and 9 are predicted true by the euclidean distance algorithm, so the accuracy of the algorithm in predicting dark roast coffee beans can be calculated as follows:

$$\text{Accuracy} = \frac{9}{9+1} \times 100\% = 90\%$$
Based on the above calculation, the accuracy of the euclidean distance algorithm in predicting coffee beans at the dark roast level is 90%. For the medium roast level, the accuracy is lower, at 70%, as can be seen in Table 5. Based on the data in Table 5, the medium roast level has 7 correct predictions and 3 wrong predictions, and the dark roast level has 9 correct predictions and 1 wrong prediction. The total accuracy of the euclidean distance algorithm is calculated as follows:

$$\text{Total accuracy} = \frac{7+9}{7+3+1+9} \times 100\% = \frac{16}{20} \times 100\% = 80\%$$

We therefore obtain 80% accuracy over the medium and dark roast levels.
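
The accuracy figures can be checked with a few lines of Python. The counts are those reported in the text (9 correct and 1 wrong for dark, 7 correct and 3 wrong for medium); the helper name `accuracy` is introduced here for illustration.

```python
def accuracy(correct, wrong):
    """Percentage of correct predictions."""
    return 100.0 * correct / (correct + wrong)


dark = accuracy(9, 1)             # dark roast: 9 correct, 1 wrong
medium = accuracy(7, 3)           # medium roast: 7 correct, 3 wrong
total = accuracy(7 + 9, 3 + 1)    # both levels combined

print(dark, medium, total)  # 90.0 70.0 80.0
```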

Compatibility level between the RGB value of coffee beans at medium and dark roast level with the RGB value of the referenced coffee beans
The RGB values of the 10 coffee bean images at the medium and dark roast levels can be seen in Table 4. The RGB values of the 10 captured images are averaged to see whether the RGB values of the coffee beans roasted in real time are similar to the RGB values of the medium and dark reference beans. Table 5 shows the average RGB value of the 10 images of coffee beans roasted to the medium and dark levels. The average RGB value of the roasted coffee beans is 121.64; 112.101; 100.09 at medium roast and 110.52; 109.82; 104.18 at dark roast. These averages are then compared with the average RGB values of the reference. The comparison is shown as a bar graph in Figure 7.
Based on the bar graph in Figure 7, the average RGB value of the coffee beans roasted to the dark level differs from the average RGB value of the reference. The differences in the red, green, and blue values between the roasted dark coffee beans and the dark reference are 2.92; 3.17; 2.96. These three differences can be compared with the standard deviations of the 100 coffee bean images processed for the reference data. The average RGB values and standard deviations of the reference coffee bean data at the dark roast level can be seen in Table 2.
In Table 6 it can be seen that the standard deviations of the red, green, and blue values of dark roast coffee beans are 5.24; 5.72; 5.91. From the data obtained in Figure 6, the differences in the red, green, and blue values between the roasted dark coffee beans and the dark reference are 2.92; 3.17; 2.96. These three differences are smaller than the standard deviations obtained from processing the 100 images of dark roast coffee beans, so it can be concluded that the average RGB value of the roasted dark roast coffee beans agrees with the average RGB value of the dark reference. In other words, the average RGB value of dark coffee beans roasted in real time lies within the range between the maximum RGB (MAX) and minimum RGB (MIN) of the reference data. The same comparison is applied to the medium roast level, where the differences in the red, green, and blue values between the roasted medium coffee beans and the medium reference are 1.8; 5.61; 8.68. Although the difference in red is still within range, the differences in green and blue for the medium level are quite large.
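
The dark roast comparison can be verified numerically from the values quoted in the text. The sketch below checks each channel difference against the corresponding standard deviation and, as an additional consistency check that is not part of the original analysis, computes the euclidean length of the difference vector and compares it with the 9.75 range. Variable and function names are illustrative.

```python
import math

DARK_STD = (5.24, 5.72, 5.91)    # reference standard deviations (R, G, B)
DARK_DIFF = (2.92, 3.17, 2.96)   # roasted vs. reference differences (R, G, B)
DARK_THRESHOLD = 9.75            # range of euclidean values for dark roast


def within_std(differences, deviations):
    """Per-channel check that each difference is no larger than the deviation."""
    return tuple(abs(d) <= s for d, s in zip(differences, deviations))


per_channel = within_std(DARK_DIFF, DARK_STD)
dark_distance = math.sqrt(sum(d * d for d in DARK_DIFF))

print(per_channel)               # (True, True, True)
print(round(dark_distance, 2))   # 5.23
print(dark_distance <= DARK_THRESHOLD)
```

The averaged dark roast sample therefore lies about 5.23 from the dark reference point, inside the 9.75 range. The same check for the medium level would need the medium standard deviations, which are not quoted in the text.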

CONCLUSION
Based on our experiments and testing, we conclude that a color change detection system for roasted coffee beans has been successfully designed using the euclidean distance algorithm. The euclidean distance algorithm succeeded in classifying roasted coffee beans at the dark roast level through their color change. In terms of accuracy, the algorithm performs well, predicting coffee bean color with 90% accuracy at the dark roast level and 80% overall. We also find that the average RGB value of the roasted dark roast coffee beans lies between the maximum and minimum RGB values of the reference, which means the result is valid when compared with the reference values of roasted coffee beans. We conclude that this system can robustly detect the dark roast level, but detection of the medium level still needs improvement. In future work, we will try a more sophisticated approach, using a deep learning algorithm for image processing together with improved hardware.