Next Article in Journal
Groundwater Recharge Prediction Using Linear Regression, Multi-Layer Perception Network, and Deep Learning
Previous Article in Journal
Optimization of the Multi-Start Strategy of a Direct-Search Algorithm for the Calibration of Rainfall–Runoff Models for Water-Resource Assessment
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Leaf Wetness Duration Models Using Advanced Machine Learning Algorithms: Application to Farms in Gyeonggi Province, South Korea

Applied Meteorology Research Division, National Institute of Meteorological Sciences, Seogwipo 63568, Korea
*
Author to whom correspondence should be addressed.
Water 2019, 11(9), 1878; https://doi.org/10.3390/w11091878
Submission received: 10 August 2019 / Revised: 2 September 2019 / Accepted: 4 September 2019 / Published: 10 September 2019
(This article belongs to the Section Water Resources Management, Policy and Governance)

Abstract

:
Leaf wetness duration (LWD) models have been proposed as an alternative to in situ LWD measurement, as they can predict leaf wetness using physical mechanism and empirical relationship with meteorological conditions. Applications of advanced machine learning (ML) algorithms in the development of empirical LWD model can lead to improvements in the LWD prediction. The current study developed LWD model using extreme learning machine, random forest method, and a deep neural network. Additionally, performances of these ML-based LWD models are evaluated and compared with existing models. Observed LWD and meteorological variable data are obtained from nine farms in South Korea. Temporal and geographical information were also used. Additionally, the priorities of the employed variables in the development of the ML-based LWD models were analyzed. As a result, the ML-based LWD models outperformed the existing models; the random forest led to the best performance for LWD prediction among the tested LWD models. Strengths of associations between input variables and leaf wetness were relative humidity, short wave radiation, air temperature, hour, latitude, longitude, and wind speed in descending order. Uses of the geographical and time information in development of LWD model can improve the performance of LWD model.

1. Introduction

Disease warning systems have been studied and developed to prevent and mitigate risks from diseases among crops. Such disease warning systems can be grouped into two categories: calendar-based systems and environment variable-based systems. While the former recommends chemical sprays based on fixed calendar dates or phenological stages, the latter recommends chemical sprays based on the level of disease risk estimated according to in situ environmental conditions [1,2]. The latter system may be a more efficient way for disease warning, as it can adaptively respond to the level of disease risk according to changes in environmental conditions. Relationships between leaf wetness (LW) and plant diseases have been studied in many literatures because risks of many bacterial and fungal diseases among a variety of crops have been associated with LW, particularly in conjunction with temperatures favorable to infection [3,4]. LW is defined as the presence of free water on the surface of a leaf [5]; it mainly comes from three sources: water intercepted by the leaf during a rainfall or fog event, overhead irrigation, or dew that is formed on the leaf. The duration of LW is termed as leaf wetness duration (LWD).
LWD is a key component of disease prediction models, as well as for decision support systems used for foliar fungal plant pathogens [6]. However, despite its importance in plant disease epidemiology, LWD is still considered a non-standard meteorological variable, unlike meteorological variables such as rainfall, temperature, or relative humidity (RH) [7]. Furthermore, there is no standard protocol for the measurement of LW; deployment protocols are not standardized and specifications of commercial LW sensors vary with shape, color, and coating [8,9,10]. Hence, LW or LWD data are frequently unavailable from official weather station networks. Additionally, available measurements are sometimes unreliable, as a result of the lack of standard protocols for maintenance and quality control. Accordingly, LW measuring instruments need to be operated with more consistent protocol, and LWD observations with more consistent quality control. For example, the LWD data measured at the same network operated by one organization or research laboratory should be able to be used to analyze epidemiology of plant disease or to develop disease warning system.
LWD models have been suggested as an alternative to in situ LWD measurements [11,12,13,14]. They can predict LW using a physical mechanism and the empirical relationship with meteorological conditions. For the physical model, the Penman-Monteith (PM) model has been widely used; it uses energy balance approach to predict LW [15]. The empirical models predict LW as mimicking the empirical or statistical relationships between LW and meteorological variables. Classification and regression tree/stepwise linear discriminant (CART), dew point depression, and number of hours of relative humidity (NHRH) have been broadly employed for the empirical LWD models [16,17,18]. Because the PM model requires net radiation data that it is an uncommon meteorological measurement, like the LWD, such net radiation data are often estimated from the empirical relation with other meteorological variables. Due to the use of net radiation estimates, the PM model often produces inaccurate LWD predictions [19]. Additionally, though empirical LWD models are relatively easy to handle, precision of LWD predictions with the empirical models vary with spatial and temporal characteristics of the data [20]. Thus, these models need proper calibration and evaluation to obtain accurate LWD prediction [21].
A few attempts have been conducted to use machine learning (ML) algorithms in the development of the empirical LWD model to improve LWD prediction performance [22,23,24,25,26]. Francl and Panigrahi [27] developed the LWD model using the artificial neural network (ANN). They reported that the performance of the ANN algorithm for the LWD model is comparable to the existing LWD models such as the PM, NHRH, and CART. Chtioui, et al. [28] used a generalized regression neural network (GRNN) to develop the LWD model. They found that the performance of the GRNN algorithm is superior to that of multiple linear regression for LWD prediction. Kim, et al. [29] developed the LWD model using a fuzzy logic system to incorporate energy balance principles and an empirical computation method. They reported that the fuzzy model classifies presence or absence of wetness more accurately than the CART model. Recently, various ML algorithms were proposed, and their performances outperform those of traditional ML algorithms for classification problems such as LW prediction in other fields [30,31,32]. Applications of advanced ML algorithms, e.g., random forest (RF), the extreme learning machine (ELM), the support vector machine and the gradient boosted model, in the development of the LWD model, can lead to improvements in the prediction performance of the LWD model. To the best of our knowledge, the advanced ML algorithms have not been tested in developing LWD models.
Therefore, the current study develops LWD models using advanced ML algorithms: ELM, RF, and deep neural network (DNN). Additionally, to investigate the applicability and suitability of the advanced ML algorithms to LWD prediction, performances of the LWD models developed by the tested ML algorithms are evaluated and compared with the existing models, such as the PM, CART, and NHRH. LWD and meteorological observations are then used to develop LWD models for nine farms in the Gyeonggi province, South Korea. Additionally, time and geographical information are used in the tested ML algorithms to account for the influence of temporal and spatial variation in LWD data. Subsequently, the priorities of the employed variables in development of the ML-based LWD model are analyzed by examining the importance of the variable in the RF. Results of the current study are expected to inform and provide a reference for the development of empirical LWD models using advanced ML algorithms, particularly the ELM, RF, and DNN, in other regions.

2. Data

Here, observed data of the LWD and meteorological variables such as air temperature (Tair), relative humidity (RH), short wave radiation, and wind speed (WS) are collected from sites operated by Gyeonggi-do Agricultural Research & Extension Services in the Rural Development Administration. Measuring instruments for Tair and RH are installed at a 1.5-m height above the ground. Measuring instruments for short wave radiation and WS are installed at a 2.0-m height above the ground. LW data was measured at each site using two adjacent flat plate sensors (Model 237, Campbell Scientific) deployed at 2.0-m and facing north at an angle of 45° to the horizontal [9]. A pair of LW sensors is used to assure wetness and dryness conditions. The sensors of meteorological variables, including Tair, RH, WS, short wave radiation, and LW, are installed at all the sites in the orchard and rice fields.
Weather measuring sites in nine farms are used. Their information and locations are presented in Table 1 and Figure 1, respectively. Temporal resolution of all used data is hourly. The observed meteorological and LW data can be downloaded from Gyeonggi-do Agricultural Research & Extension Services (https://nongupepi.gg.go.kr/). In the current study, LWD is defined as a sum of LW from 0 to 23 h. The observed data from 2009 to 2017 are used, and the observed data in 2013 are excluded in this study due to a large number of missing data.
Table 2 presents the means and standard deviations of the observed meteorological variables, LW, and LWD for the nine sites. The means and standard deviations of Tair and short-wave radiation is relatively consistent across the nine sites, whereas the means of RH, WS, and LW vary with the sites. The RH for sites 1 and 8 is smaller than for other sites; because sites 1 and 8 are respectively located in an urban area and near a highway, their locations may influence their RH. Means of WS for sites 1, 2, and 4 are larger than those for other sites. This may result from that these sites are located in coastal and near coastal regions. Means of LW ranges from 0.22 to 0.4 and its standard deviations ranges from 0.41 to 0.491. While the means of LW vary across the sites, the standard deviations are relatively consistent.
Figure 2 presents time series plots of LWD observations for four sites in 2017. As shown in Figure 2, large values of LWD are observed from the 121st to the 301st day. Fifty percent of annual precipitation falls during the summer rainy season (from June, the 152nd day to September, the 273rd day) in South Korea because of Changma (summer monsoons) and typhoons. Many rainfall events and high humidity lead to frequent dew formation in this season. Accordingly, the large values of LWD in this summer season result from the frequent presence of water on leaves from both rainfall events and high humidity.

3. Methods

3.1. Existing LWD Prediction Models

3.1.1. Classification and Regression Tree/Stepwise Linear Discriminant (CART)

Gleason, et al. [17] suggested the empirical LWD prediction model using CART. This model produces LW predictions using three hierarchical levels and two linear discriminants. Each hierarchical level has thresholds to determine the existence of dew on the leaves. Tair, RH, the dew point temperature (Tdew), and WS are all used in this model. The CART model assigns hourly data to one of four categories according to the threshold values of 3.7 °C for Tdew, 2.5 m/s for WS, and 87.8% for RH. In the first level, dryness is predicted when Tdew is equal to or greater than 3.7 °C; otherwise, the second level is considered. In the second level, wetness is predicted when inequality (1) is met and WS is less than 2.5 m/s; otherwise consideration moves on to the third level. Subsequently, wetness is predicted when Inequality (2) is met and RH is equal to or greater than 87.8%; otherwise, dryness is predicted. Inequalities (1) and (2) are formulated as follows:
( 1.6064 × T air + 0.0036 × ( T air ) 2 + 0.1531 × RH 0.4599 × WS × D 0.0035 × T air × RH ) > 14.4674
( 0.7921 × T air + 0.0046 × ( RH ) 2 2.3889 × WS 0.039 × T air × WS 0.1.0613 × WS × D ) > 37.0
where D is the difference between Tair and Tdew (D = Tair − Tdew). Since this model requires a smaller number of meteorological variables than the physical model, e.g., PM, it is easy to implement from a practical perspective.

3.1.2. Number of Hours of Relative Humidity (NHRH)

This model is one of the simplest models used to predict LWD. When the duration for which RH is higher than 90% exceeds a specific threshold, this model predicts an existence of dew on leaves. Sentelhas, et al. [18] reported that this model can provide prediction performance comparable with the PM model after correcting the LWD predictions based on local characteristics.

3.1.3. Penman-Monteith (PM)

The PM model has been applied to estimate latent heat flux (LE), which was used to determine the period of wetness on a leaf [11]. Since the PM model is a physical model, it can be applied in any location wherein the required meteorological observations are available. The main advantage of the PM model is that, unlike other physical models, it does not require air temperature measurements at the leaf level to estimate dew amount and duration. The LE for a mock leaf can be estimated for each interval of time using the PM approach [33]. The PM model is used as a representative of the physics-based LWD prediction model and uses a larger number of input variables than the CART and NHRH models. In particular, the predictive performance of the PM model is largely influenced by the precision of the net radiation (Rn) observation that is utilized. Sentelhas, et al. [15] reported that the PM model has low spatial variation in various meteorological conditions and provides a good performance with high applicability. The equation of LE is given in Equation (3):
LE   =   s R n + [ 1200 ( e s e a ) / ( r a + r b ) ] s + γ
where s, Rn, es, ea, γ , rb, and ra are the slope of the saturation vapor pressure curve (hPa), net radiation of the mock leaf (J/min/cm2), the saturated vapor pressure at the weather station air temperature (hPa), the actual air vapor pressure (hPa), the modified psychrometer constant (0.64 kPa/K is adopted in the current study), the boundary layer resistance for heat transfer, and the boundary layer resistance for WS, respectively. rb can be calculated by Equation (4):
r b   =   307 ( D WS ) 0.5
where D is the effective dimension of the mock leaf (0.07 m in this study). ra is calculated by Equation (4):
r a   =   ln [ ( Z s d ) / ( Z 0 ) ] 0.4 WS *
where Zs, d, Z0 and WS* are the height of the wetness sensor (m), the displacement height (0.5 Zc), the roughness length (0.13 Zc), and the friction velocity (m/s), respectively. Zc is the crop height. For LE estimates at the weather station where the wetness sensor is at the same level as the T, RH, and WS sensors, the boundary layer resistance for WS is not required in Equation (3). In the current study, Equation (3) is used to predict LE, since the levels from the Tair, RH, and WS sensors are different. When the LE is greater than zero, the LW is predicted at the specific time. Because the net radiation is not observed in the considered sites, it is estimated using the method suggested by Walter, et al. [34].

3.2. ML Algorithms

ELM, RF, and DNN for ML algorithms are used in the LWD prediction models. The input variables (known features) of the three models are Tair, RH, short wave radiation, WS, latitude, longitude, and hour on the 24-h-clock. Of these variables, the first four are meteorological variables. Because these variables have been used in the existing models, they can represent the meteorological influences on the LWD. The LWD can be altered depending on the location and time [20]. Geographical and time variables are used as input variables: latitude, longitude, and hour on the 24-h-clock. The training and test data sets are the same for the three ML algorithms. The data of all sites from 2009 to 2016, excluding 2013, are employed for the training data set. The data of all sites in 2017 are used for the test data sets. The occurrence of dew is predicted by the ML-based LW prediction models for every hour. The employed ML algorithms are briefly described in the following subsections.

3.2.1. Regularized Extreme Learning Machine

ELM is a single layer network with randomly generated weights and a bias between the input and hidden layers [35]. Unlike the traditional neural networks, which must iteratively optimize weights, the ELM model can be trained in a single iteration because of the randomized weight and bias. The ELM can be formulized by the following equation:
Y   =   H β
where Y , β , and H are the outputs, the weight matrix between hidden layer and the output layer, and the output vector of the hidden layer called nonlinear feature mapping, respectively. The outputs are calculated as follows:
H   =   f a ( X W + B )
where f a ( · ) , W , B , and x are the activation function, the weight matrix between the input layer and the hidden layer, the bias, and the inputs, respectively. Here, the sigmoid function ( f a ( x )   =   1 1 + exp ( x ) ) is used as the activation function in the ELM. Since the weights ( W ) and bias ( B ) are randomly generated and the activation function ( f a ( · ) ) is known in the ELM, H is comprised of deterministic variables from a data set. Thus, only β needs to be estimated for the ELM.
In the ELM, finding the appropriate weight set that avoids overfitting is important, as in an artificial neural network. Tuning weights in the EML can be considered as fitting a linear regression model using ordinary least square. Ridge regression was employed to attenuate multicollinearity in the data set by adding the norm of the parameters in the parameter estimation of the regression model [36]. In weight tuning, the ELM model also adapted this strategy. The ELM attempts to achieve a better generalization performance by attaining not only the smallest training error, but also the smallest norm of output weights. This minimization problem can take the form of ridge regression or regularized least squares as [37]:
min 1 2 β 2 + C 2 H β Y 2
where the first term of the objective function is the l2 norm regularization term which controls the complexity of the model. The second term is the training error associated with the learned model. C > 0 is a tuning parameter. The ELM gradient equation can be analytically solved, and the closed-form solution can be written as follows:
β ^   =   ( H T H + 1 C I ) 1 H T Y
where I is an identity matrix. The ELM model used herein is the regularized ELM model. A 10-fold cross validation is conducted to identify the optimal structure of the ELM model with the fixed tuning parameter (C = 0.5). The 10 folds are split with equal spans from the training data sets. Data of 9 folds are used to train the ELM model and data of the remaining fold that is not used for training is used to compute the root mean square error (RMSE) of the EML model between prediction and observation. Thus, 10 RMSEs are obtained from the 10-fold cross validation. The mean of the RMSE of the ELM model decreases as the number of nodes increases. The change in the RMSE mean is very small when the number of nodes is greater than 25. Twenty-five nodes would thus be the optimal number of nodes in the hidden layer for the ELM model of LWD prediction. The structure of the ELM model is presented in Figure 3.

3.2.2. Random Forest

The RF method has been widely applied in regression and forecasting problems [38,39,40]. The RF was proposed by Breiman [41] and uses a bagging (also referred to as bootstrapping in the field of statistics) to build a number of decision trees with controlled variance. Thus, the RF method consists of an ensemble of simple decision trees. Each decision tree in the RF is grown using randomly selected samples. Subsequently, the nodes in each tree use randomly selected features (called input variables). The RF has two major steps: (1) randomness and (2) ensemble learning.
The randomness in the RF comes from the random sampling of the entire data set and the selection of features with which every classification and regression tree is built. The data set is randomly sampled with replacement to create a subset with which to train one classification and regression tree. At each node, the optimal split rule is determined by using one of the randomly selected features from the employed features. Approximately two-thirds of the entire data set will be chosen as the training subset. Random selection without replacement is also performed on all features. The data set not included in the training subset is termed ‘out-of-bag’ and is applied to evaluate the performance of the RF model as well as the importance of the features.
The ensemble learning method in the RF means that all individual decision trees in a collection of decision trees (called the ensemble) contribute to the final prediction. A training subset is created after the random selection step. The classification and regression tree, without pruning, is used to construct a single decision tree. To grow K number of trees in an ensemble, this process (resampling a subset and training the individual tree) is repeated K times. The final predicted value comes from averaging the results of all individual trees. The ranger library in R package is used to construct a RF model in the current study [42]. The 10-fold cross validation is conducted to identify the optimal number of trees in the RF model. The RMSE mean of the RF decreases as the number of trees in the RF increases. The mean of the RMSE converges when the number of nodes is greater than 80. 80 may be the optimal number of trees for the RF model in this study this number is adopted for the number of trees in the RF model used to predict LWD.

3.2.3. Deep Neural Network

DNN is a multiple-hidden layer feed-forward network. The main difference between single- and multiple-hidden layer feed-forward networks is the number of hidden layers. A DNN has more than one hidden layer. The deep hidden layer of the DNN allows the emulation of a complex function relation between inputs and outputs by extracting the complicated feature structure [31]. In the current study, rectified linear unit (ReLU) function is used for the activation function of the DNN model. This application of the ReLU function provides a satisfactory performance in the neural network model with deep layers. For the employed DNN model, three hidden layers are adopted; each hidden layer has eighteen nodes. The structure of the DNN is manually optimized because there is no standard method to identify the optimal structure of DNN model. The structure of the employed DNN model is presented in Figure 4.
To train the DNN model, the stochastic gradient descent method with momentum and an adaptive learning rate (known as the decaying rate) is used. Pre-training is conducted to avoid wrong training and reduce computation time for training. An auto-encoder is applied for the pre-training of the DNN model. The initial weights, which are obtained from the auto-encoder, are adopted in the fine tuning of the DNN model. To implement the DNN model and train the DNN algorithm, the H2O library in R package is used.

3.3. Evaluation Measures

As evaluation criteria, the RMSE, Pearson correlation (Cor), means absolute error (MAE), and mean bias (mBias) are employed. The RMSE measures the accuracy of a prediction compared to an observation based on the variance of prediction and the square of bias between the observation and prediction. Since the RMSE can simultaneously measure both the precision and bias of a prediction, the RMSE has often been employed to evaluate the performance of prediction models. A lower value of the RMSE indicates a better performance of the model. Equation (10) presents the equation of the RMSE:
RMSE   =   1 n i = 1 n ( E i O i ) 2
where E i , O i , and n are the ith LWD prediction, the ith LWD observation, and the number of data, respectively. The Pearson correlation can be calculated as in the following equation:
Cor   =   i = 1 n ( E i E ¯ ) ( O i O ¯ ) i = 1 n ( E i E ¯ ) 2 i = 1 n ( O i O ¯ ) 2
where, E ¯ and O ¯ are the means of LWD prediction and observation, respectively. The MAE is a mean of absolute deviations between predictions and observations. The MAE provides information about average model-performance error. Equations of the MAE and mBias are given by Equations (12) and (13), respectively.
MAE   =   1 n i = 1 n | E i O i |
mBias   =   1 n i = 1 n ( E i O i )

4. Results

The quantile-quantile density plots of six LWD prediction models for all stations are presented in Figure 5. Since the PM method can predict LWD without null WS (= 0 m/s), the null WSs are replaced by a very small WS (= 0.1 m/s) for LWD computation in the PM model. For other models, null WS data are used in LWD calculation. Approximately 70% of data are that have the value of LWD is zero. Since too many data have zero values, the quantile-quantile density plot may inappropriately represent the performances of LWD prediction models for non-zero LWD data. Hence, the correct predictions for a zero value of LWD are excluded in Figure 5. All employed models show good performances for LWD prediction, particularly for large values of LWD (larger than 6 h). High density is observed in the diagonal lines for the ML-based models. Based on this density, the ML-based models perform better models than the existing models. Particularly, the RF and DNN models outperform the other models including the ELM, CART, NHRH, and PM models. In the correlation calculation, all data including those with a zero value of LWD are used, while correlation of the PM model is estimated using the data set with zero values for LWD data and replacement of null wind speed. Correlation estimates of the ELM, RF, and DNN models are 0.776, 0.811, and 0.798, respectively. Subsequently, the CART, NHRH, and PM models provide 0.661, 0.725, and 0.656, respectively. The ML-based prediction models thus provide larger correlations than the existing models. Based on correlation estimates, the RF is the best model for LWD prediction. The difference between performances of the RF and DNN models is very small.
To assess the performances of the employed models, the evaluation measures including Cor, RMSE, mBias, and MAE for all data are computed and presented in Table 3. In calculating these evaluation measures, the data set with zero values for LWD and replacement of the null WS are used for the PM, whereas the data set with zero values for LWD and the null WS are used for the other models. Based on the results of the evaluation measures for all data, the ML-based models lead to better performances than the existing models. The RMSEs of the ML-based models are less than 4.4 h while those of the existing model are larger than 4.6 h. While the MAEs of the existing model are larger than 3.0 h, those of the ML-based models are less than 2.9 h. The RF model provides the smallest RMSE and MAE; the RMSE and MAE of the RF are approximately 3.6 and 2.3 h, respectively. While the ELM, DNN, and PM models have large positive values of mBiases, the magnitude of mBiases for RF, CART, and NHRH are small.
The results presented in Figure 5 may improperly represent the prediction performance of the PM model because of the replacement of the null WS. The null WS can be a very small WS that is unable to be measured by installed instruments. Hence, the LWD prediction excluding the null WS data are used to assess the prediction performance of the PM model and for comparisons with the other LWD prediction models. Figure 6 presents the quantile-quantile density plots of six LWD prediction models for all stations excluding the correct predictions for zero values of the LWD and data having a null WS. When the WS is small, the dew on leaves accumulates and remains more easily. Hence, high density for large values of LWD observations disappear. The correlation estimates of all employed models become larger than those presented in Figure 5. Particularly, the correlation estimate of the PM model largely increases compared to the increments of other models. The correlation estimate of the PM model is still smaller than that of the ML-based models and NHRH model. Even if the data satisfying the condition of the PM model are used in evaluations of the model’s performance, the PM model is unable to outperform to the other models.
Figure 7 presents the RMSEs of five LWD models for all employed stations with replacement of the null WS. The data sets used in Figure 7 are identical to the data used in Table 3. Overall, the ML-based models provide smaller RMSEs compared to the existing models. For the employed sites, excluding sites 1, 2, and 8, all ML-based models lead to smaller RMSEs than the existing models. Thus, the smallest RMSEs are observed in the ML-based models for all sites. The RF leads to the smallest RMSE for sites 1, 2, 4, 5, 6, 8, and 9. For sites 3 and 7, the DNN provides the smallest RMSE. Overall, the results of RMSE comparisons suggests that the RF model is best-suited to LWD predictions. Among the existing models, the NHRH model shows the best performance. Based on the results of the RMSE, the CART model leads to the worst performance among the employed LWD model based on the results of RMSE.
The mBiases of the six LWD models for all employed stations are presented in Figure 8. The used data sets are the same as those used in Table 3. Overall, the ELM, RF, and PM models show positive values of biases. These models overestimate the LWD. These overestimations can be found at small values of LWD observations (from 0 to 3 h) in Figure 5. The CART and NHRH models provide negative values of biases for all sites excluding sites 4, 8, and 9; overall, these models underestimate the LWD. These underestimations can be found from 3- to 6-h LWD observations. For sites 4, 8, and 9, all employed models overestimate LWD predictions.
Figure 9 presents the MAEs of six LWD models for all employed stations. The used data sets are the same as those used in Table 3. Overall, results of the MAE are similar to the results of RMSE presented in Figure 7. The ML-based models lead to smaller MAE than the existing models for sites 3, 5, 6, 7, 8 and 9. Although the existing models have comparable performances to the ML-based models for sites 1, 2, and 4, the smallest MAEs are given by the RF that is one of the ML-based model. The best LWD prediction model is the RF model based on MAE. The RF model leads to the smallest MAEs for the employed sites except for sites 3 and 7. Values of the smallest MAEs for each site range from 1.5 to 3.2 h. The used LWD models provide good prediction performances for site 6. Overall values of MAEs for site 9 are larger than for other sites.
To investigate the detailed performance of the LWD prediction models, the times series of observed and predicted LWDs are compared. Time series plots of the LWD observations and LWD predictions by the RF, DNN, and NHRH methods for sites 1 and 4 in 2017 are presented in Figure 10. The three presented models successfully predict the LWD for the two sites. They also successfully determine dry days. For site 1 (Figure 10a), the NHRH occasionally largely overestimates the LWD when large values of LWD are observed. The RF and DNN models have similar performances based on a comparison by visual inspection. Overall, the magnitudes of LWD prediction by the DNN are slightly larger than those by RF.

5. Discussion

The size of the test data is more than three thousand (3285 data points). This number is limited to evaluate the performance of the model for reproducing long-term oscillation and trends. However, this number is sufficient to assess the performance of the model for their portability. Based on the results in this study, the ML-based models outperform existing models, such as the CART, NHRH, and PM models, for LWD prediction. Though the worst model of the ML-based models is the ELM model, its performance is better or equivalent to the existing models. The existing models are advantageous in that they can be applied for LWD predictions in any region. The PM model can predict LWDs without consideration of local characteristics when the required information is available. Additionally, many literatures reported that the performances of the CART and NHRH models are good enough for LWD prediction in many regions without any modifications [21,43,44,45]. On the other hand, the ML-based models can be used to predict LWD in the region of interest where the observed data are used to train the ML algorithms. In the current study, the ML-based models for LWD prediction are built using local data from farms in Gyeonggi province, South Korea. Additionally, the geographical information, i.e., latitude and longitude, are used for input variables of the ML algorithms to improve the predictive performance of the ML-based models. Inclusion of this local and geographical information lead to improvements in the LWD prediction performances of the ML-based models. The performances of the ML-based models, built in the current study, can valid with LWD predictions in Gyeonggi province, South Korea. To use the current ML-based models in other regions or other countries, the ML algorithms should be newly trained using the data in the regions of interest. The predictive performances of the existing models can be improved by adjusting parameters in the models based on the data in the regions of interest. Wang, et al. [24] reported that the performances of CART and NHRH models can be improved by adjusting parameters using local data. Because the ML-based models employed in this study can inherently reflect local and geographical characteristics in the LWD predictions, the performances of the ML-based models are superior to those of the existing models for the LWD data used here.
In the current study, the RF model is considered the best LWD prediction model in Gyeonggi province, South Korea. Based on evaluation results for all data, the RF model demonstrates the best performances. Based on evaluation results of individual sites, the RF model provides the best prediction performances. The RF model returns the smallest RMSE and MAE for seven sites. After the RF model, the DNN model gives the smallest RMSE and MAE. The CART model has proved its applicability and performance, with regard to the LWD predictions, across many literatures. Because the RF algorithm is an extension of the CART algorithm, the RF model should be able to provide more precise predictions than the CART algorithm with the same data set. When the input variables for the CART model, including Tair, RH, Tdew, and WS, are a part of the input variables for the RF model, performance of the RF model would be better than that of the CART model for LWD prediction. Therefore, among the ML algorithms, the RF model is a good candidate for implementation in LWD prediction models.
As mentioned in Section 3.2.2, the RF model can assess the importance of variables for target variable prediction. In the current study, permutation importance (also known as the mean decrease in accuracy) is used to assess the importance of each variable to LW by removing the association between that variable and the target. In permutation importance, the importance is calculated by randomly permuting the variable and measuring the resulting increase in error. Thus, the variable importance in the RF model indicates the mean of increments between mean square errors with and without the selected variable. Although the results of variable importance from the RF model may not represent the true associations between input variables and LW because of a lack of in-depth analysis, they can roughly show their associations.
The variable importance of the employed RF model is presented in Figure 11. In descending order, magnitudes of each variable importance are RH, short wave radiation, Tair, hour, latitude, longitude, and WS. The first three variables in the order are meteorological variables. The meteorological information is the most important in explaining the LW phenomena. From this result, it can be inferred that the physical mechanism to produce LW and the relations between LW and meteorological variables may be consistent across the regions of interest in the current study. If they are not consistent, importance of the geographical information would be larger than those of the three meteorological variables. The most and least important variables are RH and WS, respectively. The RH is obviously an important variable in the LWD prediction as it is used as an input variable in all existing models. Furthermore, the NHRH, that is, the best model among the existing models considered in this study, uses only RH in its LWD prediction.
The short-wave radiation is not the common variables measured at weather stations. The portability of the model may be limited due to availability of short wave radiation observation. The short-wave radiation can be estimated by remote-sensing (satellite) and empirical equations. Although the use of short wave radiation estimate can worsen the performance of the model, the model with short wave radiation may lead to a better performance than the model without short wave radiation because it was considered as the second most critical variable among the tested input variables. In a further study, the difference between uses of short wave radiation observation and estimates should be examined for improving our understanding of the LW prediction. In addition, the model with variables having high availability should be developed for LW prediction.
Based on the presented results of variable importance, the hour variable, which represents the diurnal cycle of LW, is more important than the geographical information. Since meteorological variables such as Tair and short-wave radiation have a diurnal cycle, they can represent the diurnal cycle. In fact, they may not represent a diurnal cycle because of the seasonal cycle of meteorological variables. Therefore, the importance of the time variable may be larger than that of the geographical information.
Performance of the model without latitude and longitude was assessed and compared with the proposed model. The results were not presented in this study. As a result, uses of latitude and longitude for the input variable improved the performance of the model. This result indicates that the use of latitude and longitude in the model may allow that the model considers spatial dependence between LW and errors in the model. Considering the spatial dependence of the LW and errors would lead to the improvement in the performance of the model. The further study should investigate the influences of the spatial dependence in the LW and the errors on the model and compare with influences of other variables. In addition, the proposed model can be used for the stations in South Korea because of consideration of the spatial dependence. For other countries, the model should be developed using the data measured at areas of interests.

6. Conclusions

LWD prediction models are developed using advanced machine learning algorithms including ELM, RF, and DNN for LWD data from farms in Gyeonggi province, South Korea. Performances of the ML-based models are compared with those of existing models such as the NHRH, CART, and PM models to examine their applicability and suitability in LWD prediction. The following conclusions are ascertained:
(1)
The ML-based models outperform the existing models in LWD prediction for farms in Gyeonggi province, South Korea. The ML-based models provide better performances than the existing models based on all employed evaluation measures. Additionally, the worst model among the ML-based models has a comparable performance in LWD prediction to the existing models. Its performance is better than or equivalent to those of the existing models based on all employed evaluation measures.
(2)
Among the employed models, the RF model is the best for LWD based on the results of evaluation measures for all data. Additionally, it leads to the best performance for seven among nine sites. Although the DNN demonstrates the best performance for two sites, performance of the RF model would be nearly comparable to that of the DNN. Thus, for ML algorithms used in LWD prediction models, the RF algorithm is a good candidate.
(3)
The strengths of associations between input variables and LW are, in descending order, RH, short wave radiation, Tair, hour, latitude, longitude, and WS based on the results of variable importance from the RF model. The first three variables in this ordered list are meteorological variables; the meteorological information is the most important in explaining LW phenomena. The hour variable representing the diurnal cycle is more important than geographical information and WS data.
In the current study, all observed data are used to build the LWD prediction models using ML algorithms. The mechanisms producing LW may be different in rainy and dry days. Hence, to improve the predictive performance of the LWD prediction model using a ML algorithm, the LWD prediction models should be built separately for rainy and dry days. Additionally, though the RF leads to the best performance among the employed ML algorithms in this study, the best algorithm may vary across different regions of interest. Because performance of ML algorithms strongly relies upon used data, appropriateness of the ML algorithms for LWD prediction models should be assessed with LWD data in other regions or countries. This would allow a more in-depth investigation of the suitability of ML algorithms to LWD prediction.

Author Contributions

Conceptualization, K.R.K. and J.-Y.S.; methodology, J.P. and J.-Y.S.; software, J.P.; validation, J.P. and J.-Y.S.; formal analysis, J.P.; investigation, K.R.K. and J.-Y.S.; resources, K.R.K. and J.P.; data curation, J.P.; writing—original draft preparation, J.-Y.S.; writing—review and editing, K.R.K.; visualization, J.P.; supervision, K.R.K.; project administration, K.R.K. and J.-C.H.; funding acquisition, J.-C.H.

Funding

This work was funded by the Korea Meteorological Administration Research and Development Program “Advanced Research on Biometeorology and Industrial Meteorology” under Grant (1365003004).

Acknowledgments

This work was funded by the Korea Meteorological Administration Research and Development Program “Advanced Research on Biometeorology and Industrial Meteorology” under Grant (1365003004).

Conflicts of Interest

Declare conflicts of interest or state “The authors declare no conflict of interest”.

References

  1. Gleason, M.L.; Ricker, M.D.; MacNab, A.A.; East, D.A.; Pitblado, R.E.; Latin, R.X. Disease-Warning Systems for Processing Tomatoes in Eastern North America: Are We There Yet? Plant Dis. 1995, 79, 113–121. [Google Scholar] [CrossRef]
  2. Gillespie, T.J.; Sentelhas, P.C. Agrometeorology and plant disease management: A happy marriage. Sci. Agric. 2008, 65, 71–75. [Google Scholar] [CrossRef]
  3. Huber, L.; Gillespie, T.J. Modeling Leaf Wetness in Relation to Plant Disease Epidemiology. Annu. Rev. Phytopathol. 1992, 30, 553–577. [Google Scholar] [CrossRef]
  4. Schmitz, H.F.; Grant, R.H. Precipitation and dew in a soybean canopy: Spatial variations in leaf wetness and implications for Phakopsora pachyrhizi infection. Agric. For. Meteorol. 2009, 149, 1621–1627. [Google Scholar] [CrossRef]
  5. Magarey, R.D.; Russo, J.M.; Seem, R.C.; Gadoury, D.M. Surface wetness duration under controlled environmental conditions. Agric. For. Meteorol. 2005, 128, 111–122. [Google Scholar] [CrossRef]
  6. Rowlandson, T.; Gleason, M.; Sentelhas, P.; Gillespie, T.; Thomas, C.; Hornbuckle, B. Reconsidering Leaf Wetness Duration Determination for Plant Disease Management. Plant Dis. 2015, 99, 310–319. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  7. Gleason, M.L.; Duttweiler, K.B.; Batzer, J.C.; Taylor, S.E.; Sentelhas, P.C.; Monteiro, J.E.B.A.; Gillespie, T.J. Obtaining weather data for input to crop disease-warning systems: Leaf wetness duration as a case study. Sci. Agric. 2008, 65, 76–87. [Google Scholar] [CrossRef]
  8. Miranda, R.A.C.; Davies, T.D.; Cornell, S.E. A laboratory assessment of wetness sensors for leaf, fruit and trunk surfaces. Agric. For. Meteorol. 2000, 102, 263–274. [Google Scholar] [CrossRef]
  9. Sentelhas, P.C.; Gillespie, T.J.; Gleason, M.L.; Monteiro, J.E.B.A.; Helland, S.T. Operational exposure of leaf wetness sensors. Agric. For. Meteorol. 2004, 126, 59–72. [Google Scholar] [CrossRef]
  10. Sentelhas, P.C.; Gillespie, T.J.; Santos, E.A. Leaf wetness duration measurement: Comparison of cylindrical and flat plate sensors under different field conditions. Int. J. Biometeorol. 2007, 51, 265–273. [Google Scholar] [CrossRef]
  11. Rao, P.S.; Gillespie, T.J.; Schaafsma, A.W. Estimating wetness duration on maize ears from meteorological observations. Can. J. Soil Sci. 1998, 78, 149–154. [Google Scholar] [CrossRef]
  12. Kim, K.S.; Taylor, S.E.; Gleason, M.L.; Koehler, K.J. Model to Enhance Site-Specific Estimation of Leaf Wetness Duration. Plant Dis. 2002, 86, 179–185. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  13. Papastamati, K.; McCartney, H.A.; Bosch, F.V.D. Modelling leaf wetness duration during the rosette stage of oilseed rape. Agric. For. Meteorol. 2004, 123, 69–78. [Google Scholar] [CrossRef] [Green Version]
  14. Leca, A.; Parisi, L.; Lacointe, A.; Saudreau, M. Comparison of Penman–Monteith and non-linear energy balance approaches for estimating leaf wetness duration and apple scab infection. Agric. For. Meteorol. 2011, 151, 1158–1162. [Google Scholar] [CrossRef]
  15. Sentelhas, P.C.; Gillespie, T.J.; Gleason, M.L.; Monteiro, J.E.B.M.; Pezzopane, J.R.M.; Pedro, M.J. Evaluation of a Penman–Monteith approach to provide “reference” and crop canopy leaf wetness duration estimates. Agric. For. Meteorol. 2006, 141, 105–117. [Google Scholar] [CrossRef]
  16. Gillespie, T.J.; Srivastava, B.; Pitblado, R.E. Using Operational Weather Data to Schedule Fungicide Sprays on Tomatoes in Southern Ontario, Canada. J. Appl. Meteorol. Climatol. 1993, 32, 567–573. [Google Scholar] [CrossRef] [Green Version]
  17. Gleason, M.L.; Taylor, S.E.; Loughin, T.M.; Koehler, K.J. Development and validation of an empirical model to estimate the duration of dew periods. Plant Dis. 1994, 78, 1011–1016. [Google Scholar] [CrossRef]
  18. Sentelhas, P.C.; Dalla Marta, A.; Orlandini, S.; Santos, E.A.; Gillespie, T.J.; Gleason, M.L. Suitability of relative humidity as an estimator of leaf wetness duration. Agric. For. Meteorol. 2008, 148, 392–400. [Google Scholar] [CrossRef]
  19. Montone, V.O.; Fraisse, C.W.; Peres, N.A.; Sentelhas, P.C.; Gleason, M.; Ellis, M.; Schnabel, G. Evaluation of leaf wetness duration models for operational use in strawberry disease-warning systems in four US states. Int. J. Biometeorol. 2016, 60, 1761–1774. [Google Scholar] [CrossRef]
  20. Alvares, C.A.; de Mattos, E.M.; Sentelhas, P.C.; Miranda, A.C.; Stape, J.L. Modeling temporal and spatial variability of leaf wetness duration in Brazil. Theor. Appl. Climatol. 2014, 120, 455–467. [Google Scholar] [CrossRef]
  21. Mashonjowa, E.; Ronsse, F.; Mubvuma, M.; Milford, J.R.; Pieters, J.G. Estimation of leaf wetness duration for greenhouse roses using a dynamic greenhouse climate model in Zimbabwe. Comput. Electron. Agric. 2013, 95, 70–81. [Google Scholar] [CrossRef]
  22. Kim, K.S.; Taylor, S.E.; Gleason, M.L.; Villalobos, R.; Arauz, L.F. Estimation of leaf wetness duration using empirical models in northwestern Costa Rica. Agric. For. Meteorol. 2005, 129, 53–67. [Google Scholar] [CrossRef]
  23. Stella, A.; Caliendo, G.; Melgani, F.; Goller, R.; Barazzuol, M.; La Porta, N. Leaf Wetness Evaluation Using Artificial Neural Network for Improving Apple Scab Fight. Environments 2017, 4, 42. [Google Scholar] [CrossRef]
  24. Wang, H.; Sanchez-Molina, J.; Li, M.; Rodríguez Díaz, F. Improving the Performance of Vegetable Leaf Wetness Duration Models in Greenhouses Using Decision Tree Learning. Water 2019, 11, 158. [Google Scholar] [CrossRef]
  25. Park, J.S.; Seo, Y.A.; Kim, K.R.; Ha, J.C. Evaluating the prediction models of leaf wetness duration for citrus orchards in Jeju, South Korea. Korean J. Argric. For. Meteorol. 2018, 20, 262–276. [Google Scholar] [CrossRef]
  26. Marta, A.D.; De Vincenzi, M.; Dietrich, S.; Orlandini, S. Neural network for the estimation of leaf wetness duration: Application to a Plasmopara viticola infection forecasting. Phys. Chem. Earth 2005, 30, 91–96. [Google Scholar] [CrossRef]
  27. Francl, L.J.; Panigrahi, S. Artificial neural network models of wheat leaf wetness. Agric. For. Meteorol. 1997, 88, 57–65. [Google Scholar] [CrossRef]
  28. Chtioui, Y.; Panigrahi, S.; Francl, L. A generalized regression neural network and its application for leaf wetness prediction to forecast plant disease. Chemom. Intell. Lab. Syst. 1999, 48, 47–58. [Google Scholar] [CrossRef]
  29. Kim, K.S.; Taylor, S.E.; Gleason, M.L. Development and validation of a leaf wetness duration model using a fuzzy logic system. Agric. For. Meteorol. 2004, 127, 53–64. [Google Scholar] [CrossRef]
  30. Fan, J.; Yue, W.; Wu, L.; Zhang, F.; Cai, H.; Wang, X.; Lu, X.; Xiang, Y. Evaluation of SVM, ELM and four tree-based ensemble models for predicting daily reference evapotranspiration using limited meteorological data in different climates of China. Agric. For. Meteorol. 2018, 263, 225–241. [Google Scholar] [CrossRef]
  31. Tao, Y.; Gao, X.; Hsu, K.; Sorooshian, S.; Ihler, A. A Deep Neural Network Modeling Framework to Reduce Bias in Satellite Precipitation Products. J. Hydrometeorol. 2016, 17, 931–945. [Google Scholar] [CrossRef]
  32. Lu, X.; Ju, Y.; Wu, L.; Fan, J.; Zhang, F.; Li, Z. Daily pan evaporation modeling from local and cross-station data using three tree-basedmachine learning models. J. Hydrol. 2018, 566, 668–684. [Google Scholar] [CrossRef]
  33. Monteith, J.L.; Unsworth, M.H. Principles of Environmental Physics. In Principles of Environmental Physics, 4th ed.; Monteith, J.L., Unsworth, M.H., Eds.; Academic Press: Boston, MA, USA, 2013. [Google Scholar]
  34. Walter, I.A.; Allen, R.G.; Elliott, R.; Jensen, M.E.; Itenfisu, D.; Mecham, B.; Howel, T.A.; Snyder, R.; Brown, P.; Echings, S.; et al. ASCE’s Standardized Reference Evapotranspiration Equation. In Watershed Management and Operations Management 2000; ASCE: Reston, VA, USA, 2000; pp. 1–11. [Google Scholar]
  35. Huang, G.B.; Zhu, Q.Y.; Siew, C.K. Extreme learning machine: Theory and applications. Neurocomputing 2006, 70, 489–501. [Google Scholar] [CrossRef]
  36. Peña, M.; Dool, H.V.D. Consolidation of Multimodel Forecasts by Ridge Regression: Application to Pacific Sea Surface Temperature. J. Clim. 2008, 21, 6521–6538. [Google Scholar] [CrossRef] [Green Version]
  37. Huang, G.; Zhou, H.; Ding, X.; Zhang, R. Extreme Learning Machine for Regression and Multiclass Classification. IEEE Trans. Syst. Man Cybern. B Cybern. 2012, 42, 513–529. [Google Scholar] [CrossRef]
  38. Yang, T.; Asanjan, A.A.; Welles, E.; Gao, X.; Sorooshian, S.; Liu, X. Developing reservoir monthly inflow forecasts using Artificial Intelligence and Climate Phenomenon Information. Water Resour. Res. 2017, 53, 2786–2812. [Google Scholar] [CrossRef]
  39. Pang, B.; Yue, J.; Zhao, G.; Xu, Z. Statistical Downscaling of Temperature with the Random Forest Model. Adv. Met. 2017, 2017, 7265178. [Google Scholar] [CrossRef]
  40. Choi, C.; Kim, J.; Kim, J.; Kim, D.; Bae, Y.; Kim, H.S. Development of Heavy Rain Damage Prediction Model Using Machine Learning Based on Big Data. Adv. Met. 2018, 2018, 5024930. [Google Scholar] [CrossRef]
  41. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef] [Green Version]
  42. Wright, M.N.; Ziegler, A. Ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R. J. Stat. Softw. 2017, 77, 1–17. [Google Scholar] [CrossRef]
  43. Lee, K.J.; Kang, J.Y.; Lee, D.Y.; Jang, S.W.; Lee, S.; Lee, B.W.; Kim, K.S. Use of an Empirical Model to Estimate Leaf Wetness Duration for Operation of a Disease Warning System Under a Shade in a Ginseng Field. Plant Dis. 2016, 100, 25–31. [Google Scholar] [CrossRef] [PubMed]
  44. Bassimba, D.D.M.; Intrigliolo, D.S.; Dalla Marta, A.; Orlandini, S.; Vicent, A. Leaf wetness duration in irrigated citrus orchards in the Mediterranean climate conditions. Agric. For. Meteorol. 2017, 234–235, 182–195. [Google Scholar] [CrossRef]
  45. Beruski, G.C.; Gleason, M.L.; Sentelhas, P.C.; Pereira, A.B. Leaf wetness duration estimation and its influence on a soybean rust warning system. Australas. Plant Pathol. 2019, 48, 395–408. [Google Scholar] [CrossRef]
Figure 1. Locations of used meteorological sites.
Figure 1. Locations of used meteorological sites.
Water 11 01878 g001
Figure 2. Time series plots of LWD observations for four sites in 2017.
Figure 2. Time series plots of LWD observations for four sites in 2017.
Water 11 01878 g002
Figure 3. Structure of the extreme learning machine (ELM) model used in this study.
Figure 3. Structure of the extreme learning machine (ELM) model used in this study.
Water 11 01878 g003
Figure 4. Structure of the deep neural network (DNN) model used in this study.
Figure 4. Structure of the deep neural network (DNN) model used in this study.
Water 11 01878 g004
Figure 5. Quantile-Quantile plots of six LWD prediction models. Noted that null speed is replaced by 0.1 m/s for LWD prediction from the PM model, and the correct predictions for a zero value of LWD are excluded for all models.
Figure 5. Quantile-Quantile plots of six LWD prediction models. Noted that null speed is replaced by 0.1 m/s for LWD prediction from the PM model, and the correct predictions for a zero value of LWD are excluded for all models.
Water 11 01878 g005
Figure 6. Quantile-Quantile plots of six LWD prediction models without null wind speed data. Note that the correct predictions of zero values for LWD and data having null wind speeds are excluded.
Figure 6. Quantile-Quantile plots of six LWD prediction models without null wind speed data. Note that the correct predictions of zero values for LWD and data having null wind speeds are excluded.
Water 11 01878 g006
Figure 7. Root mean square errors (RMSEs) of five LWD prediction models for all employed stations. Note that the used data sets are the same as those used in Table 3.
Figure 7. Root mean square errors (RMSEs) of five LWD prediction models for all employed stations. Note that the used data sets are the same as those used in Table 3.
Water 11 01878 g007
Figure 8. Mean bias (mBias) of the five LWD prediction models for all employed stations. Note that the used data sets are the same as those used in Table 3.
Figure 8. Mean bias (mBias) of the five LWD prediction models for all employed stations. Note that the used data sets are the same as those used in Table 3.
Water 11 01878 g008
Figure 9. Mean absolute errors (MAEs) of the five LWD prediction models for all employed stations. Note that the used data sets are the same as those used in Table 3.
Figure 9. Mean absolute errors (MAEs) of the five LWD prediction models for all employed stations. Note that the used data sets are the same as those used in Table 3.
Water 11 01878 g009
Figure 10. Time series plots of LWD observations and LWD predictions by the RF, DNN, and NHRH methods for sites 1 and 4 in 2017.
Figure 10. Time series plots of LWD observations and LWD predictions by the RF, DNN, and NHRH methods for sites 1 and 4 in 2017.
Water 11 01878 g010
Figure 11. Plot of variable importance in the RF model.
Figure 11. Plot of variable importance in the RF model.
Water 11 01878 g011
Table 1. Information regarding meteorological sites.
Table 1. Information regarding meteorological sites.
SiteLatitude (Degree)Longitude (Degree)Elevation (m)Field Type
137.221127.03945Rice field
237.039126.86424Rice field
337.012127.30634Rice field
437.217126.7026Grape orchard
537.178127.302150Rice field
637.350127.62557Apple orchard
737.844127.50268Rice field
837.600127.25165Pear orchard
938.036127.201101Apple orchard
Table 2. Basic statistics of observed meteorological, leaf wetness (LW), and leaf wetness duration (LWD) data for the nine sites. Noted that the numbers in/out of the bracket are means and standard deviations, respectively.
Table 2. Basic statistics of observed meteorological, leaf wetness (LW), and leaf wetness duration (LWD) data for the nine sites. Noted that the numbers in/out of the bracket are means and standard deviations, respectively.
VariablesSite1Site2Site3Site4Site5Site6Site7Site8Site9
Air temperature12.0 (11.2)11.2 (11.0)11.5 (11.5)11.5 (11.2)10.7 (11.3)10.8 (12.0)10.6 (11.8)10.9 (11.3)10.3 (12.0)
Relative humidity66.7 (20.2)75.1 (20.3)73.4 (19.6)73.2 (20.6)72.0 (21.7)72.7 (23.3)73.0 (21.6)69.3 (23.3)71.5 (23.8)
Short wave radiation0.52 (0.82)0.55 (0.84)0.46 (0.75)0.54 (0.83)0.54 (0.83)0.51 (0.79)0.49 (0.80)0.46 (0.73)0.46 (0.77)
Wind speed1.368 (1.384)1.816 (1.779)0.954 (1.373)1.302 (1.496)0.963 (1.164)0.640 (0.919)0.733 (1.069)0.892 (0.966)0.823 (1.010)
LW0.259 (0.438)0.383 (0.486)0.379 (0.485)0.340 (0.474)0.365 (0.481)0.404 (0.491)0.373 (0.484)0.220 (0.414)0.296 (0.456)
LWD5.68 (6.40)8.62 (5.78)8.33 (5.92)7.47 (6.03)8.12 (5.94)8.74 (5.97)7.99 (6.38)4.90 (6.03)6.33 (6.15)
Table 3. Evaluation results for all data.
Table 3. Evaluation results for all data.
ELMRFDNNCARTNHRHPM
Cor.0.7760.8070.7980.6610.7250.656
RMSE4.4843.6814.0205.4294.6065.180
mBias1.7240.3881.1160.323−0.0281.831
MAE2.8512.3142.5023.4792.9993.526
Note that null wind speeds are replaced by 0.1 m/s in calculating evaluation measures for the PM model.

Share and Cite

MDPI and ACS Style

Park, J.; Shin, J.-Y.; Kim, K.R.; Ha, J.-C. Leaf Wetness Duration Models Using Advanced Machine Learning Algorithms: Application to Farms in Gyeonggi Province, South Korea. Water 2019, 11, 1878. https://doi.org/10.3390/w11091878

AMA Style

Park J, Shin J-Y, Kim KR, Ha J-C. Leaf Wetness Duration Models Using Advanced Machine Learning Algorithms: Application to Farms in Gyeonggi Province, South Korea. Water. 2019; 11(9):1878. https://doi.org/10.3390/w11091878

Chicago/Turabian Style

Park, Junsang, Ju-Young Shin, Kyu Rang Kim, and Jong-Chul Ha. 2019. "Leaf Wetness Duration Models Using Advanced Machine Learning Algorithms: Application to Farms in Gyeonggi Province, South Korea" Water 11, no. 9: 1878. https://doi.org/10.3390/w11091878

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop